PS-11161 [8.0]: mem_root_deque performance optimizations (part 1). #5949

Open
dlenev wants to merge 1 commit into percona:release-8.0.46-37 from dlenev:ps-11161-step-1-8.0

@dlenev dlenev commented May 15, 2026

Reduce the target size of a mem_root_deque block from 1 kB to 256 bytes.

In the majority of cases mem_root_deque is used to store pointers (e.g. to Item objects), so a 1 kB block can hold up to 128 elements on a 64-bit architecture.
On the other hand, the number of elements actually stored in a mem_root_deque is often much lower. For example, for many queries the number of fields in the SELECT list is far smaller than 64.
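
The capacity figures can be checked with a little arithmetic. The sketch below is illustrative only: `elements_per_block` and `kPointerSize` are hypothetical names rather than the actual mem_root_deque code, and it assumes 8-byte pointers and takes 1 kB to mean 1024 bytes.

```cpp
#include <cstddef>

// Hypothetical helper, not the actual mem_root_deque implementation:
// how many elements of a given size fit into a block of the target size.
constexpr std::size_t elements_per_block(std::size_t target_block_bytes,
                                         std::size_t element_size) {
  return target_block_bytes / element_size;
}

// Assumption: 8-byte pointers, i.e. a 64-bit architecture.
constexpr std::size_t kPointerSize = 8;

// Old target (1 kB taken as 1024 bytes): 128 pointers per block.
static_assert(elements_per_block(1024, kPointerSize) == 128,
              "1024-byte block holds 128 pointers");
// New target (256 bytes): 32 pointers per block.
static_assert(elements_per_block(256, kPointerSize) == 32,
              "256-byte block holds 32 pointers");
```

Under these assumptions a 256-byte block still holds 32 pointers, which would cover a SELECT list of up to 32 fields without a second block.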

This means that in many cases the larger part of such a 1 kB block is simply wasted. Since even fairly simple queries use 5-10 mem_root_deque instances, the waste adds up: allocating these 5-10 kB of unnecessary memory on the main MEM_ROOT can cause it to request another block from malloc(). That, in turn, has a small but visible impact on performance in some sysbench tests.
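
A rough way to see where the 5-10 kB figure comes from is to model the unused space per query. This is a sketch under stated assumptions, not a measurement: the function names are hypothetical, each deque is assumed to have allocated exactly one block of the target size, elements are 8-byte pointers, and the scenario of 10 deques holding 5 pointers each is made up for illustration.

```cpp
#include <cstddef>

// Hypothetical model, not taken from the server sources: bytes left unused
// in a single block that holds `elements_used` elements.
constexpr std::size_t unused_bytes_per_block(std::size_t block_bytes,
                                             std::size_t elements_used,
                                             std::size_t element_size) {
  // A full (or overfull) block wastes nothing in this simplified model.
  return elements_used * element_size >= block_bytes
             ? 0
             : block_bytes - elements_used * element_size;
}

// Total unused bytes across `deque_count` deques, one block each.
// Assumption: 8-byte elements (pointers on a 64-bit architecture).
constexpr std::size_t total_unused_bytes(std::size_t deque_count,
                                         std::size_t block_bytes,
                                         std::size_t elements_used,
                                         std::size_t element_size = 8) {
  return deque_count *
         unused_bytes_per_block(block_bytes, elements_used, element_size);
}

// Illustrative scenario: 10 deques, 5 pointers stored in each.
static_assert(total_unused_bytes(10, 1024, 5) == 9840,
              "1024-byte blocks: 10 * (1024 - 40) bytes unused");
static_assert(total_unused_bytes(10, 256, 5) == 2160,
              "256-byte blocks: 10 * (256 - 40) bytes unused");
```

In this made-up scenario the smaller block size leaves 7680 fewer bytes unused on the MEM_ROOT per query. The real saving depends on how many deques a query creates and how full they get.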

@dlenev dlenev requested review from inikep and percona-ysorokin May 15, 2026 11:46
@dlenev dlenev marked this pull request as draft May 15, 2026 11:58
@dlenev dlenev force-pushed the ps-11161-step-1-8.0 branch from 23cbdb9 to f1da129 Compare May 15, 2026 14:55
@dlenev dlenev marked this pull request as ready for review May 15, 2026 14:56

dlenev commented May 15, 2026

Here are the Jenkins results for this change: https://ps80.cd.percona.com/view/8.0%20+%208.4%20parallel%20MTR/job/percona-server-8.0-pipeline-parallel-mtr/1593/
They look OK to me: both observed failures also occur sporadically on the release-8.0.46-37 branch without this change.

@percona-ysorokin percona-ysorokin left a comment

LGTM
