Training mode implementation (WIP) by EvgeniiMekhanik · Pull Request #2663 · tempesta-tech/tempesta

EvgeniiMekhanik · 2026-06-09T19:11:11Z

No description provided.

const-t

I see that the PR is WIP, but I have a few comments for the future.

const-t · 2026-06-12T14:17:27Z

+	 */
+	if (likely(!tfw_mode_is_disabled())) {
+		s = rcu_dereference(g_stats);
+		percpu_counter_add(&s->sum, delta1);


What is the reason to use percpu_counter instead of a simple per-cpu variable? percpu_counter is pretty large and has overhead, so there must be a reason to use it.

const-t · 2026-06-15T14:50:28Z

@@ -0,0 +1,181 @@
+/**


I suggest renaming this to adaptive_limits.c or similar, and using the word "training" only in the sense of "training mode" as a state of the adaptive limits.

const-t · 2026-06-15T14:52:04Z

+	atomic_long_t		max;
+	s64 	__percpu	*counter;
+	u16			epoch;
+} TfwClientCounter;


From my point of view, we should move this to training.h, along with all other related structs.

const-t · 2026-06-16T09:22:09Z

+}
+
+static bool
+tfw_client_counter_training_check(TfwClientCounter *counter,


It seems client.c is not the right place for this function. I would prefer to have it in training.c.

const-t · 2026-06-16T09:33:17Z

+		return defence(curr);
+
+	if (tfw_client_counter_change_max(counter, curr, &delta1, &delta2))
+		adjust_num(delta1, delta2);


I would suggest moving the update of the global stats to tfw_http_conn_recv_finish(); we don't need a live update of the counter during training.

Introduce a library for 128-bit calculations, which are not supported in the Linux kernel:
- 128/32 division using bitwise long division
- integer square root via binary search

Needed for training mode statistics collection, where large numbers of clients can cause 64-bit counter overflow during aggregation.
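The two operations listed above can be sketched in portable userspace C. This is only an illustration under assumptions: the PR's actual types and function names are not shown here, so `u128_sketch`, `u128_div_u32`, `u64_mul_u64` and `u128_isqrt` are hypothetical names, and the multiplication helper is added only because the binary search needs to square a 64-bit candidate.

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned value; the PR's actual type is not shown. */
typedef struct {
	uint64_t hi;
	uint64_t lo;
} u128_sketch;

/*
 * Divide a 128-bit value by a non-zero 32-bit divisor using bitwise
 * (shift-subtract) long division. Returns the quotient and, if rem is
 * not NULL, stores the remainder there.
 */
static u128_sketch
u128_div_u32(u128_sketch n, uint32_t d, uint32_t *rem)
{
	u128_sketch q = { 0, 0 };
	uint64_t r = 0;	/* r < d before each shift, so r << 1 cannot overflow */
	int i;

	for (i = 127; i >= 0; i--) {
		uint64_t bit = (i >= 64) ? (n.hi >> (i - 64)) & 1
					 : (n.lo >> i) & 1;

		r = (r << 1) | bit;
		if (r >= d) {
			r -= d;
			if (i >= 64)
				q.hi |= 1ULL << (i - 64);
			else
				q.lo |= 1ULL << i;
		}
	}
	if (rem)
		*rem = (uint32_t)r;
	return q;
}

/* 64x64 -> 128 bit multiplication built from 32-bit partial products. */
static u128_sketch
u64_mul_u64(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t p0 = a_lo * b_lo, p1 = a_lo * b_hi;
	uint64_t p2 = a_hi * b_lo, p3 = a_hi * b_hi;
	uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
	u128_sketch r;

	r.lo = (mid << 32) | (uint32_t)p0;
	r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
	return r;
}

/* Largest 64-bit x with x * x <= n, found by binary search. */
static uint64_t
u128_isqrt(u128_sketch n)
{
	uint64_t lo = 0, hi = UINT64_MAX;

	while (lo < hi) {
		/* Upper midpoint, computed without overflow. */
		uint64_t mid = lo + (hi - lo) / 2 + ((hi - lo) & 1);
		u128_sketch sq = u64_mul_u64(mid, mid);

		if (sq.hi < n.hi || (sq.hi == n.hi && sq.lo <= n.lo))
			lo = mid;	/* mid^2 <= n: answer is mid or above */
		else
			hi = mid - 1;	/* mid^2 > n: answer is below mid */
	}
	return lo;
}
```

The division loop runs a fixed 128 iterations and the square root search at most 64, so both have a bounded cost regardless of the input.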

Add a generic training/defence subsystem used to detect abnormal behavior based on z-score statistics. The implementation provides:
- training mode: collect per-event statistics (sum, sumsq, count) using percpu counters to minimize contention;
- defence mode: evaluate incoming values against the calculated mean/std and reject anomalies exceeding the configured z-score threshold (drop the connection with TCP RST).

Use the adaptive limits (training/defence) library with per-client connection tracking. Maintain the current and maximum number of concurrent connections per client and update the statistics on each new maximum of concurrent client connections. In defence mode, calculate the z-score for the client on each newly established connection and drop the connection if the z-score exceeds the configured threshold.
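As a rough illustration of the z-score check described above, the following userspace sketch accumulates sum, sumsq and count and then tests a value against a threshold using integer arithmetic only. The type and function names (`zstats_sketch`, `zstats_train`, `zstats_is_anomaly`) are hypothetical, and the sketch assumes that only values above the mean are treated as anomalies and that all products fit in 64 bits; the PR's real formula and overflow handling are not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accumulator; field names follow the commit message. */
typedef struct {
	uint64_t sum;	/* sum of observed values */
	uint64_t sumsq;	/* sum of squared observed values */
	uint64_t count;	/* number of observations */
} zstats_sketch;

/* Training mode: fold one observation into the accumulator. */
static void
zstats_train(zstats_sketch *s, uint64_t x)
{
	s->sum += x;
	s->sumsq += x * x;
	s->count++;
}

/*
 * Defence mode: return true if the z-score of x exceeds the threshold.
 * With mean = sum / n and var = sumsq / n - mean^2, the test
 *   (x - mean) / sqrt(var) > t
 * is equivalent, for x above the mean, to
 *   (n * x - sum)^2 > t^2 * (n * sumsq - sum^2),
 * which needs neither division nor floating point.
 * Assumption: the values are small enough not to overflow 64 bits.
 */
static bool
zstats_is_anomaly(const zstats_sketch *s, uint64_t x, uint64_t threshold)
{
	uint64_t n = s->count;
	uint64_t dev, var_n2;

	if (!n || n * x <= s->sum)
		return false;	/* no data, or x is not above the mean */

	dev = n * x - s->sum;				/* n * (x - mean) */
	var_n2 = n * s->sumsq - s->sum * s->sum;	/* n^2 * variance */
	return dev * dev > threshold * threshold * var_n2;
}
```

The squared terms grow quickly with the number of observations, which is one way to see why wider-than-64-bit arithmetic becomes necessary during aggregation.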

Use the adaptive limits library for tracking non-idempotent requests (we account only non-idempotent requests, since they actually block an upstream connection). Implement a new structure `TfwAdaptiveLimitLock` with a per-cpu counter to track the current count of unanswered non-idempotent requests. In defence mode, the `tfw_http_conn_recv_finish` callback calculates the z-score, compares it with the configured `threshold` and drops the client connection if necessary. The per-cpu request accounting prevents performance degradation.
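The idea behind per-cpu accounting of outstanding requests can be modelled in userspace as follows. This is not the layout of `TfwAdaptiveLimitLock`, which is not shown here: the names are hypothetical, the CPU index is passed explicitly instead of being taken from the running CPU, and the CPU count is fixed for simplicity.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical fixed CPU count for the model */

/* Userspace stand-in for a per-cpu s64 counter. */
typedef struct {
	int64_t slot[NR_CPUS_SKETCH];
} pcpu_counter_sketch;

/* A non-idempotent request arrived: bump the slot of the current CPU. */
static void
pcpu_req_start(pcpu_counter_sketch *c, unsigned int cpu)
{
	c->slot[cpu]++;
}

/* The request was answered, possibly on a different CPU. */
static void
pcpu_req_done(pcpu_counter_sketch *c, unsigned int cpu)
{
	c->slot[cpu]--;
}

/*
 * Slow path: sum all slots. A single slot may be negative when requests
 * start and finish on different CPUs; only the total is meaningful.
 */
static int64_t
pcpu_outstanding(const pcpu_counter_sketch *c)
{
	int64_t total = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		total += c->slot[cpu];
	return total;
}
```

Each update touches only the slot of one CPU, so the hot path needs no shared cache line; the cost is moved to the reader, which has to walk all slots.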

Add a per-socket `training_epoch` field to track the training generation for connection-related statistics. This allows associating socket events with a specific training period and prevents mixing measurements across training epochs when switching between TRAINING and DEFENCE modes.
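A minimal sketch of how such an epoch stamp can keep events from different training periods apart, assuming a global generation counter that is bumped when a new training starts. All names except `training_epoch` are hypothetical, and the real socket structure and mode switching logic are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global training generation, bumped on each new training. */
static uint16_t g_training_epoch_sketch;

/* Hypothetical per-socket state; only the epoch field is modelled. */
typedef struct {
	uint16_t training_epoch;
} sock_sketch;

/* A new training period starts; the u16 epoch wraps at 2^16. */
static void
training_restart(void)
{
	g_training_epoch_sketch++;
}

/* Stamp a socket with the generation in which it was accounted. */
static void
sock_account_open(sock_sketch *sk)
{
	sk->training_epoch = g_training_epoch_sketch;
}

/* A later event belongs to the current statistics only if epochs match. */
static bool
sock_event_is_current(const sock_sketch *sk)
{
	return sk->training_epoch == g_training_epoch_sketch;
}
```

With this check, a connection opened in an earlier training period is not subtracted from counters that were zeroed when the next period began.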

Use the adaptive limits library for client CPU usage tracking, with the `TfwAdaptiveLimitLock` structure. We take a timestamp at the beginning of `ss_tcp_process_data` and another one in `conn_recv_finish`, and use the time delta for client CPU usage tracking.

Use the training library for client memory usage tracking, with the `TfwAdaptiveLimitLock` structure. In defence mode, the `tfw_http_conn_recv_finish` callback calculates the z-score, compares it with the configured `threshold` and drops the client connection if necessary (the same as for non-idempotent requests). The per-cpu request accounting prevents performance degradation. Note that we also adjust memory usage in the per-cpu `mem` storage to check the `soft` and `hard` mem limits. This has to be done in a separate storage, because we zero `TfwAdaptiveLimitLock` at the start of a new training and do not account events from the previous training in `TfwAdaptiveLimitLock`.

Performance measurements for the whole patchset were made and show no measurable regression:

Training: 1262705 req/s  1272613 req/s  1264688 req/s
Defense:  1272456 req/s  1263205 req/s  1256504 req/s
Master:   1253438 req/s  1253207 req/s  1248473 req/s

Although training and defense modes appear slightly faster than master, the difference is below 2% and falls within normal run-to-run variation. No statistically significant performance impact was observed.
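The "below 2%" statement can be checked with simple arithmetic on the reported runs. The helper below is a hypothetical illustration, not part of the patchset: it compares the means of two configurations through their sums (valid because each has three runs) and reports the relative difference in basis points, that is, hundredths of a percent.

```c
#include <stddef.h>
#include <stdint.h>

/* Reported throughput per configuration, in req/s. */
static const uint64_t training_runs[] = { 1262705, 1272613, 1264688 };
static const uint64_t defence_runs[]  = { 1272456, 1263205, 1256504 };
static const uint64_t master_runs[]   = { 1253438, 1253207, 1248473 };

/* Sum of the runs of one configuration. */
static uint64_t
runs_sum(const uint64_t *runs, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += runs[i];
	return sum;
}

/*
 * Relative difference of two means in basis points, rounded toward zero.
 * Both sets must contain the same number of runs, so the ratio of the
 * sums equals the ratio of the means.
 */
static int64_t
rel_diff_bp(uint64_t sum, uint64_t base_sum)
{
	return ((int64_t)sum - (int64_t)base_sum) * 10000 / (int64_t)base_sum;
}
```

With these numbers, the training mean is about 1.19% above the master mean and the defence mean about 0.98% above it, both under the 2% bound quoted in the commit message.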

EvgeniiMekhanik requested a review from const-t June 9, 2026 19:11

EvgeniiMekhanik changed the title ~~Mekhanik evgenii/trainging tmp design~~ Training mode implementation (WIP) Jun 9, 2026

EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 15 times, most recently from f418c55 to b86628c Compare June 15, 2026 18:46

const-t reviewed Jun 16, 2026


EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 12 times, most recently from 4b8f8f9 to 96e0ae8 Compare June 22, 2026 11:57

EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 2 times, most recently from 4681521 to 40ac0a7 Compare June 22, 2026 14:47

EvgeniiMekhanik marked this pull request as draft June 22, 2026 14:48

EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 3 times, most recently from e48e696 to cd3f102 Compare June 22, 2026 19:15

EvgeniiMekhanik marked this pull request as ready for review June 22, 2026 19:15

EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch from cd3f102 to 5f843e6 Compare June 23, 2026 10:51

EvgeniiMekhanik marked this pull request as draft June 25, 2026 21:22

EvgeniiMekhanik added 6 commits June 26, 2026 13:17

EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch from 55c3eab to 127ad54 Compare June 26, 2026 11:40

EvgeniiMekhanik marked this pull request as ready for review June 26, 2026 14:44

EvgeniiMekhanik wants to merge 6 commits into master from MekhanikEvgenii/trainging-TMP-design
