Trainging mod implementation (WIP) #2663

Open
EvgeniiMekhanik wants to merge 6 commits into master from MekhanikEvgenii/trainging-TMP-design

Conversation

@EvgeniiMekhanik

Contributor

No description provided.

@EvgeniiMekhanik EvgeniiMekhanik requested a review from const-t June 9, 2026 19:11
@EvgeniiMekhanik EvgeniiMekhanik changed the title Mekhanik evgenii/trainging tmp design Trainging mod implementation (WIP) Jun 9, 2026
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 15 times, most recently from f418c55 to b86628c Compare June 15, 2026 18:46

@const-t const-t left a comment

Contributor

I see that the PR is WIP, but I have a few comments for the future.

Comment thread fw/training.c Outdated
	 */
	if (likely(!tfw_mode_is_disabled())) {
		s = rcu_dereference(g_stats);
		percpu_counter_add(&s->sum, delta1);

Contributor

What is the reason to use percpu_counter instead of a simple per-cpu variable? percpu_counter is pretty large and has overhead, so there must be a reason to use it.

Contributor Author

Yes fixed

Comment thread fw/training.h
@@ -0,0 +1,181 @@
/**

Contributor

I suggest renaming this to adaptive_limits.c or similar and using the word "training" only in the sense of "training mode", i.e. as a state of the adaptive limits.

Contributor Author

Yes fixed

Comment thread fw/client.h Outdated
	atomic_long_t max;
	s64 __percpu *counter;
	u16 epoch;
} TfwClientCounter;

Contributor

From my point of view, we should move this to training.h, along with all other related structs.

Contributor Author

Yes fixed

Comment thread fw/client.c Outdated
}

static bool
tfw_client_counter_training_check(TfwClientCounter *counter,

Contributor

It seems client.c is not the right place for this function. I would prefer to have it in training.c.

Contributor Author

fixed

Comment thread fw/client.c Outdated
		return defence(curr);

	if (tfw_client_counter_change_max(counter, curr, &delta1, &delta2))
		adjust_num(delta1, delta2);

Contributor

I would suggest moving the update of the global stats to tfw_http_conn_recv_finish(); we don't need a live update of the counter during training.

Contributor Author

fixed

@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 12 times, most recently from 4b8f8f9 to 96e0ae8 Compare June 22, 2026 11:57
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 2 times, most recently from 4681521 to 40ac0a7 Compare June 22, 2026 14:47
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as draft June 22, 2026 14:48
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch 3 times, most recently from e48e696 to cd3f102 Compare June 22, 2026 19:15
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as ready for review June 22, 2026 19:15
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch from cd3f102 to 5f843e6 Compare June 23, 2026 10:51
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as draft June 25, 2026 21:22
Introduce a library for 128-bit calculations, which are
not supported in the Linux kernel:
  - 128/32 division using bitwise long division
  - integer square root via binary search

Needed for training mode statistics collection, where large numbers
of clients can cause 64-bit counter overflow during aggregation.
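
As an illustration of the two operations listed above, here is a minimal userspace sketch. The `u128_sketch` type and all function names are hypothetical and are not taken from the patch; the 64x64 multiplication helper is only there so that the binary search can compare squares without a native 128-bit type.

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned value; the PR's real type is not shown. */
typedef struct {
	uint64_t hi;
	uint64_t lo;
} u128_sketch;

/* Bitwise long division of a 128-bit value by a non-zero 32-bit divisor. */
static u128_sketch
u128_div_u32(u128_sketch n, uint32_t d, uint32_t *rem_out)
{
	u128_sketch q = { 0, 0 };
	uint64_t rem = 0;	/* < d before each shift, so < 2^33 after it */
	int i;

	for (i = 127; i >= 0; i--) {
		uint64_t bit = (i >= 64) ? (n.hi >> (i - 64)) & 1
					 : (n.lo >> i) & 1;

		rem = (rem << 1) | bit;
		if (rem >= d) {
			rem -= d;
			if (i >= 64)
				q.hi |= 1ULL << (i - 64);
			else
				q.lo |= 1ULL << i;
		}
	}
	if (rem_out)
		*rem_out = (uint32_t)rem;
	return q;
}

/* Full 64x64 -> 128 bit product built from 32-bit halves. */
static u128_sketch
u64_mul(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t p0 = a_lo * b_lo, p1 = a_lo * b_hi;
	uint64_t p2 = a_hi * b_lo, p3 = a_hi * b_hi;
	uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
	u128_sketch r;

	r.lo = (mid << 32) | (uint32_t)p0;
	r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
	return r;
}

/* Floor of the square root: binary search for the largest r with r*r <= n. */
static uint64_t
u128_isqrt(u128_sketch n)
{
	uint64_t lo = 0, hi = UINT64_MAX;

	while (lo < hi) {
		/* Upper midpoint, written so that it cannot overflow. */
		uint64_t mid = lo + (hi - lo) / 2 + ((hi - lo) & 1);
		u128_sketch sq = u64_mul(mid, mid);

		if (sq.hi < n.hi || (sq.hi == n.hi && sq.lo <= n.lo))
			lo = mid;
		else
			hi = mid - 1;
	}
	return lo;
}
```

The division costs 128 iterations and the square root 64 multiplications. That should be acceptable if these helpers run only when aggregated statistics are turned into a mean and a deviation, not on every packet.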

Add a generic training/defence subsystem used to detect abnormal
behavior based on z-score statistics.

The implementation provides:
  - training mode: collect per-event statistics (sum, sumsq, count)
    using percpu counters to minimize contention;
  - defence mode: evaluate incoming values against the calculated mean/std
    and reject anomalies exceeding the configured z-score threshold (drop
    the connection with TCP RST).
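
The two modes can be sketched with integer arithmetic only, which matters in kernel code where floating point is not available. This is an illustration under stated assumptions, not the code from the patch: `stats_sketch`, `stats_add` and `zscore_exceeds` are hypothetical names, only values above the mean are treated as anomalies, and the inputs are assumed small enough to fit in 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical aggregate; the PR keeps these values in percpu counters. */
typedef struct {
	uint64_t sum;	/* sum of observed values */
	uint64_t sumsq;	/* sum of squared values */
	uint64_t count;	/* number of observations */
} stats_sketch;

/* Training mode: fold one observation into the aggregate. */
static void
stats_add(stats_sketch *s, uint64_t x)
{
	s->sum += x;
	s->sumsq += x * x;
	s->count++;
}

/*
 * Defence mode: true if the z-score of x is above the threshold t.
 * With n = count: z = (n*x - sum) / sqrt(n*sumsq - sum^2), hence
 * z > t  <=>  (n*x - sum)^2 > t^2 * (n*sumsq - sum^2)  for n*x > sum.
 * Assumptions: only values above the mean count as anomalies, and
 * no intermediate product overflows 64 bits.
 */
static bool
zscore_exceeds(const stats_sketch *s, uint64_t x, uint64_t t)
{
	uint64_t nx = s->count * x;
	uint64_t dev, var_n2;

	if (s->count == 0 || nx <= s->sum)
		return false;
	dev = nx - s->sum;
	var_n2 = s->count * s->sumsq - s->sum * s->sum;
	return dev * dev > t * t * var_n2;
}
```

For the samples 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the population standard deviation is 2, so with a threshold of 3 the value 12 (z = 3.5) is rejected while 11 (z = 3) is accepted. The squared terms grow quickly with real traffic, which is presumably why the patchset needs 128-bit helpers.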

Use the adaptive limits (training/defence) library for per-client connection
tracking. Maintain the current and maximum number of concurrent connections
per client and update the statistics on each new maximum of concurrent
client connections. In defence mode, calculate the z-score for the
client on each newly established connection and drop the connection if
the z-score exceeds the configured threshold.
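
The "update on each new maximum" step could look roughly like the sketch below. It is written with C11 atomics for userspace, and it assumes that the two deltas seen in the reviewed snippet are the changes to the global sum and sum of squares; the patch may define them differently. All names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-client state; only the maximum is modelled here. */
typedef struct {
	atomic_long max;
} client_counter_sketch;

/*
 * Raise the stored maximum to curr if curr is larger.
 * On success, report how much the global sum (delta1) and the global
 * sum of squares (delta2) have to change so that the client's old
 * maximum is replaced by the new one in the statistics.
 * The meaning of the deltas is an assumption of this sketch.
 */
static bool
counter_change_max(client_counter_sketch *c, long curr,
		   long *delta1, long *delta2)
{
	long old = atomic_load(&c->max);

	while (curr > old) {
		/* On failure, old is reloaded with the current maximum. */
		if (atomic_compare_exchange_weak(&c->max, &old, curr)) {
			*delta1 = curr - old;
			*delta2 = curr * curr - old * old;
			return true;
		}
	}
	return false;
}
```

Because only a new maximum touches the shared statistics, most connection events stay local to the client.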

Use the adaptive limits library for non-idempotent request tracking (we account
only non-idempotent requests, since they actually block an upstream connection).
Implement a new structure `TfwAdaptiveLimitLock` with a per-cpu counter to
track the current count of unanswered non-idempotent requests. In defence
mode, the `tfw_http_conn_recv_finish` callback calculates the z-score, compares
it with the configured `threshold` and drops the client connection if necessary.
The current approach with per-cpu request accounting prevents performance
degradation.
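
To show why per-cpu accounting is cheap, here is a minimal userspace model of such a counter. It is not the layout of `TfwAdaptiveLimitLock`; the fixed CPU count, the type and the function names are all hypothetical.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical fixed CPU count */

typedef struct {
	int64_t slot[NR_CPUS_SKETCH];
} percpu_count_sketch;

/* Fast path: touch only the slot of the CPU handling the event. */
static void
percpu_count_add(percpu_count_sketch *c, unsigned int cpu, int64_t delta)
{
	c->slot[cpu % NR_CPUS_SKETCH] += delta;
}

/* Slow path: fold all slots; one slot may be negative, the total is not. */
static int64_t
percpu_count_sum(const percpu_count_sketch *c)
{
	int64_t total = 0;
	unsigned int i;

	for (i = 0; i < NR_CPUS_SKETCH; i++)
		total += c->slot[i];
	return total;
}
```

A request can be counted on one CPU and its response on another, so an individual slot may go negative; only the folded sum is meaningful. The fast path never writes to memory shared with other CPUs, which is the property the commit message relies on.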

Add a per-socket training_epoch field to track the training generation
for connection-related statistics. This allows associating socket
events with a specific training period and prevents mixing measurements
across training epochs when switching between TRAINING and DEFENCE modes.
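
One way to use such an epoch is to reset stale state lazily, the first time it is touched in a new training period. The sketch below assumes exactly that; the structure and function are hypothetical, and the patch may instead skip stale events or reset state eagerly.

```c
#include <stdint.h>

/* Hypothetical per-object state tagged with a training generation. */
typedef struct {
	uint16_t epoch;	/* training generation the value belongs to */
	int64_t value;	/* statistic accumulated within that generation */
} epoch_counter_sketch;

/*
 * Account delta for the current training epoch. State left over from
 * an older epoch is discarded first, so measurements never mix.
 */
static void
epoch_counter_add(epoch_counter_sketch *c, uint16_t cur_epoch, int64_t delta)
{
	if (c->epoch != cur_epoch) {
		c->value = 0;
		c->epoch = cur_epoch;
	}
	c->value += delta;
}
```

A 16-bit epoch wraps after 65536 training periods, so an object left untouched for exactly that long would be mistaken for current; whether that matters depends on how often training is restarted.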

Use the adaptive limits library for client CPU usage tracking.
Use the `TfwAdaptiveLimitLock` structure for CPU usage tracking.
We take a timestamp at the beginning of `ss_tcp_process_data`,
then take another one in `conn_recv_finish`. The time delta is used
for client CPU usage tracking.

Use the training library for client memory usage tracking.
Use the `TfwAdaptiveLimitLock` structure for client memory usage
tracking. In defence mode, the `tfw_http_conn_recv_finish` callback
calculates the z-score, compares it with the configured `threshold` and drops
the client connection if necessary (same as we do for non-idempotent
requests). The current approach with per-cpu request accounting prevents
performance degradation.
Note that we also adjust memory usage in the per-cpu `mem` storage
to check the `soft` and `hard` mem limits. This has to be a separate storage,
because we zero `TfwAdaptiveLimitLock` at the start of a new training
and do not account events from the previous training in `TfwAdaptiveLimitLock`.

Performance measurements for the whole patchset were made and show no
measurable regression:

Training:
1262705 req/s
1272613 req/s
1264688 req/s

Defense:
1272456 req/s
1263205 req/s
1256504 req/s

Master:
1253438 req/s
1253207 req/s
1248473 req/s

Although training and defense modes appear slightly faster than
master, the difference is below 2% and falls within normal run-to-run
variation. No statistically significant performance impact was
observed.
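
For reference, the averages of the three runs are about 1266669 req/s for training, 1264055 req/s for defense and 1251706 req/s for master, i.e. roughly 1.2% and 1.0% above master. The small sketch below reproduces that arithmetic with integers; the helper names are hypothetical.

```c
#include <stdint.h>

/* Sum of three runs; with equal run counts, sums compare like means. */
static uint64_t
sum3(uint64_t a, uint64_t b, uint64_t c)
{
	return a + b + c;
}

/*
 * Relative difference of x over base in basis points (1 bp = 0.01%),
 * rounded down. Assumes x >= base and base != 0.
 */
static uint64_t
diff_bp(uint64_t x, uint64_t base)
{
	return (x - base) * 10000 / base;
}
```

With the numbers above, `diff_bp` gives 119 bp for training and 98 bp for defense relative to master, both under the 2% stated in the message.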
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/trainging-TMP-design branch from 55c3eab to 127ad54 Compare June 26, 2026 11:40
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as ready for review June 26, 2026 14:44