Summary
Optimization idea: during RK solver stages, only evaluate blocks that are upstream of dynamic blocks (blocks with solver engines). Downstream-only blocks (Scopes, output chains) would be evaluated once after the step is accepted, not at every intermediate stage.
Expected Gain
- 5-15% for typical models (10-50 blocks)
- 20-30% for models with many observation points or post-processing chains
- 0% for models where all blocks are in feedback loops
Approach
Reverse reachability analysis from engine blocks (Integrator, ODE, StateSpace, etc.) through the DAG and algebraic loops. Blocks that are NOT reachable are classified as "post" and skipped during RK stages.
Classification table
| Block type |
In loop? |
Upstream of engine? |
Per RK stage |
After step |
| Engine block |
- |
(is engine) |
step() |
sample() |
| Algebraic |
No |
Yes |
update() |
- |
| Algebraic |
No |
No |
- |
update() |
| Algebraic |
Yes (loop upstream of engine) |
Yes |
FPI solve per stage |
- |
| Algebraic |
Yes (loop not upstream) |
No |
- |
FPI solve once |
Key distinction
The Graph currently classifies blocks as alg_blocks (len > 0) and dyn_blocks (len == 0). But dyn_blocks includes Constants, Sources, Scopes — not just engine blocks. The stage split needs a third category: engine_blocks (blocks with actual solver state).
Learnings from prototype
A prototype was attempted on branch feature/dag-stage-split (now deleted). Key findings:
-
Graph::new needs has_engine flag — (NodeId, bool) must become (NodeId, bool, bool) to distinguish algebraic vs dynamic vs engine blocks.
-
Reverse BFS must start from engine blocks only — starting from all dyn_blocks (which includes Constants/Scopes) makes everything reachable and defeats the optimization.
-
Loop propagation is tricky — if any block in a loop is upstream of an engine, the ENTIRE loop must be solved per stage, AND all blocks upstream of the loop become stage-critical too.
-
Connection classification — a connection is "dyn" if its target is dyn-reachable. Need to iterate connections per edge to classify.
-
Silent correctness bugs — if a block is wrongly classified as "post", the simulation produces subtly wrong results (wrong step count, wrong trajectory) without errors. This makes the optimization risky.
-
Subsystem interaction — Subsystems are treated as blocks by the outer simulation. If a Subsystem contains engine blocks, it has an engine itself and is automatically classified correctly. Internal optimization within the Subsystem would be Phase 2.
Decision
Deferred. The 10-15% gain does not justify the added complexity given that fastsim already achieves ~30ns/block/eval (competitive with compiled Simulink). The risk of silent correctness bugs from misclassification outweighs the performance benefit.
If revisited
- Add
has_engine to Graph::new signature
- Implement
classify_dyn_post() with reverse BFS from engine blocks
- Add
dag_dyn_iter() / dag_post_iter() to Graph
- Add
_update_dyn() to Simulation (used in RK stage loop and implicit solve only)
- Keep
_update() for post-step, event handling, and init/cleanup
- Add comprehensive tests that verify trajectory match for models with downstream-only blocks
Summary
Optimization idea: during RK solver stages, only evaluate blocks that are upstream of dynamic blocks (blocks with solver engines). Downstream-only blocks (Scopes, output chains) would be evaluated once after the step is accepted, not at every intermediate stage.
Expected Gain
Approach
Reverse reachability analysis from engine blocks (Integrator, ODE, StateSpace, etc.) through the DAG and algebraic loops. Blocks that are NOT reachable are classified as "post" and skipped during RK stages.
Classification table
Key distinction
The Graph currently classifies blocks as
alg_blocks(len > 0) anddyn_blocks(len == 0). Butdyn_blocksincludes Constants, Sources, Scopes — not just engine blocks. The stage split needs a third category:engine_blocks(blocks with actual solver state).Learnings from prototype
A prototype was attempted on branch
feature/dag-stage-split(now deleted). Key findings:Graph::new needs
has_engineflag —(NodeId, bool)must become(NodeId, bool, bool)to distinguish algebraic vs dynamic vs engine blocks.Reverse BFS must start from engine blocks only — starting from all
dyn_blocks(which includes Constants/Scopes) makes everything reachable and defeats the optimization.Loop propagation is tricky — if any block in a loop is upstream of an engine, the ENTIRE loop must be solved per stage, AND all blocks upstream of the loop become stage-critical too.
Connection classification — a connection is "dyn" if its target is dyn-reachable. Need to iterate connections per edge to classify.
Silent correctness bugs — if a block is wrongly classified as "post", the simulation produces subtly wrong results (wrong step count, wrong trajectory) without errors. This makes the optimization risky.
Subsystem interaction — Subsystems are treated as blocks by the outer simulation. If a Subsystem contains engine blocks, it has an engine itself and is automatically classified correctly. Internal optimization within the Subsystem would be Phase 2.
Decision
Deferred. The 10-15% gain does not justify the added complexity given that fastsim already achieves ~30ns/block/eval (competitive with compiled Simulink). The risk of silent correctness bugs from misclassification outweighs the performance benefit.
If revisited
has_enginetoGraph::newsignatureclassify_dyn_post()with reverse BFS from engine blocksdag_dyn_iter()/dag_post_iter()to Graph_update_dyn()to Simulation (used in RK stage loop and implicit solve only)_update()for post-step, event handling, and init/cleanup