Add: per-task and scope-filter granularity to scope_stats#976
Add: per-task and scope-filter granularity to scope_stats#976doraemonmj wants to merge 1 commit into
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR adds selective scope-site collection and per-task scope-stats record emission. Users can now filter scope-stats collection to specific PTO2_SCOPE line numbers via ChangesScope Stats Per-Task & Filtering Support
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces two optional refinements to the scope_stats profiling tool: a scope site filter (--enable-scope-stats <lines>) to restrict collection to specific line numbers, and per-task sampling (--scope-stats-task) to emit fine-grained occupancy records during task submission. It updates the HTML plotting tool to render a detailed task timeline, and propagates these new configurations through CallConfig, Python bindings, and the device-side collectors. One issue was identified in the test suite where test_scope_stats.py may skip validation if only --scope-stats-task is specified, because it does not account for the fact that this flag implies --enable-scope-stats.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/dfx/scope-stats.md (1)
55-61:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix the documented input path for
scope_stats_plot.py.Line 56 uses
<output_prefix>/scope_stats.jsonl, but the generated file is underscope_stats/(<output_prefix>/scope_stats/scope_stats.jsonl).Suggested fix
-python simpler_setup/tools/scope_stats_plot.py <output_prefix>/scope_stats.jsonl -# writes <output_prefix>/scope_stats.html +python simpler_setup/tools/scope_stats_plot.py <output_prefix>/scope_stats/scope_stats.jsonl +# writes <output_prefix>/scope_stats/scope_stats.html (unless --out-dir is provided)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/dfx/scope-stats.md` around lines 55 - 61, The docs show the wrong input path for the scope stats plotting script; update the examples referencing scope_stats_plot.py to point to the generated file under the scope_stats subdirectory (i.e., use <output_prefix>/scope_stats/scope_stats.jsonl instead of <output_prefix>/scope_stats.jsonl) and adjust the alternate example similarly (e.g., path/to/scope_stats/scope_stats.jsonl or path/to/scope_stats.jsonl within a scope_stats directory) so the README matches the actual output layout for scope_stats_plot.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.cpp`:
- Around line 684-694: The scope stats snapshot is taken too late (after
prepare_task/tensormap work and scheduler push) using
task_allocator.task_tail()/heap_tail(), so make the allocator snapshot
immediately after allocation and reuse it at submit-time: capture
alloc.task_tail(), alloc.task_head(), alloc.heap_tail(), alloc.heap_top() (and
orch->tensor_map.current_used()) into a small TaskAllocationSnapshot stored on
the allocated slot or returned by the allocator when the slot is reserved (e.g.
in the code path around prepare_task()/task_allocator allocation), and replace
the late calls in the is_scope_stats_task_enabled() block that calls
scope_stats_record_task with reads from that cached snapshot so the PHASE_TASK
record reflects the state at allocation time.
In `@src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.cpp`:
- Around line 682-692: The current scope stats sample calls
task_tail()/heap_tail() late (in the submit path) so occupancy can change; move
sampling to immediately after allocation and reuse that cached snapshot here:
when the slot is allocated by orch->rings[ring_id].task_allocator (e.g., in the
allocation/prepare_task code path), read and store task_tail(), task_head(),
heap_tail(), heap_top(), and orch->tensor_map.current_used() into a small
Snapshot struct (or into fields on the task descriptor), and then change this
block to call scope_stats_record_task(...) with the cached snapshot instead of
calling task_tail()/heap_tail() live; keep the is_scope_stats_task_enabled()
guard and reference the same symbols (is_scope_stats_task_enabled(),
orch->rings[ring_id].task_allocator, scope_stats_record_task,
tensor_map.current_used()) so the recorded PHASE_TASK reflects the allocator
state at allocation time.
In `@src/common/task_interface/call_config.h`:
- Around line 61-67: The CSV buffer scope_stats_scope in call_config.h is too
small (64 bytes) to hold the advertised 16-filter CSV from scope_stats.h;
increase its size to match the 16-entry contract (e.g., at least 96 bytes, or
128 for headroom) so init_scope_stats can parse full input, and make the same
change for the other occurrence noted (the second scope_stats_scope
declaration). Update the declaration(s) of scope_stats_scope to the larger fixed
size and keep the comment about comma-separated PTO2_SCOPE line numbers in sync.
In `@tests/st/a2a3/tensormap_and_ringbuffer/dfx/scope_stats/test_scope_stats.py`:
- Around line 101-102: The test gate currently returns when
request.config.getoption("--enable-scope-stats", default=None) is None, but it
should also consider the presence of the related CLI flag "--scope-stats-task";
update the condition in the setup/guard so it only returns when both
request.config.getoption("--enable-scope-stats", default=None) is None AND
request.config.getoption("--scope-stats-task", default=None) is None, i.e.,
treat either option as enabling scope stats; locate the check using
request.config.getoption("--enable-scope-stats") in the test setup (around the
current if) and add the additional getoption("--scope-stats-task") check so
assertions run when either flag is provided.
---
Outside diff comments:
In `@docs/dfx/scope-stats.md`:
- Around line 55-61: The docs show the wrong input path for the scope stats
plotting script; update the examples referencing scope_stats_plot.py to point to
the generated file under the scope_stats subdirectory (i.e., use
<output_prefix>/scope_stats/scope_stats.jsonl instead of
<output_prefix>/scope_stats.jsonl) and adjust the alternate example similarly
(e.g., path/to/scope_stats/scope_stats.jsonl or path/to/scope_stats.jsonl within
a scope_stats directory) so the README matches the actual output layout for
scope_stats_plot.py.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 05113caa-c38d-46c4-8416-e09fa9259a8d
📒 Files selected for processing (26)
conftest.pydocs/dfx/scope-stats.mdpython/bindings/task_interface.cpppython/simpler/worker.pysimpler_setup/scene_test.pysimpler_setup/tools/scope_stats_plot.pysrc/a2a3/platform/onboard/host/device_runner.cppsrc/a2a3/platform/sim/host/device_runner.cppsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.cppsrc/a5/platform/onboard/host/device_runner.cppsrc/a5/platform/sim/host/device_runner.cppsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.cppsrc/common/platform/include/aicpu/scope_stats_collector_aicpu.hsrc/common/platform/include/common/scope_stats.hsrc/common/platform/include/host/scope_stats_collector.hsrc/common/platform/onboard/host/c_api_shared.cppsrc/common/platform/onboard/host/device_runner_base.hsrc/common/platform/shared/aicpu/scope_stats_collector_aicpu.cppsrc/common/platform/shared/host/scope_stats_collector.cppsrc/common/platform/sim/host/c_api_shared.cppsrc/common/platform/sim/host/device_runner_base.hsrc/common/task_interface/call_config.hsrc/common/worker/chip_worker.cppsrc/common/worker/chip_worker.hsrc/common/worker/pto_runtime_c_api.htests/st/a2a3/tensormap_and_ringbuffer/dfx/scope_stats/test_scope_stats.py
51036e5 to
784d1bb
Compare
Extend scope_stats (hw-native-sys#858) with two orthogonal, opt-in axes on top of the existing per-scope begin/end sampling (hw-native-sys#902): - Scope filter (--enable-scope-stats <lines>): restrict collection to the listed PTO2_SCOPE line numbers via CallConfig.scope_stats_scope. Empty = every scope (unchanged default). The orchestration is one source file, so a line uniquely identifies a scope. - Per-task sampling (--scope-stats-task): emit one phase="task" record per submit_task carrying task_id and ring/heap occupancy, attributed to the enclosing scope and subject to the scope filter. Both travel through CallConfig to the device collector. The runtime only calls the scope_stats_* interface (one weak-gated scope_stats_record_task in submit_task_common); the collector parses the filter CSV and gates record emission, leaving depth/site bookkeeping intact so nesting stays correct. Filtered-out scopes are not appended and do not count toward total. The implicit top-level scope now renders as "<root>" not "(unknown)". jsonl gains "phase" (begin/end/task) and "task_id" fields. Mirrored across a2a3/a5 and onboard/sim; docs/dfx/scope-stats.md updated.
784d1bb to
3476732
Compare
Extend scope_stats (#858) with two orthogonal, opt-in axes on top of the existing per-scope begin/end sampling (#902):
Both travel through CallConfig to the device collector. The runtime only calls the scope_stats_* interface (one weak-gated scope_stats_record_task in submit_task_common); the collector parses the filter CSV and gates record emission, leaving depth/site bookkeeping intact so nesting stays correct. Filtered-out scopes are not appended and do not count toward total. The implicit top-level scope now renders as "" not "(unknown)". jsonl gains "phase" (begin/end/task) and "task_id" fields.
Mirrored across a2a3/a5 and onboard/sim; docs/dfx/scope-stats.md updated.