feat: add system resource monitoring with TUI and install integration #664
Conversation
CLA Verification: Passed. All contributors have signed the CLA.
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds a client-side monitoring subsystem (sampler, analyzer, exporter, storage, UI), a CLI `monitor` command, and install-time monitoring integration.
Sequence Diagram(s)

sequenceDiagram
participant User
participant CLI as CortexCLI
participant Sampler as ResourceSampler
participant UI as MonitorUI
participant Exporter as export_samples
User->>CLI: cortex monitor --duration 60 --export /path/out.json
CLI->>Sampler: instantiate & start()
activate Sampler
Sampler->>Sampler: _sample_loop (background)
CLI->>UI: create MonitorUI(sampler)
activate UI
loop every interval
UI->>Sampler: get_latest_sample()
Sampler-->>UI: ResourceSample
UI->>UI: render live bars & alerts
end
Note over UI,Sampler: duration elapsed or user interrupt
UI->>Sampler: stop()
deactivate Sampler
Sampler-->>UI: PeakUsage
UI->>UI: show_summary(peak_usage)
opt export provided
UI->>Exporter: export_samples(samples, path, peak)
Exporter->>Exporter: write JSON/CSV
end
CLI-->>User: exit code and peak summary
sequenceDiagram
participant User
participant CLI as CortexCLI
participant Coord as InstallationCoordinator
participant Sampler as ResourceSampler
participant Storage as MonitorStorage
User->>CLI: cortex install cuda --monitor
CLI->>Coord: execute(enable_monitoring=True)
activate Coord
Coord->>Sampler: create & start()
activate Sampler
Sampler->>Sampler: _sample_loop (background)
Coord->>Coord: run installation steps
Sampler->>Sampler: collect samples continuously
Note over Coord,Sampler: installation completes or fails
Coord->>Sampler: stop()
deactivate Sampler
Sampler-->>Coord: PeakUsage
Coord->>Storage: finalize_session(session_id, peak_usage, sample_count, metadata)
Coord->>Coord: populate InstallationResult with peak metrics
Coord-->>CLI: InstallationResult (contains peak_cpu/peak_ram_percent/peak_ram_gb)
CLI-->>User: print peak usage + exit code
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 5 passed
Summary of Changes

Hello @Vaibhav701161, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a robust system resource monitoring capability to Cortex, allowing users to gain deep insights into their system's performance. It provides both a real-time interactive TUI for immediate observation and an integrated monitoring option during software installations to capture peak resource demands. The collected data is stored for historical review, can be exported for external analysis, and is intelligently analyzed to offer actionable recommendations, significantly enhancing the diagnostic and optimization toolkit for Cortex users.
Code Review
This is a substantial and well-executed pull request that introduces a comprehensive system resource monitoring feature. The code is well-structured, with a clear separation of concerns into modules for sampling, UI, storage, exporting, and analysis. I appreciate the thoughtful design choices, such as the graceful degradation when optional dependencies like rich are unavailable, the thread-safe data collection in the sampler, and the fallback to a user-writable directory for the database. The addition of extensive unit tests is also a great strength. My review includes a few suggestions to address a missing type hint, a couple of opportunities for refactoring to reduce code duplication, and cleaning up some leftover development artifacts and a copy-paste error in the new tests. Overall, this is a high-quality contribution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Vaibhav701161 Thanks for contributing to Cortex Linux.
Please open a separate PR for CLA verification, following the pattern used in #401.
Also, kindly address all bot comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1463-1470: Fix `monitor` plumbing to avoid TypeError/NameError

`cli.install(..., monitor=...)` is called (Line 5611), but `install()` doesn't accept `monitor`, and `monitor` is referenced later (Line 1694). This will raise `TypeError` (unexpected kwarg) or `NameError` at runtime. Add a `monitor` parameter and wire it through.

🛠️ Proposed fix

-    def install(
-        self,
-        software: str,
-        execute: bool = False,
-        dry_run: bool = False,
-        parallel: bool = False,
-        json_output: bool = False,
-    ):
+    def install(
+        self,
+        software: str,
+        execute: bool = False,
+        dry_run: bool = False,
+        parallel: bool = False,
+        json_output: bool = False,
+        monitor: bool = False,
+    ) -> int:
@@
         coordinator = InstallationCoordinator(
             commands=commands,
             descriptions=[f"Step {i + 1}" for i in range(len(commands))],
             timeout=300,
             stop_on_error=True,
             progress_callback=progress_callback,
+            enable_monitoring=monitor,
         )

Also applies to: 1694-1706, 5611-5616
🤖 Fix all issues with AI agents
In `@cortex/coordinator.py`:
- Around line 245-255: The code logs "Resource monitoring started" regardless of
whether ResourceSampler actually began; modify the try block that creates
self._sampler (ResourceSampler, self._sampler.start()) to check the sampler's
running state (e.g., call self._sampler.is_running or equivalent) after start(),
only log "Resource monitoring started" when it reports running, and if it is not
running set self._sampler = None to avoid misleading logs and zero peaks; ensure
ImportError handling remains unchanged.
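A minimal sketch of that guard, assuming the sampler exposes an `is_running` property as the comment describes:

if self.enable_monitoring:
    try:
        from cortex.monitor.sampler import ResourceSampler

        self._sampler = ResourceSampler()
        self._sampler.start()
        if self._sampler.is_running:
            self._log("Resource monitoring started")
        else:
            # start() returned without a live sampling thread (e.g. psutil
            # missing); drop the sampler so callers don't see zeroed peaks.
            self._sampler = None
    except ImportError:
        self._sampler = None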
- Around line 67-77: The Coordinator class sets monitoring attributes without
type hints and the from_plan factory doesn't accept/forward enable_monitoring;
add explicit type annotations for self._sampler and self._peak_usage (e.g.,
import Optional and Any from typing and declare self._sampler: Optional[Any] =
None and self._peak_usage: Optional[Any] = None) and update the from_plan method
signature to include enable_monitoring: bool = False, then pass that parameter
through to the Coordinator constructor when instantiating (ensure the call to
Coordinator(...) includes enable_monitoring=enable_monitoring).
In `@cortex/monitor/exporter.py`:
- Around line 104-157: The _export_csv function accepts a metadata dict but
never writes it to the CSV; update _export_csv to iterate over the metadata
argument (when not None) and write each key/value as a comment line (e.g., "#
key: value") to the opened file before writing the header and samples, similar
to how the JSON exporter preserves metadata; locate the metadata parameter on
_export_csv and the with open(...) block to insert the metadata comment writes
so existing fields and writer.writeheader() remain unchanged.
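A sketch of the suggested change, with an abbreviated field list (the real exporter writes the full sample schema):

import csv

def _export_csv(samples, path, metadata=None):
    with open(path, "w", newline="") as f:
        # Preserve metadata as comment lines, mirroring the JSON exporter.
        if metadata:
            for key, value in metadata.items():
                f.write(f"# {key}: {value}\n")
        writer = csv.DictWriter(f, fieldnames=["timestamp", "cpu_percent", "ram_percent"])
        writer.writeheader()
        for sample in samples:
            writer.writerow(
                {
                    "timestamp": sample.timestamp,
                    "cpu_percent": sample.cpu_percent,
                    "ram_percent": sample.ram_percent,
                }
            )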
In `@cortex/monitor/monitor_ui.py`:
- Around line 11-17: The success message in export_samples() currently prints
the original export_path even though the code appends ".json" when no suffix is
provided; change export_samples() to resolve the final path up front (e.g.,
compute final_export_path from the given export_path by adding ".json" when
Path(export_path).suffix is empty), use final_export_path for the file
write/serialization and for the success print/log, and apply the same fix to the
similar export block around the other occurrence (the block referenced at lines
~331-339) so both use the resolved final path in I/O and messaging.
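One way to resolve the final path once, sketched with a hypothetical helper name:

from pathlib import Path

def _resolve_export_path(export_path: str) -> str:
    # Resolve the final filename up front so the file write and the
    # success message always refer to the same path.
    path = Path(export_path)
    if not path.suffix:
        path = path.with_suffix(".json")
    return str(path)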
- Around line 221-257: The _create_bar method currently always uses RAM
thresholds and allows percent >100 to overflow the bar; update it to (1)
determine the correct threshold set based on the metric label (e.g., map label
or introduce/inspect a metric key like "cpu"/"ram"/"disk" to pick
self.thresholds.cpu_warning/critical, self.thresholds.disk_warning/critical,
etc.) instead of always using self.thresholds.ram_warning/critical, and (2)
clamp the incoming percent value to the 0–100 range before computing filled =
int((percent / 100) * BAR_WIDTH) so the bar never overruns; keep the rest of the
UI logic (append label, bar, percent, cores/suffix) unchanged but use the
clamped percent variable and the chosen warning/critical thresholds to set
color.
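A sketch of the two fixes, assuming a metric key is threaded into `_create_bar` and the threshold field names match `AlertThresholds`:

def _pick_thresholds(self, metric: str) -> tuple[float, float]:
    # Map the metric key to its own warning/critical pair instead of
    # always falling back to the RAM thresholds.
    if metric == "cpu":
        return self.thresholds.cpu_warning, self.thresholds.cpu_critical
    if metric == "disk":
        return self.thresholds.disk_warning, self.thresholds.disk_critical
    return self.thresholds.ram_warning, self.thresholds.ram_critical

# Inside _create_bar, clamp before computing the fill so the bar never overruns:
# clamped = max(0.0, min(100.0, percent))
# filled = int((clamped / 100) * BAR_WIDTH)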
- Around line 74-86: The run method currently only checks RICH_AVAILABLE before
starting the Live UI; add a TTY check and fall back when output is not a
terminal by returning self._run_fallback(duration). Concretely, import sys and
modify run(self, duration: int | None = None) to first check if not
RICH_AVAILABLE or not sys.stdout.isatty() and in that case call and return
self._run_fallback(duration); keep the rest of the Live-based logic unchanged so
Live is only used when Rich is available and stdout is a TTY.
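Concretely, the guard could look like this sketch:

import sys

def run(self, duration: int | None = None):
    # Fall back to plain output when Rich is unavailable or stdout is
    # not a terminal (e.g. piped into a file or running under CI).
    if not RICH_AVAILABLE or not sys.stdout.isatty():
        return self._run_fallback(duration)
    # ... existing Live-based rendering continues unchanged ...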
In `@cortex/monitor/sampler.py`:
- Around line 131-134: The constructor currently treats max_samples using
truthiness so max_samples=0 becomes "unlimited"; change the assignment for
self.max_samples to explicitly handle None, zero, and negative values: if
max_samples is None set DEFAULT_MAX_SAMPLES, if max_samples is an int and < 0
raise ValueError (guard against negatives), otherwise set self.max_samples =
int(max_samples) allowing 0 as a valid value; update the assignment that
currently references DEFAULT_MAX_SAMPLES/self.max_samples to use this new
explicit logic (refer to self.max_samples and DEFAULT_MAX_SAMPLES in
sampler.py).
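A sketch of the explicit handling:

if max_samples is None:
    self.max_samples = DEFAULT_MAX_SAMPLES
elif max_samples < 0:
    raise ValueError("max_samples must be >= 0 or None")
else:
    # 0 is a legitimate value meaning "keep no samples", not "unlimited".
    self.max_samples = int(max_samples)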
In `@cortex/monitor/storage.py`:
- Around line 46-61: The writability check in _get_db_path is incorrect: using
db_path.parent.stat() only checks existence/readability, not write permissions,
so replace that logic to verify actual write access to DEFAULT_DB_PATH's parent
(or fallback) before returning it; for example, use os.access(db_path.parent,
os.W_OK) or attempt creating a temporary file in db_path.parent and clean it up,
and if writable return str(db_path) else create USER_DB_PATH.parent and return
str(USER_DB_PATH); update references in _get_db_path and ensure downstream
callers like _ensure_tables continue to rely on the verified path.
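A sketch using `os.access`, which checks actual write permission rather than mere existence:

import os
from pathlib import Path

def _get_db_path(self) -> str:
    db_path = Path(DEFAULT_DB_PATH)
    if db_path.parent.exists() and os.access(db_path.parent, os.W_OK):
        return str(db_path)
    # Fall back to a user-writable location.
    USER_DB_PATH.parent.mkdir(parents=True, exist_ok=True)
    return str(USER_DB_PATH)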
In `@pyproject.toml`:
- Around line 58-60: Update the psutil dependency declaration to require
psutil>=6.1.0 so it supports Python 3.13; locate the psutil entry (currently
"psutil>=5.9.0") in the dependency list and change the version specifier to
"psutil>=6.1.0" ensuring no other constraints conflict with the project's
declared Python version or classifiers.
In `@tests/test_monitor.py`:
- Around line 658-719: The nested test functions (test_analyze_high_cpu_and_ram,
test_analyze_mixed_usage_cpu_only, test_analyze_borderline_thresholds,
test_analyze_single_sample) are indented inside test_analyzer_performance_score
so pytest won't collect them; move each of these functions out to top-level
(unindent them to the same indentation as test_analyzer_performance_score), keep
their bodies and imports (or remove duplicate imports if preferred), and ensure
they accept the sample_data fixture as declared so they run independently.
🧹 Nitpick comments (5)
tests/test_monitor.py (2)
24-115: Add type hints to fixtures/helpers to align with repo standards

The new fixtures and helper functions are untyped. If the repo standard is "type hints required in Python code," consider annotating fixture return types (e.g., `Iterator[str]`, `list[ResourceSample]`, `MagicMock`). As per coding guidelines, please add type hints.

💡 Example adjustments

-from unittest.mock import MagicMock, patch
+from collections.abc import Iterator
+from unittest.mock import MagicMock, patch
@@
-@pytest.fixture
-def mock_psutil():
+@pytest.fixture
+def mock_psutil() -> Iterator[MagicMock]:
@@
-@pytest.fixture
-def sample_data():
+@pytest.fixture
+def sample_data() -> list["ResourceSample"]:
@@
-@pytest.fixture
-def temp_db():
+@pytest.fixture
+def temp_db() -> Iterator[str]:
@@
-@pytest.fixture
-def temp_export_dir():
+@pytest.fixture
+def temp_export_dir() -> Iterator[str]:
875-1019: Remove the duplicated storage test block
`TestStorage` and the related storage tests are duplicated in the same module. The second definition overwrites the first and makes maintenance noisy. Recommend keeping a single copy.

🧹 Suggested cleanup

-# =============================================================================
-# Storage Tests
-# =============================================================================
-
-
-class TestStorage:
-    """Tests for the storage module."""
-    ...
-
-def test_storage_list_sessions_limit(temp_db):
-    ...
-
-def test_storage_get_nonexistent_session(temp_db):
-    ...
-
-def test_storage_save_samples_invalid_session(temp_db, sample_data):
-    ...

cortex/monitor/storage.py (3)
11-17: Move `timedelta` import to the top of the file.

`timedelta` is imported inside `cleanup_old_sessions` (line 423) but should be imported at module level alongside `datetime` for consistency and PEP 8 compliance.

Suggested fix

 import json
 import logging
 import sqlite3
 import uuid
-from datetime import datetime
+from datetime import datetime, timedelta
 from pathlib import Path
 from typing import Any

And remove line 423:

-    from datetime import timedelta

Also applies to: 423-423
108-112: Consider adding an index on `start_time` for query performance.

`list_sessions` orders by `start_time DESC` and `cleanup_old_sessions` filters on `start_time < ?`. Without an index, these queries perform full table scans as sessions accumulate.

Add index on start_time

 # Create index for faster queries
 cursor.execute("""
     CREATE INDEX IF NOT EXISTS idx_metrics_session
     ON resource_metrics(session_id)
 """)
+cursor.execute("""
+    CREATE INDEX IF NOT EXISTS idx_sessions_start_time
+    ON monitor_sessions(start_time)
+""")
+
 conn.commit()
179-212: Use `executemany()` for batch insert performance.

Inserting samples one-by-one in a loop is inefficient for large batches. For long monitoring sessions, this could accumulate thousands of samples. `executemany()` provides significantly better performance.

Refactor to use executemany

 try:
     with sqlite3.connect(self.db_path) as conn:
         cursor = conn.cursor()
-        for sample in samples:
-            cursor.execute(
-                """
-                INSERT INTO resource_metrics
-                (session_id, timestamp, cpu_percent, cpu_count,
-                 ram_used_gb, ram_total_gb, ram_percent,
-                 disk_used_gb, disk_total_gb, disk_percent,
-                 disk_read_rate, disk_write_rate,
-                 net_recv_rate, net_sent_rate)
-                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
-                """,
-                (
-                    session_id,
-                    sample.timestamp,
-                    sample.cpu_percent,
-                    sample.cpu_count,
-                    sample.ram_used_gb,
-                    sample.ram_total_gb,
-                    sample.ram_percent,
-                    sample.disk_used_gb,
-                    sample.disk_total_gb,
-                    sample.disk_percent,
-                    sample.disk_read_rate,
-                    sample.disk_write_rate,
-                    sample.net_recv_rate,
-                    sample.net_sent_rate,
-                ),
-            )
+        cursor.executemany(
+            """
+            INSERT INTO resource_metrics
+            (session_id, timestamp, cpu_percent, cpu_count,
+             ram_used_gb, ram_total_gb, ram_percent,
+             disk_used_gb, disk_total_gb, disk_percent,
+             disk_read_rate, disk_write_rate,
+             net_recv_rate, net_sent_rate)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            [
+                (
+                    session_id,
+                    s.timestamp,
+                    s.cpu_percent,
+                    s.cpu_count,
+                    s.ram_used_gb,
+                    s.ram_total_gb,
+                    s.ram_percent,
+                    s.disk_used_gb,
+                    s.disk_total_gb,
+                    s.disk_percent,
+                    s.disk_read_rate,
+                    s.disk_write_rate,
+                    s.net_recv_rate,
+                    s.net_sent_rate,
+                )
+                for s in samples
+            ],
+        )
     conn.commit()
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1689-1707: `--monitor` is ignored for parallel installs.

Monitoring is only wired into the sequential `InstallationCoordinator` path. If a user runs `--parallel --monitor`, monitoring silently does nothing. Please either support monitoring in the parallel path or block this combination explicitly.

🛠️ Suggested guard (short-term fix)

     print(f"\n{t('install.executing')}")

     if parallel:
+        if monitor:
+            self._print_error("Monitoring is not supported with --parallel yet.")
+            return 1
         import asyncio
♻️ Duplicate comments (2)
cortex/monitor/sampler.py (1)
113-138: Reject negative max_samples instead of clamping.
Line 138 uses `max(0, max_samples)`, so `-1` silently disables storage; raising a `ValueError` makes misconfigurations explicit and aligns with the docstring.

🛠️ Proposed fix

-        else:
-            self.max_samples = max(0, max_samples)
+        else:
+            if max_samples < 0:
+                raise ValueError("max_samples must be >= 0 or None")
+            self.max_samples = max_samples

cortex/monitor/monitor_ui.py (1)
352-364: Use the resolved export path for the write.
Line 362 still passes `export_path` to `export_samples` even when `actual_path` differs, so the exported filename can diverge from the success message. Pass the resolved path to keep I/O and messaging aligned.

🛠️ Proposed fix

-    export_samples(samples, export_path, peak)
+    export_samples(samples, actual_path, peak)
🧹 Nitpick comments (2)
tests/test_monitor.py (2)
332-358: Reduce reliance on real sleeps to avoid flaky tests.

These tests depend on `time.sleep`, which can be unreliable under load. Consider polling for `get_sample_count()` with a bounded timeout or mocking the sampling loop for determinism.
883-897: Replace the placeholder assertion with a real check.
`assert True` doesn't validate anything. Consider asserting the subcommand is present in parser help or checking that `cli.monitor` is callable from routing.
Anshgrover23
left a comment
@Vaibhav701161 CI check is failing.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1689-1696: `--monitor` is ignored in parallel installs.

Monitoring is only wired into the sequential `InstallationCoordinator`; the parallel path doesn't use the flag, so `--parallel --monitor` silently drops monitoring. Consider warning/denying the combination or threading monitoring into the parallel runner.
🤖 Fix all issues with AI agents
In `@cortex/cli.py`:
- Around line 1704-1708: The print block under the monitoring path currently
only checks result.peak_cpu but formats result.peak_ram_gb unguarded, which can
raise a TypeError if peak_ram_gb is None; update the guard and formatting in
cortex/cli.py so both result.peak_cpu and result.peak_ram_gb are checked (or use
a conditional expression) before applying numeric formatters, e.g., only format
RAM as "{:.1f} GB" when result.peak_ram_gb is not None and otherwise print
"N/A"; reference the monitor flag and the result.peak_cpu and result.peak_ram_gb
fields when implementing the fix.
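A sketch of the guarded formatting (the exact print layout in the CLI may differ):

if monitor and result.peak_cpu is not None:
    # Only apply the numeric formatter when the value is present.
    ram = f"{result.peak_ram_gb:.1f} GB" if result.peak_ram_gb is not None else "N/A"
    print(f"Peak CPU: {result.peak_cpu:.1f}% | Peak RAM: {ram}")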
In `@cortex/monitor/monitor_ui.py`:
- Around line 358-364: The code computes an adjusted filename as actual_path but
still calls export_samples with the original export_path, causing inconsistency;
update the call to export_samples to pass actual_path (the resolved path that
may include the ".json" or ".csv" extension) instead of export_path, and keep
the success message using actual_path (symbols: actual_path, export_path,
export_samples, RICH_AVAILABLE, ui.console).
In `@docs/MONITORING.md`:
- Around line 52-58: Update the install-time monitoring example to clarify that
"cortex install" defaults to dry-run and will not capture install-time resource
peaks unless execution is enabled; change the example command "cortex install
nginx --monitor" to include "--execute" (e.g., "cortex install nginx --monitor
--execute") or add an explicit note beneath it stating that "--monitor" only
previews when "--execute" is not provided so readers understand monitoring
requires execution.
In `@tests/test_monitor.py`:
- Around line 23-113: Add explicit return type hints on the four fixtures:
change mock_psutil to return Iterator[unittest.mock.MagicMock], sample_data to
return list[cortex.monitor.sampler.ResourceSample] (or list[ResourceSample] if
you import ResourceSample), temp_db to return Iterator[str], and temp_export_dir
to return Iterator[str]; also add the required imports (typing.Iterator and the
MagicMock/ResourceSample symbols or use fully-qualified names) at the top of the
test file so the new annotations resolve.
♻️ Duplicate comments (1)
cortex/coordinator.py (1)
76-77: Fix undefined type references causing pipeline failure.

The type annotations reference `ResourceSampler` and `PeakUsage` without importing them. The `# type: ignore[name-defined]` suppresses mypy but doesn't resolve the Ruff F821 linter error that's failing the CI pipeline.

Use `TYPE_CHECKING` to conditionally import these types for annotation purposes only:

🔧 Proposed fix

 from collections.abc import Callable
 from dataclasses import dataclass
 from datetime import datetime
 from enum import Enum
-from typing import Any
+from typing import TYPE_CHECKING, Any
+
+if TYPE_CHECKING:
+    from cortex.monitor.sampler import PeakUsage, ResourceSampler

 from cortex.validators import DANGEROUS_PATTERNS

Then update lines 76-77 to use string literals for forward references:

         self.enable_monitoring = enable_monitoring
-        self._sampler: ResourceSampler | None = None  # type: ignore[name-defined]
-        self._peak_usage: PeakUsage | None = None  # type: ignore[name-defined]
+        self._sampler: "ResourceSampler | None" = None
+        self._peak_usage: "PeakUsage | None" = None
🧹 Nitpick comments (4)
tests/test_monitor.py (1)
331-357: Avoid fixed sleeps in sampler lifecycle tests to reduce flakiness.

The hard-coded `time.sleep(0.3)` can be flaky on slower CI. Consider polling for `sample_count > 0` with a short timeout.

♻️ Suggested approach

-    # Give it time to collect a sample
-    time.sleep(0.3)
+    # Wait up to 2s for at least one sample (less flaky)
+    deadline = time.time() + 2.0
+    while sampler.get_sample_count() == 0 and time.time() < deadline:
+        time.sleep(0.05)

cortex/monitor/storage.py (2)
380-413: Consider using a transaction for atomicity.

The `delete_session` method deletes from two tables sequentially. If an error occurs after deleting metrics but before deleting the session, the database could be left in an inconsistent state. While SQLite's auto-commit on `conn.commit()` handles this, the intent would be clearer with explicit transaction handling.

💡 Suggested improvement

The current code is functionally correct because both deletes occur before `conn.commit()`, so they're already in the same transaction. This comment is just noting that the code is safe as-is.
444-454: SQL construction is safe but consider batch delete.

The dynamic placeholder construction (`",".join("?" * len(session_ids))`) is safe since the query uses parameterized placeholders. However, for very large cleanup operations, consider batching the deletes to avoid potential SQLite limits.
21-21: Remove unused import `field`.

The `field` import from `dataclasses` is not used in this file. All dataclass fields use default values directly.

🔧 Proposed fix

-from dataclasses import dataclass, field
+from dataclasses import dataclass
@Anshgrover23 Fixed the issues.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cortex/cli.py (2)
1463-1471: Add a docstring to the public `install` API (now expanded with `monitor`).

This is a public CLI surface and should document the new flag and defaults. As per coding guidelines, please add a docstring that covers `monitor` and dry-run behavior.

✍️ Suggested docstring

 def install(
     self,
     software: str,
     execute: bool = False,
     dry_run: bool = False,
     parallel: bool = False,
     json_output: bool = False,
     monitor: bool = False,
 ) -> int:
+    """Install software via the CLI.
+
+    Args:
+        software: Human-readable software description.
+        execute: Execute commands if True.
+        dry_run: Show commands without executing.
+        parallel: Enable parallel execution for multi-step installs.
+        json_output: Emit structured JSON output.
+        monitor: Capture peak resource usage during installation.
+
+    Returns:
+        Exit code (0 for success).
+    """
1689-1696: `--monitor` is silently ignored in parallel installs.

When `parallel=True`, the code path uses `run_parallel_install` without monitoring, so users get no metrics even when `--monitor` is set. Please warn and disable (or add monitoring around the parallel path) to avoid a misleading UX.

🛠️ Minimal safeguard (warn + proceed without monitoring)

     if parallel:
+        if monitor:
+            cx_print(
+                "Monitoring is not yet supported with --parallel. "
+                "Re-run without --parallel to capture peak usage.",
+                "warning",
+            )
         import asyncio
🤖 Fix all issues with AI agents
In `@cortex/coordinator.py`:
- Around line 262-276: The current monitoring startup only catches ImportError,
so runtime failures in ResourceSampler initialization or start will propagate;
update the block that checks self.enable_monitoring to catch Exception (not just
ImportError), log a clear message via self._log including the exception, set
self._sampler = None, and continue; specifically wrap the ResourceSampler
import/creation/start (references: ResourceSampler, self._sampler =
ResourceSampler(...), self._sampler.start, self._sampler.is_running, self._log)
in a try/except Exception handler so monitoring failures are non‑fatal and the
process proceeds with monitoring disabled.
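As a diff-style sketch of the broadened handler, assuming the current code only catches ImportError as described:

-        except ImportError:
-            self._sampler = None
+        except Exception as exc:  # monitoring must never abort the install
+            self._log(f"Resource monitoring unavailable: {exc}")
+            self._sampler = None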
- Around line 78-80: The annotations for self._sampler: ResourceSampler | None
and self._peak_usage: PeakUsage | None are evaluated at runtime but
ResourceSampler and PeakUsage are only imported under TYPE_CHECKING, causing a
NameError; fix by either adding "from __future__ import annotations" at the top
of cortex/coordinator.py to defer annotation evaluation, or change the
annotations in the Coordinator.__init__ to forward-reference strings (e.g.,
"ResourceSampler" and "PeakUsage") so runtime evaluation does not require those
names.
In `@cortex/monitor/monitor_ui.py`:
- Around line 55-68: The constructor stores custom thresholds in self.thresholds
but alerts are generated using sampler.check_alerts(...) which still uses the
sampler's own thresholds; update MonitorUI so the sampler and UI use the same
thresholds—either by setting the sampler's threshold field to the passed-in
AlertThresholds (e.g., assign sampler.thresholds or call
sampler.set_thresholds(self.thresholds) inside MonitorUI.__init__) or by
ensuring calls to sampler.check_alerts pass self.thresholds (e.g.,
sampler.check_alerts(self.thresholds)); update references around
MonitorUI.__init__ and sampler.check_alerts to keep a single source of truth for
thresholds.
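A sketch of the single-source-of-truth assignment in `MonitorUI.__init__`, assuming the sampler stores its thresholds in a `thresholds` attribute:

def __init__(self, sampler, thresholds=None):
    self.sampler = sampler
    self.thresholds = thresholds or AlertThresholds()
    # Keep alerts and bar colors in sync: the sampler's check_alerts()
    # must read the same thresholds the UI renders with.
    self.sampler.thresholds = self.thresholds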
In `@tests/test_monitor.py`:
- Around line 944-953: The tests calling MonitorUI._create_bar (and other tests
using Rich classes like Text/Live) are not guarded against Rich being absent;
update the test functions (e.g., the test_create_bar that instantiates
MonitorUI(ResourceSampler()) and calls ui._create_bar) to skip when Rich isn't
installed by adding a guard at the top of each test — either call
pytest.importorskip("rich") or check monitor_ui.RICH_AVAILABLE and pytest.skip
when False — so the tests only run when Rich is available.
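A sketch of the guard applied to one of the affected tests (the `_create_bar` arguments are assumptions):

import pytest

def test_create_bar() -> None:
    pytest.importorskip("rich")  # skip cleanly when Rich isn't installed
    from cortex.monitor.monitor_ui import MonitorUI
    from cortex.monitor.sampler import ResourceSampler

    ui = MonitorUI(ResourceSampler())
    bar = ui._create_bar("CPU", 50.0)
    assert bar is not None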
🧹 Nitpick comments (1)
tests/test_monitor.py (1)
65-68: Add explicit return annotations to meet typing policy.
`sample_data` uses a bare `list`, and test methods lack `-> None`. Please add explicit return types and apply the same pattern across tests. As per coding guidelines, type hints are required.

✍️ Example adjustments

-def sample_data() -> list:
+def sample_data() -> list["ResourceSample"]:
     """Create sample ResourceSample data for testing."""
     from cortex.monitor.sampler import ResourceSample
@@
-    def test_sample_creation(self):
+    def test_sample_creation(self) -> None:
         """Test creating a ResourceSample."""
Also applies to: 125-126
Anshgrover23
left a comment
@Vaibhav701161 address all bot comments.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1600-1696: `--monitor` is ignored for parallel installs.

When `parallel` is True, the flag never reaches `run_parallel_install`, so users won't get monitoring despite passing `--monitor`. Either wire the flag through (if supported) or explicitly warn/disable it to avoid confusion.

🛠️ Suggested guard (if parallel monitoring isn't supported)

     print(f"\n{t('install.executing')}")

+    if parallel and monitor:
+        cx_print(
+            "Resource monitoring isn't supported in parallel mode yet; running without it.",
+            "warning",
+        )
+        monitor = False
+
     if parallel:
         import asyncio
🤖 Fix all issues with AI agents
In `@cortex/monitor/monitor_ui.py`:
- Around line 66-70: In MonitorUI.__init__, replace assigning
self.sampler.alert_thresholds with setting the sampler's actual thresholds
attribute (e.g., self.sampler.thresholds = self.thresholds or use the sampler's
public setter if one exists) so that the sampler's check_alerts logic (which
reads sampler.thresholds) uses the same AlertThresholds instance as the UI bars;
update MonitorUI.__init__ to assign to sampler.thresholds (or call the
appropriate setter) to keep alerts and UI colors in sync.
In `@tests/test_monitor.py`:
- Around line 699-713: The assertion in test_analyze_borderline_thresholds is a
tautology; replace the current always-true check with a meaningful assertion
that validates analyze_samples behavior: after setting sample cpu_percent=79.0,
ram_percent=79.0, disk_percent=89.0, assert that result.warnings is empty
(assert len(result.warnings) == 0) or, if warnings can be non-critical, assert
that all warnings are non-critical by checking each warning's severity (e.g.,
assert all(w.severity != "critical" for w in result.warnings)); update the test
body in test_analyze_borderline_thresholds and reference analyze_samples and
result.warnings when making the assertion change.
- Around line 965-996: The test currently patches RICH_AVAILABLE after creating
the MonitorUI, so MonitorUI.__init__ already evaluated RICH_AVAILABLE and set
self.console; move the
monkeypatch.setattr("cortex.monitor.monitor_ui.RICH_AVAILABLE", False) to before
constructing the UI (before calling MonitorUI(sampler)) in
test_monitor_ui_fallback so MonitorUI uses the fallback path (or alternatively
construct a new MonitorUI after setting RICH_AVAILABLE to False); keep the
existing mocked sampler methods in place so the rest of the test behavior is
unchanged.
- Around line 956-963: Add a defensive guard to test_create_io_table by calling
pytest.importorskip("rich") at the start of the test before importing
MonitorUI/ResourceSampler; update the test_create_io_table function to import
pytest (if not already) and call pytest.importorskip("rich") so the test skips
cleanly when rich isn't available, then proceed to create
MonitorUI(ResourceSampler()) and call ui._create_io_table(sample_data[0]) as
before.
- Around line 125-146: Add explicit return type hints "-> None" to the test
functions in this diff: update the function definitions for test_sample_creation
and test_sample_defaults in tests/test_monitor.py to include "-> None" (e.g.,
def test_sample_creation() -> None:) so they conform to the project's typing
rules; no other behavior changes are needed.
Anshgrover23
left a comment
@Vaibhav701161 Kindly resolve conflicts.
Working on it.
Force-pushed from ab23682 to 074c536.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pyproject.toml (1)
114-114: Missing `cortex.monitor` in setuptools packages list.

The new `cortex/monitor/` module is not included in the `packages` list. This will cause the monitor module to be excluded from the distributed package.

🔧 Suggested fix

 [tool.setuptools]
-packages = ["cortex", "cortex.sandbox", "cortex.utils", "cortex.llm", "cortex.kernel_features", "cortex.kernel_features.ebpf", "cortex.i18n"]
+packages = ["cortex", "cortex.sandbox", "cortex.utils", "cortex.llm", "cortex.kernel_features", "cortex.kernel_features.ebpf", "cortex.i18n", "cortex.monitor"]

cortex/cli.py (1)
1562-1570: `--monitor` flag is silently ignored when combined with `--parallel`.

The `run_parallel_install()` function (cortex/install_parallel.py:127) does not have an `enable_monitoring` parameter, and the parallel execution path (cortex/cli.py:1712-1714) does not pass the `monitor` flag to it. Meanwhile, the sequential path properly wires `enable_monitoring=monitor` to `InstallationCoordinator` (cortex/cli.py:1794). Either pass monitoring through to the parallel path or explicitly disable/warn when both flags are used together.
🤖 Fix all issues with AI agents
In `@cortex/monitor/monitor_ui.py`:
- Around line 107-110: The loop condition treats duration=0 as falsy, causing
unintended unlimited runs; in the MonitorUI code paths that check duration (the
while self._running loops using self._start_time), change the guard from `if
duration` to `if duration is not None` so explicit 0 is honored as a timeout;
update both occurrences referenced in the diff (the loop around self._start_time
at lines shown and the second loop around lines 140-142) to use `is not None` to
correctly enforce a zero-duration limit.
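The fix is a one-line change per loop, sketched here:

# `if duration:` treats an explicit 0 as "no limit"; compare to None instead.
while self._running:
    if duration is not None and time.time() - self._start_time >= duration:
        break
    # ... render the next frame ...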
In `@cortex/monitor/sampler.py`:
- Around line 163-187: start() sets the flag _running = True before resetting
shared state under _lock, which creates a race where is_running/get_samples()
can observe _running true but see stale _samples/_peak; fix by moving the
assignment of _running (or both clearing _stop_event and setting _running)
inside the same with self._lock: block that resets _samples, _peak, _prev_* and
_cpu_initialized so state reset and running flag are atomic, then create and
start the thread (self._thread = Thread(...); self._thread.start()) after
releasing the lock; ensure get_samples()/is_running() continue to read under the
same lock or use the existing synchronization.
In `@tests/test_monitor.py`:
- Around line 699-714: In test_analyze_borderline_thresholds remove the
duplicated assertion asserting result.summary's type: the test currently
contains two identical lines "assert isinstance(result.summary, str)"; delete
one of them so only a single type check remains in the test function
(test_analyze_borderline_thresholds) before the warnings length assertion.
🧹 Nitpick comments (11)
cortex/monitor/storage.py (3)
65-121: Consider enabling foreign key enforcement for data integrity.

SQLite does not enforce foreign keys by default. The `FOREIGN KEY` constraint on line 106 is defined but won't be enforced unless `PRAGMA foreign_keys = ON` is executed after each connection. While `delete_session` manually handles cascading deletes, this could lead to orphan records if sessions are deleted through other means.

🔧 Suggested fix

 def _ensure_tables(self) -> None:
     """Ensure required tables exist."""
     try:
         with sqlite3.connect(self.db_path) as conn:
             cursor = conn.cursor()
+            # Enable foreign key enforcement
+            cursor.execute("PRAGMA foreign_keys = ON")
             # Create monitor_sessions table

Note: You'll also want to enable this pragma in other methods that connect to the database, or create a helper method for getting connections with FK enforcement enabled.
181-214: Consider using `executemany` for batch inserts.

Inserting samples one-by-one in a loop is inefficient for large sample batches. Using `executemany` would reduce the number of round-trips to the database.

♻️ Suggested optimization

 try:
     with sqlite3.connect(self.db_path) as conn:
         cursor = conn.cursor()
-
-        for sample in samples:
-            cursor.execute(
-                """
-                INSERT INTO resource_metrics
-                (session_id, timestamp, cpu_percent, cpu_count,
-                 ram_used_gb, ram_total_gb, ram_percent,
-                 disk_used_gb, disk_total_gb, disk_percent,
-                 disk_read_rate, disk_write_rate,
-                 net_recv_rate, net_sent_rate)
-                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
-                """,
-                (
-                    session_id,
-                    sample.timestamp,
-                    ...
-                ),
-            )
+        cursor.executemany(
+            """
+            INSERT INTO resource_metrics
+            (session_id, timestamp, cpu_percent, cpu_count,
+             ram_used_gb, ram_total_gb, ram_percent,
+             disk_used_gb, disk_total_gb, disk_percent,
+             disk_read_rate, disk_write_rate,
+             net_recv_rate, net_sent_rate)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            [
+                (
+                    session_id,
+                    s.timestamp,
+                    s.cpu_percent,
+                    s.cpu_count,
+                    s.ram_used_gb,
+                    s.ram_total_gb,
+                    s.ram_percent,
+                    s.disk_used_gb,
+                    s.disk_total_gb,
+                    s.disk_percent,
+                    s.disk_read_rate,
+                    s.disk_write_rate,
+                    s.net_recv_rate,
+                    s.net_sent_rate,
+                )
+                for s in samples
+            ],
+        )
     conn.commit()
326-342: Note: Cumulative byte counters are not persisted.

The `ResourceSample` fields `disk_read_bytes`, `disk_write_bytes`, `net_recv_bytes`, and `net_sent_bytes` are not stored in the schema or restored when retrieving samples. Only the computed rates are persisted. This is likely intentional since rates are sufficient for analysis, but be aware that retrieved samples will have these fields defaulted to 0.

cortex/monitor/analyzer.py (1)

135-159: Potential duplicate recommendations at critical RAM levels.

When `peak.ram_percent >= 95`, the user will receive:
- A critical warning (line 142-143)
- "Memory pressure detected" recommendation (line 143-145)
- "High memory pressure may cause OOM" recommendation (lines 154-157)
The OOM check at 90% overlaps with the critical threshold logic, potentially producing redundant advice about memory pressure and swap.
🔧 Suggested fix
 # Check for potential OOM risk
-if peak.ram_percent >= 90:
+if 90 <= peak.ram_percent < RAM_CRITICAL_THRESHOLD:
     recommendations.append(
         "High memory pressure may cause OOM killer to terminate processes. "
         "Consider adding swap space as a safety buffer."
     )

tests/test_monitor.py (7)
65-94: Consider using a more specific return type for `sample_data` fixture.

The fixture returns `-> list` but could be more precise. Other fixtures use specific types like `Iterator[MagicMock]` and `Iterator[str]`.

💡 Suggested improvement

+from cortex.monitor.sampler import ResourceSample as ResourceSampleType
+
 @pytest.fixture
-def sample_data() -> list:
+def sample_data() -> list[ResourceSampleType]:
     """Create sample ResourceSample data for testing."""

Alternatively, use a string annotation to avoid import issues:

def sample_data() -> "list[ResourceSample]":
173-186: Add `-> None` return type hints to test methods for consistency.

Many test methods across this file are missing the `-> None` return type annotation. Per coding guidelines, type hints are required. The `TestResourceSample` class methods have them (lines 125, 145), but most other test methods do not.

Affected methods include those in `TestPeakUsage`, `TestAlertThresholds`, `TestResourceSampler`, `TestSamplerAlerts`, `TestExporter`, `TestAnalyzer`, `TestStorage`, `TestCLIIntegration`, `TestMonitorUI`, `TestResourceSamplerInternals`, `TestSamplerInternalBehavior`, and standalone test functions.

💡 Example fixes

 class TestPeakUsage:
     """Tests for PeakUsage dataclass."""

-    def test_peak_creation(self):
+    def test_peak_creation(self) -> None:
         """Test creating PeakUsage."""

-    def test_peak_defaults(self):
+    def test_peak_defaults(self) -> None:
         """Test PeakUsage default values."""

Apply similar changes to all other test methods missing the `-> None` annotation. Based on learnings, type hints are required in Python code.
1142-1154: Remove unused `sample_data` fixture parameter.

The `sample_data` fixture is declared but never used in this test. Removing it avoids unnecessary fixture instantiation.

-def test_monitor_ui_stop(sample_data):
+def test_monitor_ui_stop():
     """Test the stop method."""
     from cortex.monitor.monitor_ui import MonitorUI
     from cortex.monitor.sampler import ResourceSampler
884-926: CLI integration tests provide limited coverage of actual CLI parser.

The tests create standalone `argparse.ArgumentParser` instances rather than testing the actual CLI parser from `cortex.cli`. This verifies argument parsing logic but not that the `monitor` command and `--monitor` flag are correctly wired into the real CLI.

Consider adding a test that invokes the actual parser:

def test_actual_cli_monitor_command(self) -> None:
    """Test monitor command is registered in actual CLI parser."""
    from cortex.cli import create_parser  # or however the parser is exposed

    parser = create_parser()
    # Parse monitor subcommand
    args = parser.parse_args(["monitor", "--duration", "10"])
    assert args.command == "monitor"
    assert args.duration == 10

This would provide stronger verification that the CLI wiring works as expected.
866-877: Strengthen assertion for invalid session test.

The assertion `assert count >= 0` is too permissive: it passes for any non-negative value, including a successful save. If the expected behavior is that saving to an invalid session returns 0 samples saved, assert that explicitly.

 # Try to save to non-existent session
 count = storage.save_samples("invalid-uuid", sample_data)
-# Should return 0 or handle gracefully
-assert count >= 0
+# Should return 0 since session doesn't exist
+assert count == 0

Alternatively, verify no samples were actually stored:

assert count == 0
assert len(storage.get_session_samples("invalid-uuid")) == 0
1188-1222: Remove redundant monkeypatch.

The `MonitorUI` class is patched twice: first at line 1203, then immediately overwritten at line 1215. The first patch is unused.

 def test_export_integration(sample_data, temp_export_dir, monkeypatch):
     """Test export functionality in run_standalone_monitor."""
     import os
     from unittest.mock import MagicMock

     from cortex.monitor.monitor_ui import run_standalone_monitor
     from cortex.monitor.sampler import PeakUsage

     export_path = os.path.join(temp_export_dir, "test_export.json")

-    # Mock the UI
-    mock_ui_instance = MagicMock()
-    mock_ui_instance.run.return_value = PeakUsage()
-
-    mock_ui_class = MagicMock(return_value=mock_ui_instance)
-    monkeypatch.setattr("cortex.monitor.monitor_ui.MonitorUI", mock_ui_class)
-
     # Mock the sampler to return our sample data
     mock_sampler = MagicMock()
     mock_sampler.get_samples.return_value = sample_data
1310-1318: Consider using proper `ResourceSample` objects for internal state injection.

Injecting integers (`[1, 2, 3]`) into `_samples` works because `get_sample_count()` only checks the list length, but this could break if the implementation is changed to validate sample types or iterate over samples.

 def test_get_sample_count(self):
     """Sample count should reflect collected samples."""
-    from cortex.monitor.sampler import ResourceSampler
+    from cortex.monitor.sampler import ResourceSample, ResourceSampler

     sampler = ResourceSampler()
-    sampler._samples = [1, 2, 3]  # direct injection
+    # Use minimal valid samples
+    sampler._samples = [
+        ResourceSample(timestamp=i, cpu_percent=50.0, cpu_count=4,
+                       ram_used_gb=8.0, ram_total_gb=16.0, ram_percent=50.0,
+                       disk_used_gb=100.0, disk_total_gb=500.0, disk_percent=20.0)
+        for i in range(3)
+    ]
     assert sampler.get_sample_count() == 3
def start(self) -> None:
    """Start the background sampling thread."""
    if self._running:
        logger.warning("Sampler already running")
        return

    if not PSUTIL_AVAILABLE:
        logger.error("psutil not available, cannot start sampler")
        return

    self._stop_event.clear()
    self._running = True

    # Reset state for new session
    with self._lock:
        self._samples = []
        self._peak = PeakUsage()
        self._prev_disk_io = None
        self._prev_net_io = None
        self._prev_sample_time = None
        self._cpu_initialized = False

    self._thread = threading.Thread(target=self._sample_loop, daemon=True)
    self._thread.start()
    logger.debug(f"Sampler started with interval={self.interval}s")
Minor thread-safety window in start().
Setting _running = True (line 174) before acquiring the lock to reset state (lines 177-183) creates a brief window where is_running returns True but _samples and _peak haven't been reset yet. This is unlikely to cause issues in practice but could lead to stale data being returned if get_samples() is called by another thread immediately after start().
🔧 Suggested fix
def start(self) -> None:
"""Start the background sampling thread."""
if self._running:
logger.warning("Sampler already running")
return
if not PSUTIL_AVAILABLE:
logger.error("psutil not available, cannot start sampler")
return
self._stop_event.clear()
- self._running = True
# Reset state for new session
with self._lock:
self._samples = []
self._peak = PeakUsage()
self._prev_disk_io = None
self._prev_net_io = None
self._prev_sample_time = None
self._cpu_initialized = False
+ self._running = True
self._thread = threading.Thread(target=self._sample_loop, daemon=True)
    self._thread.start()

🤖 Prompt for AI Agents
In `@cortex/monitor/sampler.py` around lines 163 - 187, start() sets the flag
_running = True before resetting shared state under _lock, which creates a race
where is_running/get_samples() can observe _running true but see stale
_samples/_peak; fix by moving the assignment of _running (or both clearing
_stop_event and setting _running) inside the same with self._lock: block that
resets _samples, _peak, _prev_* and _cpu_initialized so state reset and running
flag are atomic, then create and start the thread (self._thread = Thread(...);
self._thread.start()) after releasing the lock; ensure
get_samples()/is_running() continue to read under the same lock or use the
existing synchronization.
@Vaibhav701161 Please address this one.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Anshgrover23
left a comment
@Vaibhav701161 Kindly resolve conflicts.
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Fix PR review comments across CLI, Monitor, and Tests
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Fix PR review comments by sonarqube
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Fix PR review comments by sonarqube-2
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
docs(monitor): document system resource monitoring usage and examples
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
[autofix.ci] apply automated fixes
fix: address CodeRabbit review comments
[autofix.ci] apply automated fixes
fix: address CodeRabbit review comments-2
fixed coderabbit reviews
Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>
Force-pushed from 83516f8 to 76177cb.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1699-1798: Warn or block `--monitor` when `--parallel` is used.

`monitor` is only applied to the sequential coordinator path; in parallel mode it silently does nothing, which is confusing for users. Consider warning and continuing without monitoring (or explicitly rejecting the combination).

🔧 Suggested guard

+    if parallel and monitor:
+        cx_print(
+            "Monitoring is not supported with --parallel yet; continuing without monitoring.",
+            "warning",
+        )
     if parallel:
🧹 Nitpick comments (8)
cortex/monitor/storage.py (3)
48-48: Move `os` import to top of file.

The `os` import on line 48 is inside a method. Per PEP 8, imports should be at the top of the file with the other imports (lines 11-17).

Suggested fix

 import json
 import logging
+import os
 import sqlite3
 import uuid
 from datetime import datetime

And remove line 48:

 def _get_db_path(self) -> str:
     """Get the database path, with fallback to user directory."""
-    import os
-
     db_path = Path(DEFAULT_DB_PATH)
185-212: Consider using `executemany` for batch inserts.

The current loop inserts samples one by one. For sessions with many samples, `executemany` would be more efficient by reducing SQLite round-trips.

Suggested optimization

-for sample in samples:
-    cursor.execute(
-        """
-        INSERT INTO resource_metrics
-        (session_id, timestamp, cpu_percent, cpu_count,
-         ram_used_gb, ram_total_gb, ram_percent,
-         disk_used_gb, disk_total_gb, disk_percent,
-         disk_read_rate, disk_write_rate,
-         net_recv_rate, net_sent_rate)
-        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
-        """,
-        (
-            session_id,
-            sample.timestamp,
-            ...
-        ),
-    )
+cursor.executemany(
+    """
+    INSERT INTO resource_metrics
+    (session_id, timestamp, cpu_percent, cpu_count,
+     ram_used_gb, ram_total_gb, ram_percent,
+     disk_used_gb, disk_total_gb, disk_percent,
+     disk_read_rate, disk_write_rate,
+     net_recv_rate, net_sent_rate)
+    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+    """,
+    [
+        (
+            session_id,
+            s.timestamp,
+            s.cpu_percent,
+            s.cpu_count,
+            s.ram_used_gb,
+            s.ram_total_gb,
+            s.ram_percent,
+            s.disk_used_gb,
+            s.disk_total_gb,
+            s.disk_percent,
+            s.disk_read_rate,
+            s.disk_write_rate,
+            s.net_recv_rate,
+            s.net_sent_rate,
+        )
+        for s in samples
+    ],
+)
67-68: Enable foreign key enforcement for data integrity.

SQLite foreign keys are disabled by default. While `delete_session` manually handles cascading deletes, enabling FK enforcement provides an additional safety net.

Suggested fix

 try:
     with sqlite3.connect(self.db_path) as conn:
+        conn.execute("PRAGMA foreign_keys = ON")
         cursor = conn.cursor()

cortex/monitor/analyzer.py (2)
140-158: Potential duplicate recommendation for high RAM usage.

When `ram_percent >= 95`, both the critical warning (lines 142-145) and the OOM risk recommendation (lines 153-157) will trigger, resulting in two similar memory-related messages. Consider combining them or adjusting the OOM threshold.

Suggested consolidation

 # Check peak RAM
 if peak.ram_percent >= RAM_CRITICAL_THRESHOLD:
     warnings.append(f"⚠️ RAM reached critical levels ({peak.ram_percent:.0f}%)")
     recommendations.append(
-        "Memory pressure detected. Consider increasing RAM or enabling swap."
+        "Memory pressure detected. Consider increasing RAM or enabling swap. "
+        "High memory pressure may cause OOM killer to terminate processes."
     )
 elif peak.ram_percent >= RAM_HIGH_THRESHOLD:
     recommendations.append(
         f"Memory usage was high ({peak.ram_percent:.0f}%). "
         "Close unused applications during installations."
     )
-
-# Check for potential OOM risk
-if peak.ram_percent >= 90:
-    recommendations.append(
-        "High memory pressure may cause OOM killer to terminate processes. "
-        "Consider adding swap space as a safety buffer."
-    )
+elif peak.ram_percent >= 90:
+    recommendations.append(
+        "Memory pressure approaching critical levels. "
+        "Consider adding swap space as a safety buffer."
+    )
187-196: Network analyzer is a no-op placeholder.

The `_analyze_network` function currently does nothing useful (the high throughput check passes silently). Consider either removing it or adding a TODO comment explaining future plans.

Suggested documentation

 def _analyze_network(_samples: list[ResourceSample], peak: PeakUsage) -> dict[str, list[str]]:
-    """Analyze network usage patterns."""
+    """Analyze network usage patterns.
+
+    Currently informational only. Network throughput issues are typically
+    external (ISP, server-side) rather than actionable on the client.
+    """
     recommendations = []

     # High network throughput is generally fine, just informational
-    if peak.net_recv_rate_max >= 50 * 1024 * 1024:  # 50 MB/s
-        # This is actually good - fast downloads
-        pass
+    # TODO: Consider adding recommendations for unusually low throughput
+    # that might indicate network issues during package downloads.

     return {"recommendations": recommendations}

tests/test_monitor.py (3)
65-66: Specify return type for `sample_data` fixture.

The return type annotation is `list` but should be `list[ResourceSample]` for clarity and type checker support. Based on learnings, type hints are required.

Suggested fix

+from cortex.monitor.sampler import ResourceSample
+
 @pytest.fixture
-def sample_data() -> list:
+def sample_data() -> list[ResourceSample]:
     """Create sample ResourceSample data for testing."""
-    from cortex.monitor.sampler import ResourceSample

Note: Move the import to the top of the file with other imports, or use a forward reference string `"list[ResourceSample]"` if you prefer lazy imports in test fixtures.
173-185: Add `-> None` return type hints to test methods.

Several test methods in `TestPeakUsage`, `TestAlertThresholds`, `TestResourceSampler`, and `TestSamplerAlerts` are missing return type annotations. Per coding guidelines, type hints are required.

Example fixes

-    def test_peak_creation(self):
+    def test_peak_creation(self) -> None:
         """Test creating PeakUsage."""

-    def test_peak_defaults(self):
+    def test_peak_defaults(self) -> None:
         """Test PeakUsage default values."""

-    def test_default_thresholds(self):
+    def test_default_thresholds(self) -> None:
         """Test default alert thresholds."""

-    def test_sampler_creation(self):
+    def test_sampler_creation(self) -> None:
         """Test creating a ResourceSampler."""

Apply similar changes to all test methods in classes `TestPeakUsage`, `TestAlertThresholds`, `TestResourceSampler`, and `TestSamplerAlerts`.
302-308: Mutating fixture data in tests.

Tests `test_check_alerts_cpu_warning` and `test_check_alerts_ram_critical` directly modify `sample_data` elements (e.g., `sample.cpu_percent = 85.0`). While safe here because `sample_data` creates a new list per test, this pattern can cause subtle bugs. Consider creating a modified copy instead.

Alternative pattern:

```python
def test_check_alerts_cpu_warning(self, sample_data):
    """Test CPU warning alert."""
    from dataclasses import replace

    from cortex.monitor.sampler import AlertThresholds, ResourceSampler

    thresholds = AlertThresholds(cpu_warning=80.0)
    sampler = ResourceSampler(alert_thresholds=thresholds)

    # Create a modified copy instead of mutating the fixture
    sample = replace(sample_data[9], cpu_percent=85.0)
    alerts = sampler.check_alerts(sample)
    # ...
```

Also applies to: 318-324
@Anshgrover23 done, resolved all conflicts + all tests are passing now!
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cortex/cli.py (1)
1791-1816: `--monitor` is ignored in `--parallel` mode.

Monitoring is only threaded into the sequential `InstallationCoordinator`; the parallel path never consumes `monitor`, so users get no peak metrics when `--parallel` is set. Either wire monitoring into the parallel flow or block the combination with a clear message.

✅ Possible guard (if parallel monitoring isn’t supported yet):
```diff
         if execute:
+            if parallel and monitor:
+                self._print_error("Monitoring is not supported with --parallel yet.")
+                return 1
```
🤖 Fix all issues with AI agents
In `@cortex/cli.py`:

- Around lines 5000-5035: The monitor parser should reject non-positive duration/interval values. Add two argparse type-validator helpers (e.g., `_positive_int` and `_positive_float`) and use them when adding the `--duration`/`-d` and `--interval`/`-i` arguments on `monitor_parser`: replace `type=int` with `type=_positive_int` for duration and `type=float` with `type=_positive_float` for interval, keeping the default for interval at 1.0. Ensure the validators raise `argparse.ArgumentTypeError` on values <= 0 so invalid inputs are rejected at parse time.
🧹 Nitpick comments (1)
cortex/cli.py (1)
1564-1573: Add a docstring for the public `install` API and document `monitor`.

This method is public and now has a new parameter; please document it to align with project standards. As per coding guidelines, please add docstrings for public APIs.

📝 Suggested docstring:
```diff
     def install(
         self,
         software: str,
         execute: bool = False,
         dry_run: bool = False,
         max_retries: int = DEFAULT_MAX_RETRIES,
         monitor: bool = False,
         parallel: bool = False,
         json_output: bool = False,
     ) -> int:
+        """Install software requested by the user.
+
+        Args:
+            software: Human-readable software request.
+            execute: Whether to run commands.
+            dry_run: Preview commands without executing.
+            max_retries: Max retries for failed steps.
+            monitor: Enable resource monitoring during execution.
+            parallel: Enable parallel execution.
+            json_output: Emit JSON response payload.
+
+        Returns:
+            Exit code (0 for success).
+        """
```
```python
install_parser.add_argument(
    "--monitor",
    action="store_true",
    help="Monitor system resources during installation",
)
```

```python
# Monitor command - real-time system resource monitoring
# Note: Monitoring is client-side using psutil. Daemon integration is intentionally
# out of scope to keep the feature self-contained and avoid cortexd dependencies.
monitor_parser = subparsers.add_parser(
    "monitor",
    help="Monitor system resource usage",
    description="Track CPU, RAM, Disk, and Network usage in real-time.",
)
monitor_parser.add_argument(
    "--duration",
    "-d",
    type=int,
    metavar="SECONDS",
    help="Run for fixed duration (seconds); omit for continuous monitoring",
)
monitor_parser.add_argument(
    "--interval",
    "-i",
    type=float,
    default=1.0,
    metavar="SECONDS",
    help="Sampling interval in seconds (default: 1.0)",
)
monitor_parser.add_argument(
    "--export",
    "-e",
    type=str,
    metavar="FILE",
    help="Export metrics to file (JSON or CSV). Experimental feature.",
)
```
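Taken together, these options support invocations such as `cortex monitor` (continuous, 1 s sampling), `cortex monitor --duration 60 --interval 0.5`, or `cortex monitor -d 30 -e metrics.json`; the values and file name here are illustrative.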
Validate monitor duration/interval to reject non‑positive values.
`--interval 0` (or negative) can cause a tight loop or runtime errors; `--duration 0` is also confusing. Add simple argparse validators to enforce positive numbers.
🔧 Suggested argparse validation
```diff
     monitor_parser.add_argument(
         "--duration",
         "-d",
-        type=int,
+        type=_positive_int,
         metavar="SECONDS",
         help="Run for fixed duration (seconds); omit for continuous monitoring",
     )
     monitor_parser.add_argument(
         "--interval",
         "-i",
-        type=float,
+        type=_positive_float,
         default=1.0,
         metavar="SECONDS",
         help="Sampling interval in seconds (default: 1.0)",
     )
```

You can define the helpers near the parser setup:
```python
def _positive_int(value: str) -> int:
    v = int(value)
    if v <= 0:
        raise argparse.ArgumentTypeError("Value must be > 0")
    return v


def _positive_float(value: str) -> float:
    v = float(value)
    if v <= 0:
        raise argparse.ArgumentTypeError("Value must be > 0")
    return v
```



Related Issue
Closes #51
Summary
This PR introduces system resource monitoring to Cortex, providing visibility into CPU, RAM, Disk, and Network usage both standalone and during installations.
What’s included
- Standalone monitoring: `cortex monitor` launches a real-time TUI dashboard
- Install-time monitoring: `cortex install <package> --monitor`
- Persistent storage
- Advisory analysis
- Graceful fallbacks when `psutil` or `rich` is unavailable
- Memory-safe design
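For example (package name illustrative), `cortex install docker --monitor` runs the installation while sampling resources in the background and reports peak usage when it finishes.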
Implementation Overview
New `cortex/monitor/` module:

- `sampler.py` – thread-safe resource sampling via `psutil`
- `monitor_ui.py` – real-time TUI using `rich.Live`
- `storage.py` – SQLite-backed persistence
- `exporter.py` – JSON / CSV export
- `analyzer.py` – advisory recommendations
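For context, the per-sample metrics the sampler tracks map onto plain `psutil` calls like the ones below. This is a generic sketch of the measurement approach, not this PR's actual implementation:

```python
import psutil

# One snapshot of the four resource classes the monitor tracks.
cpu_percent = psutil.cpu_percent(interval=None)  # CPU % since the previous call
mem = psutil.virtual_memory()                    # RAM: mem.percent, mem.used
disk = psutil.disk_usage("/")                    # Disk utilisation of the root mount
net = psutil.net_io_counters()                   # Cumulative network byte counters

print(
    f"CPU {cpu_percent:.0f}% | RAM {mem.percent:.0f}% "
    f"({mem.used / 2**30:.1f} GiB used) | Disk {disk.percent:.0f}% | "
    f"Net rx {net.bytes_recv / 2**20:.0f} MiB total"
)
```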
CLI integration:

- `monitor` command
- `--monitor` flag wired through `install` → `InstallationCoordinator`

Backward-compatible extension of `InstallationResult` to include peak metrics.

Screenshots / recordings: Implementation · Export functionality · Monitoring Installation · Stress Check (cortex.stress.mp4)
Testing
- Comprehensive unit tests for the `cortex/monitor/` components
- Manual verification
Coverage target met (>80% for `cortex/monitor/`).

(Recording: cortex.test.coverage.mp4)
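To reproduce the coverage check locally, something like `pytest tests/test_monitor.py --cov=cortex.monitor --cov-report=term-missing` should work, assuming `pytest-cov` is installed; the exact invocation used in CI is not shown here.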
Non-Goals (Intentional)

- Daemon (`cortexd`) integration: monitoring is deliberately client-side via `psutil` to keep the feature self-contained
AI Disclosure
Used AI assistance (Google Antigravity) during development.
All code, design decisions, and final implementations were reviewed, validated, and iterated manually.
Checklist
- PR title follows conventional commits: `feat(monitor): add system resource monitoring`
- All tests pass (`pytest tests/`)

Summary by CodeRabbit
New Features
Tests
Chores
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.