lmiranda 9698e8724d feat(plugins): implement Sprint 4 commands (#241-#258)

Sprint 4 - Plugin Commands implementation adding 18 new user-facing
commands across 8 plugins as part of V5.2.0 Plugin Enhancements.

**projman:**
- #241: /sprint-diagram - Mermaid visualization of sprint issues

**pr-review:**
- #242: Confidence threshold config (PR_REVIEW_CONFIDENCE_THRESHOLD)
- #243: /pr-diff - Formatted diff with inline review comments

**data-platform:**
- #244: /data-quality - DataFrame quality checks (nulls, duplicates, outliers)
- #245: /lineage-viz - dbt lineage as Mermaid diagrams
- #246: /dbt-test - Formatted dbt test runner

**viz-platform:**
- #247: /chart-export - Export charts to PNG/SVG/PDF via kaleido
- #248: /accessibility-check - Color blind validation (WCAG contrast)
- #249: /breakpoints - Responsive layout configuration

**contract-validator:**
- #250: /dependency-graph - Plugin dependency visualization

**doc-guardian:**
- #251: /changelog-gen - Generate changelog from conventional commits
- #252: /doc-coverage - Documentation coverage metrics
- #253: /stale-docs - Flag outdated documentation

**claude-config-maintainer:**
- #254: /config-diff - Track CLAUDE.md changes over time
- #255: /config-lint - 31 lint rules for CLAUDE.md best practices

**cmdb-assistant:**
- #256: /cmdb-topology - Infrastructure topology diagrams
- #257: /change-audit - NetBox audit trail queries
- #258: /ip-conflicts - Detect IP conflicts and overlaps

Closes #241, #242, #243, #244, #245, #246, #247, #248, #249,
#250, #251, #252, #253, #254, #255, #256, #257, #258

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-28 12:02:26 -05:00

Confidence Scoring for PR Review

Purpose

Confidence scoring ensures that review findings are calibrated and actionable. By filtering out low-confidence findings, we reduce noise and focus reviewer attention on real issues.

Score Ranges

Range	Label	Meaning	Action
0.9 - 1.0	HIGH	Definite issue	Must address
0.7 - 0.89	MEDIUM	Likely issue	Should address
0.5 - 0.69	LOW	Possible concern	Consider addressing
< 0.5	SUPPRESSED	Uncertain	Don't report
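The table above can be sketched as a small lookup. This is only an illustration: `SCORE_BANDS` and `classifyScore` are hypothetical names, not part of the pr-review plugin, and the sketch assumes each band is half-open, so a score such as 0.895 falls into MEDIUM.

```javascript
// Hypothetical helper mirroring the Score Ranges table.
// Bands are listed from highest to lowest lower bound.
const SCORE_BANDS = [
  { min: 0.9, label: "HIGH", action: "Must address" },
  { min: 0.7, label: "MEDIUM", action: "Should address" },
  { min: 0.5, label: "LOW", action: "Consider addressing" },
];

function classifyScore(score) {
  if (typeof score !== "number" || Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be a number in [0, 1], got ${score}`);
  }
  // The first band whose lower bound is met wins.
  const band = SCORE_BANDS.find((b) => score >= b.min);
  // Anything below 0.5 is suppressed (assumed lower bound of 0).
  return band ?? { min: 0, label: "SUPPRESSED", action: "Don't report" };
}

console.log(classifyScore(0.72).label);
```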

Scoring Factors

Positive Factors (Increase Confidence)

Factor	Impact
Clear data flow from source to sink	+0.3
Pattern matches known vulnerability	+0.2
No intervening validation visible	+0.2
Matches OWASP Top 10	+0.15
Found in security-sensitive context	+0.1

Negative Factors (Decrease Confidence)

Factor	Impact
Validation might exist elsewhere	-0.2
Depends on runtime configuration	-0.15
Pattern is common but often safe	-0.15
Requires multiple conditions to exploit	-0.1
Theoretical impact only	-0.1
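One way to apply these factors is as additive adjustments to a starting confidence. The document does not say how factors are combined, so the additive model, the clamping to [0, 1], and the names `FACTOR_IMPACTS` and `adjustConfidence` are all assumptions in this sketch. Only the impact values come from the tables above.

```javascript
// Impact values copied from the factor tables; the key names are hypothetical.
const FACTOR_IMPACTS = new Map([
  ["clearDataFlow", 0.3],
  ["knownVulnerabilityPattern", 0.2],
  ["noVisibleValidation", 0.2],
  ["owaspTop10", 0.15],
  ["securitySensitiveContext", 0.1],
  ["validationMayExistElsewhere", -0.2],
  ["dependsOnRuntimeConfig", -0.15],
  ["commonButOftenSafe", -0.15],
  ["needsMultipleConditions", -0.1],
  ["theoreticalImpactOnly", -0.1],
]);

// Assumption: factors are added to a starting confidence and the result
// is clamped to [0, 1]. The document does not specify the exact formula.
function adjustConfidence(base, factors) {
  let score = base;
  for (const name of factors) {
    if (!FACTOR_IMPACTS.has(name)) {
      throw new Error(`unknown factor: ${name}`);
    }
    score += FACTOR_IMPACTS.get(name);
  }
  const clamped = Math.min(1, Math.max(0, score));
  // Round to two decimals to hide binary floating-point noise.
  return Math.round(clamped * 100) / 100;
}

console.log(adjustConfidence(0.5, ["dependsOnRuntimeConfig"]));
```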

Calibration Guidelines

Security Issues

Base confidence by pattern:

SQL string concatenation with user input: 0.95
Hardcoded credentials: 0.9
Missing auth check: 0.8
Generic error exposure: 0.6
Missing rate limiting: 0.5

Performance Issues

Base confidence by pattern:

Clear N+1 in loop: 0.9
SELECT * on large table: 0.7
Missing index on filtered column: 0.6
Suboptimal algorithm: 0.5

Maintainability Issues

Base confidence by pattern:

Function >100 lines: 0.8
Deep nesting >4 levels: 0.75
Duplicate code blocks: 0.7
Unclear naming: 0.6
Minor style issues: 0.3 (suppress)

Test Coverage

Base confidence by pattern:

No test file for new module: 0.9
Security function untested: 0.85
Edge case not covered: 0.6
Simple getter untested: 0.3 (suppress)
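The base confidences above could be kept in a single lookup keyed by category and pattern. The values below are copied from the four lists; the key names and the `baseConfidence` helper are hypothetical and chosen only for illustration.

```javascript
// Base confidences copied from the calibration lists; key names are hypothetical.
const BASE_CONFIDENCE = {
  security: {
    "sql-concat-user-input": 0.95,
    "hardcoded-credentials": 0.9,
    "missing-auth-check": 0.8,
    "generic-error-exposure": 0.6,
    "missing-rate-limiting": 0.5,
  },
  performance: {
    "n-plus-one-in-loop": 0.9,
    "select-star-large-table": 0.7,
    "missing-index-filtered-column": 0.6,
    "suboptimal-algorithm": 0.5,
  },
  maintainability: {
    "function-over-100-lines": 0.8,
    "nesting-over-4-levels": 0.75,
    "duplicate-code-blocks": 0.7,
    "unclear-naming": 0.6,
    "minor-style": 0.3, // below 0.5, so suppressed
  },
  testCoverage: {
    "no-test-file-for-new-module": 0.9,
    "security-function-untested": 0.85,
    "edge-case-not-covered": 0.6,
    "simple-getter-untested": 0.3, // below 0.5, so suppressed
  },
};

function baseConfidence(category, pattern) {
  const value = BASE_CONFIDENCE[category]?.[pattern];
  if (typeof value !== "number") {
    throw new Error(`no base confidence for ${category}/${pattern}`);
  }
  return value;
}

console.log(baseConfidence("performance", "n-plus-one-in-loop"));
```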

Threshold Configuration

The default threshold is 0.7 (showing MEDIUM and HIGH confidence findings). This can be adjusted:

PR_REVIEW_CONFIDENCE_THRESHOLD=0.9  # Only definite issues (HIGH)
PR_REVIEW_CONFIDENCE_THRESHOLD=0.7  # Likely issues and above (MEDIUM+HIGH) - default
PR_REVIEW_CONFIDENCE_THRESHOLD=0.5  # Include possible concerns (LOW+)
PR_REVIEW_CONFIDENCE_THRESHOLD=0.3  # Also include speculative findings that are otherwise suppressed
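A minimal sketch of how the threshold might be applied, assuming `PR_REVIEW_CONFIDENCE_THRESHOLD` is read as an environment variable and that each finding carries a numeric `confidence` field. Both are assumptions, and `parseThreshold` and `filterFindings` are hypothetical names rather than the plugin's actual API.

```javascript
const DEFAULT_THRESHOLD = 0.7; // default stated in this document

// Parse the raw setting, falling back to the default when it is unset.
function parseThreshold(raw) {
  if (raw === undefined || raw.trim() === "") return DEFAULT_THRESHOLD;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0 || value > 1) {
    throw new RangeError(`threshold must be a number in [0, 1], got "${raw}"`);
  }
  return value;
}

// Keep only findings at or above the threshold.
// Assumption: each finding has a numeric `confidence` field.
function filterFindings(findings, threshold) {
  return findings.filter((finding) => finding.confidence >= threshold);
}

// Assumption: the setting is exposed through the process environment.
const threshold = parseThreshold(process.env.PR_REVIEW_CONFIDENCE_THRESHOLD);
const sample = [
  { title: "SQL injection", confidence: 0.95 },
  { title: "Possible N+1 query", confidence: 0.72 },
  { title: "Long function", confidence: 0.55 },
  { title: "Ternary style", confidence: 0.35 },
];
console.log(filterFindings(sample, threshold).map((f) => f.title));
```

Validating the value up front means a typo such as `0,7` fails loudly instead of silently changing which findings are reported.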

Example Scoring

High Confidence (0.95)

// Clear SQL injection
const query = `SELECT * FROM users WHERE id = ${req.params.id}`;

User input (req.params.id): +0.3
Direct to SQL query: +0.3
No visible validation: +0.2
Matches OWASP Top 10: +0.15
Total: 0.95
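The breakdown above can be reproduced mechanically. In this sketch the reasons and impacts are taken from the list; `totalConfidence` and `formatBreakdown` are hypothetical helpers, and rounding to two decimals is an assumption added to keep floating-point noise out of the reported total.

```javascript
// Contributions copied from the SQL injection example.
const contributions = [
  { reason: "User input (req.params.id)", impact: 0.3 },
  { reason: "Direct to SQL query", impact: 0.3 },
  { reason: "No visible validation", impact: 0.2 },
  { reason: "Matches OWASP Top 10", impact: 0.15 },
];

// Sum the impacts and round to two decimals.
function totalConfidence(items) {
  const sum = items.reduce((acc, item) => acc + item.impact, 0);
  return Math.round(sum * 100) / 100;
}

// Render one "reason: +impact" line per contribution, then the total.
function formatBreakdown(items) {
  const signed = (n) => (n >= 0 ? `+${n}` : `${n}`);
  const lines = items.map((item) => `${item.reason}: ${signed(item.impact)}`);
  lines.push(`Total: ${totalConfidence(items)}`);
  return lines.join("\n");
}

console.log(formatBreakdown(contributions));
```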

Medium Confidence (0.72)

// Possible performance issue
users.forEach(async (user) => {
  const orders = await db.orders.find({ userId: user.id });
});

Loop with query: +0.3
Pattern matches N+1: +0.2
But might be small dataset: -0.15
Could have caching: -0.1
Total: 0.72 (the adjustments shown net to +0.25; the rest of the score is not itemized here)

Low Confidence (0.55)

// Maybe too complex?
function processOrder(order, user, items, discounts, shipping) {
  // 60 lines of logic
}

Function is long: +0.2
Many parameters: +0.15
But might be intentional: -0.1
Could be refactored later: -0.1
Total: 0.55 (the adjustments shown net to +0.15; the rest of the score is not itemized here)

Suppressed (0.35)

// Minor style preference
const x = foo ? bar : baz;

Ternary could be if/else: +0.1
Very common pattern: -0.2
No real impact: -0.1
Style preference: -0.1
Total: 0.35 (suppressed; the adjustments shown net to -0.3, and the rest of the score is not itemized here)
