Template

Files

lmiranda e5ca804692 feat: v3.0.0 architecture overhaul

- Rename marketplace to lm-claude-plugins
- Move MCP servers to root with symlinks
- Add 6 PR tools to Gitea MCP (list_pull_requests, get_pull_request,
  get_pr_diff, get_pr_comments, create_pr_review, add_pr_comment)
- Add clarity-assist plugin (prompt optimization with ND accommodations)
- Add git-flow plugin (workflow automation)
- Add pr-review plugin (multi-agent review with confidence scoring)
- Centralize configuration docs
- Update all documentation for v3.0.0

BREAKING CHANGE: MCP server paths changed, marketplace renamed

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-20 16:56:53 -05:00

3.3 KiB

Raw Blame History

Confidence Scoring for PR Review

Purpose

Confidence scoring ensures that review findings are calibrated and actionable. By filtering out low-confidence findings, we reduce noise and focus reviewer attention on real issues.

Score Ranges

Range	Label	Meaning	Action
0.9 - 1.0	HIGH	Definite issue	Must address
0.7 - 0.89	MEDIUM	Likely issue	Should address
0.5 - 0.69	LOW	Possible concern	Consider addressing
< 0.5	SUPPRESSED	Uncertain	Don't report

Scoring Factors

Positive Factors (Increase Confidence)

Factor	Impact
Clear data flow from source to sink	+0.3
Pattern matches known vulnerability	+0.2
No intervening validation visible	+0.2
Matches OWASP Top 10	+0.15
Found in security-sensitive context	+0.1

Negative Factors (Decrease Confidence)

Factor	Impact
Validation might exist elsewhere	-0.2
Depends on runtime configuration	-0.15
Pattern is common but often safe	-0.15
Requires multiple conditions to exploit	-0.1
Theoretical impact only	-0.1

Calibration Guidelines

Security Issues

Base confidence by pattern:

SQL string concatenation with user input: 0.95
Hardcoded credentials: 0.9
Missing auth check: 0.8
Generic error exposure: 0.6
Missing rate limiting: 0.5

Performance Issues

Base confidence by pattern:

Clear N+1 in loop: 0.9
SELECT * on large table: 0.7
Missing index on filtered column: 0.6
Suboptimal algorithm: 0.5

Maintainability Issues

Base confidence by pattern:

Function >100 lines: 0.8
Deep nesting >4 levels: 0.75
Duplicate code blocks: 0.7
Unclear naming: 0.6
Minor style issues: 0.3 (suppress)

Test Coverage

Base confidence by pattern:

No test file for new module: 0.9
Security function untested: 0.85
Edge case not covered: 0.6
Simple getter untested: 0.3 (suppress)

Threshold Configuration

The default threshold is 0.5. This can be adjusted:

PR_REVIEW_CONFIDENCE_THRESHOLD=0.7  # Only high-confidence
PR_REVIEW_CONFIDENCE_THRESHOLD=0.3  # Include more speculative

Example Scoring

High Confidence (0.95)

// Clear SQL injection
const query = `SELECT * FROM users WHERE id = ${req.params.id}`;

User input (req.params.id): +0.3
Direct to SQL query: +0.3
No visible validation: +0.2
Matches OWASP Top 10: +0.15
Total: 0.95

Medium Confidence (0.72)

// Possible performance issue
users.forEach(async (user) => {
  const orders = await db.orders.find({ userId: user.id });
});

Loop with query: +0.3
Pattern matches N+1: +0.2
But might be small dataset: -0.15
Could have caching: -0.1
Total: 0.72

Low Confidence (0.55)

// Maybe too complex?
function processOrder(order, user, items, discounts, shipping) {
  // 60 lines of logic
}

Function is long: +0.2
Many parameters: +0.15
But might be intentional: -0.1
Could be refactored later: -0.1
Total: 0.55

Suppressed (0.35)

// Minor style preference
const x = foo ? bar : baz;

Ternary could be if/else: +0.1
Very common pattern: -0.2
No real impact: -0.1
Style preference: -0.1
Total: 0.35 (suppressed)

3.3 KiB Raw Blame History

Confidence Scoring for PR Review

Purpose

Score Ranges

Scoring Factors

Positive Factors (Increase Confidence)

Negative Factors (Decrease Confidence)

Calibration Guidelines

Security Issues

Performance Issues

Maintainability Issues

Test Coverage

Threshold Configuration

Example Scoring

High Confidence (0.95)

Medium Confidence (0.72)

Low Confidence (0.55)

Suppressed (0.35)

3.3 KiB

Raw Blame History