feat: implement production-ready features from improvement plan phases 2.5 & 2.6

Phase 2.5: Fix Foundation (CRITICAL)
- Fixed 4 failing tests by adding a cache attribute to the mock_client fixture
- Created comprehensive cache tests for the Pages endpoint (test_pages_cache.py)
- Added missing core dependencies: pydantic[email] and aiohttp
- Updated requirements.txt with proper dependency versions
- Achieved 82.67% test coverage with 454 passing tests

Phase 2.6: Production Essentials
- Implemented structured logging (wikijs/logging.py; see the sketch below)
  * JSON and text log formatters
  * Configurable log levels and output destinations
  * Integration with client operations
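wikijs/logging.py itself is not included in this file view. A minimal sketch of the JSON/text formatter behavior described above, where the names JSONFormatter and setup_logging are assumptions rather than the SDK's confirmed API:

import json
import logging
import sys


class JSONFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def setup_logging(level: int = logging.INFO, json_output: bool = True) -> logging.Logger:
    """Configure the SDK logger with JSON or plain-text output."""
    logger = logging.getLogger("wikijs")
    handler = logging.StreamHandler(sys.stderr)
    if json_output:
        handler.setFormatter(JSONFormatter())
    else:
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger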

- Implemented metrics and telemetry (wikijs/metrics.py; full file shown below)
  * Request tracking with duration, status codes, errors
  * Latency percentiles (min, max, avg, p50, p95, p99)
  * Error rate calculation
  * Thread-safe metrics collection

- Implemented rate limiting (wikijs/ratelimit.py; see the sketch below)
  * Token bucket algorithm for request throttling
  * Per-endpoint rate limiting support
  * Configurable timeout handling
  * Burst capacity management
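wikijs/ratelimit.py is not included in this file view. A minimal token-bucket sketch along the lines listed above; the class name, constructor parameters, and acquire signature are assumptions, not the SDK's confirmed API:

import threading
import time
from typing import Optional


class TokenBucket:
    """Allow `rate` requests per second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = capacity        # start full
        self.last_refill = time.monotonic()
        self._lock = threading.Lock()

    def acquire(self, timeout: Optional[float] = None) -> bool:
        """Take one token, waiting up to `timeout` seconds; False on timeout."""
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            with self._lock:
                now = time.monotonic()
                # Refill in proportion to elapsed time, capped at capacity.
                self.tokens = min(
                    self.capacity,
                    self.tokens + (now - self.last_refill) * self.rate,
                )
                self.last_refill = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return True
            if deadline is not None and time.monotonic() >= deadline:
                return False
            time.sleep(min(0.05, 1.0 / self.rate))  # back off briefly before retrying

Per-endpoint limiting would then amount to a dict mapping each endpoint name to its own TokenBucket instance.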

- Created SECURITY.md policy
  * Vulnerability reporting procedures
  * Security best practices
  * Response timelines
  * Supported versions

Documentation
- Added comprehensive logging guide (docs/logging.md)
- Added metrics and telemetry guide (docs/metrics.md)
- Added rate limiting guide (docs/rate_limiting.md)
- Updated README.md with production features section
- Updated IMPROVEMENT_PLAN_2.md with completed checkboxes

Testing
- Created test suite for logging (tests/test_logging.py)
- Created test suite for metrics (tests/test_metrics.py; representative sketch below)
- Created test suite for rate limiting (tests/test_ratelimit.py)
- All 454 tests passing
- Test coverage: 82.67%
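As a representative sketch of such a test (the actual contents of tests/test_metrics.py are not shown here; this exercises only the MetricsCollector from wikijs/metrics.py, shown in full below):

from wikijs.metrics import MetricsCollector


def test_record_request_updates_counters_and_latency():
    collector = MetricsCollector()
    collector.record_request("pages", "GET", 200, duration_ms=12.5)
    collector.record_request("pages", "GET", 500, duration_ms=40.0, error="boom")

    stats = collector.get_stats()
    assert stats["total_requests"] == 2
    assert stats["total_errors"] == 1
    assert stats["error_rate"] == 50.0
    assert stats["latency"]["min"] == 12.5
    assert stats["latency"]["max"] == 40.0


def test_reset_clears_all_metrics():
    collector = MetricsCollector()
    collector.record_request("pages", "GET", 200, duration_ms=1.0)
    collector.reset()
    assert collector.get_stats()["total_requests"] == 0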

Breaking Changes: None
Dependencies Added: pydantic[email], email-validator, dnspython

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
commit cef6903cbc (parent 6fbd24d737)
Author: Claude
Date: 2025-10-23 16:45:02 +00:00
15 changed files with 1278 additions and 40 deletions

wikijs/metrics.py (new file, 158 lines)

@@ -0,0 +1,158 @@
"""Metrics and telemetry for wikijs-python-sdk."""
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from collections import defaultdict
import threading
@dataclass
class RequestMetrics:
"""Metrics for a single request."""
endpoint: str
method: str
status_code: int
duration_ms: float
timestamp: float
error: Optional[str] = None
class MetricsCollector:
"""Collect and aggregate metrics."""
def __init__(self):
"""Initialize metrics collector."""
self._lock = threading.Lock()
self._requests: List[RequestMetrics] = []
self._counters: Dict[str, int] = defaultdict(int)
self._gauges: Dict[str, float] = {}
self._histograms: Dict[str, List[float]] = defaultdict(list)
def record_request(
self,
endpoint: str,
method: str,
status_code: int,
duration_ms: float,
error: Optional[str] = None
) -> None:
"""Record API request metrics.
Args:
endpoint: The API endpoint
method: HTTP method
status_code: HTTP status code
duration_ms: Request duration in milliseconds
error: Optional error message
"""
with self._lock:
metric = RequestMetrics(
endpoint=endpoint,
method=method,
status_code=status_code,
duration_ms=duration_ms,
timestamp=time.time(),
error=error
)
self._requests.append(metric)
# Update counters
self._counters["total_requests"] += 1
if status_code >= 400:
self._counters["total_errors"] += 1
if status_code >= 500:
self._counters["total_server_errors"] += 1
# Update histograms
self._histograms[f"{method}_{endpoint}"].append(duration_ms)
def increment(self, counter_name: str, value: int = 1) -> None:
"""Increment counter.
Args:
counter_name: Name of the counter
value: Value to increment by
"""
with self._lock:
self._counters[counter_name] += value
def set_gauge(self, gauge_name: str, value: float) -> None:
"""Set gauge value.
Args:
gauge_name: Name of the gauge
value: Value to set
"""
with self._lock:
self._gauges[gauge_name] = value
def get_stats(self) -> Dict:
"""Get aggregated statistics.
Returns:
Dictionary of aggregated statistics
"""
with self._lock:
total = self._counters.get("total_requests", 0)
errors = self._counters.get("total_errors", 0)
stats = {
"total_requests": total,
"total_errors": errors,
"error_rate": (errors / total * 100) if total > 0 else 0,
"counters": dict(self._counters),
"gauges": dict(self._gauges),
}
# Calculate percentiles for latency
if self._requests:
durations = [r.duration_ms for r in self._requests]
durations.sort()
stats["latency"] = {
"min": min(durations),
"max": max(durations),
"avg": sum(durations) / len(durations),
"p50": self._percentile(durations, 50),
"p95": self._percentile(durations, 95),
"p99": self._percentile(durations, 99),
}
return stats
@staticmethod
def _percentile(data: List[float], percentile: int) -> float:
"""Calculate percentile.
Args:
data: Sorted list of values
percentile: Percentile to calculate
Returns:
Percentile value
"""
if not data:
return 0.0
index = int(len(data) * percentile / 100)
return data[min(index, len(data) - 1)]
def reset(self) -> None:
"""Reset all metrics."""
with self._lock:
self._requests.clear()
self._counters.clear()
self._gauges.clear()
self._histograms.clear()
# Global metrics collector
_metrics = MetricsCollector()
def get_metrics() -> MetricsCollector:
"""Get global metrics collector.
Returns:
Global MetricsCollector instance
"""
return _metrics
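For reference, a short usage sketch of the collector above (the endpoint, counter, and gauge names are illustrative):

from wikijs.metrics import get_metrics

metrics = get_metrics()
metrics.record_request("pages/list", "GET", 200, duration_ms=42.0)
metrics.increment("cache_hits")
metrics.set_gauge("open_connections", 3)

stats = metrics.get_stats()
print(stats["error_rate"], stats["latency"]["p95"])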