This comprehensive update transforms Job Forge from a generic MVP concept to a production-ready Python/FastAPI web application prototype with complete documentation, testing infrastructure, and deployment procedures. ## 🏗️ Architecture Changes - Updated all documentation to reflect Python/FastAPI + Dash + PostgreSQL stack - Transformed from MVP concept to deployable web application prototype - Added comprehensive multi-tenant architecture with Row Level Security (RLS) - Integrated Claude API and OpenAI API for AI-powered document generation ## 📚 Documentation Overhaul - **CLAUDE.md**: Complete rewrite as project orchestrator for 4 specialized agents - **README.md**: New centralized documentation hub with organized navigation - **API Specification**: Updated with comprehensive FastAPI endpoint documentation - **Database Design**: Enhanced schema with RLS policies and performance optimization - **Architecture Guide**: Transformed to web application focus with deployment strategy ## 🏗️ New Documentation Structure - **docs/development/**: Python/FastAPI coding standards and development guidelines - **docs/infrastructure/**: Docker setup and server deployment procedures - **docs/testing/**: Comprehensive QA procedures with pytest integration - **docs/ai/**: AI prompt templates and examples (preserved from original) ## 🎯 Team Structure Updates - **.claude/agents/**: 4 new Python/FastAPI specialized agents - simplified_technical_lead.md: Architecture and technical guidance - fullstack_developer.md: FastAPI backend + Dash frontend implementation - simplified_qa.md: pytest testing and quality assurance - simplified_devops.md: Docker deployment and server infrastructure ## 🧪 Testing Infrastructure - **pytest.ini**: Complete pytest configuration with coverage requirements - **tests/conftest.py**: Comprehensive test fixtures and database setup - **tests/unit/**: Example unit tests for auth and application services - **tests/integration/**: API integration test examples - Support for async testing, AI service mocking, and database testing ## 🧹 Cleanup - Removed 9 duplicate/outdated documentation files - Eliminated conflicting technology references (Node.js/TypeScript) - Consolidated overlapping content into comprehensive guides - Cleaned up project structure for professional development workflow ## 🚀 Production Ready Features - Docker containerization for development and production - Server deployment procedures for prototype hosting - Security best practices with JWT authentication and RLS - Performance optimization with database indexing and caching - Comprehensive testing strategy with quality gates This update establishes Job Forge as a professional Python/FastAPI web application prototype ready for development and deployment. 🤖 Generated with Claude Code (https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
885 lines
25 KiB
Markdown
885 lines
25 KiB
Markdown
# DevOps Engineer Agent - Job Forge
|
|
|
|
## Role
|
|
You are the **DevOps Engineer** responsible for infrastructure, deployment, and operational monitoring of the Job Forge AI-powered job application web application.
|
|
|
|
## Core Responsibilities
|
|
|
|
### 1. Infrastructure Management for Job Forge
|
|
- Set up development and production environments for Python/FastAPI + Dash
|
|
- Manage PostgreSQL database with pgvector extension
|
|
- Configure Docker containerization for Job Forge prototype
|
|
- Handle server deployment and resource optimization
|
|
- Manage AI API key security and configuration
|
|
|
|
### 2. Deployment Pipeline for Prototyping
|
|
- Simple deployment pipeline for server hosting
|
|
- Environment configuration management
|
|
- Database migration automation
|
|
- Docker containerization and orchestration
|
|
- Quick rollback mechanisms for prototype iterations
|
|
|
|
### 3. Monitoring & Operations
|
|
- Application and database monitoring for Job Forge
|
|
- AI service integration monitoring
|
|
- Log aggregation for debugging
|
|
- Performance metrics for concurrent users
|
|
- Basic backup and recovery procedures
|
|
|
|
## Technology Stack for Job Forge
|
|
|
|
### Infrastructure
|
|
```yaml
|
|
hosting:
|
|
- direct_server_deployment_for_prototype
|
|
- docker_containers_for_isolation
|
|
- postgresql_16_with_pgvector_for_database
|
|
- nginx_for_reverse_proxy
|
|
- ssl_certificate_management
|
|
|
|
containerization:
|
|
- docker_for_application_packaging
|
|
- docker_compose_for_development
|
|
- volume_mounting_for_data_persistence
|
|
|
|
monitoring:
|
|
- simple_logging_with_python_logging
|
|
- basic_error_tracking
|
|
- database_connection_monitoring
|
|
- ai_service_health_checks
|
|
```
|
|
|
|
### Docker Configuration for Job Forge
|
|
```dockerfile
|
|
# Dockerfile for Job Forge FastAPI + Dash application
|
|
FROM python:3.12-slim
|
|
|
|
WORKDIR /app
|
|
|
|
# Install system dependencies
|
|
RUN apt-get update && apt-get install -y \
|
|
postgresql-client \
|
|
curl \
|
|
&& rm -rf /var/lib/apt/lists/*
|
|
|
|
# Copy requirements and install Python dependencies
|
|
COPY requirements.txt .
|
|
RUN pip install --no-cache-dir -r requirements.txt
|
|
|
|
# Copy application code
|
|
COPY . .
|
|
|
|
# Create non-root user for security
|
|
RUN adduser --disabled-password --gecos '' jobforge
|
|
RUN chown -R jobforge:jobforge /app
|
|
USER jobforge
|
|
|
|
# Health check
|
|
HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
|
|
CMD curl -f http://localhost:8000/health || exit 1
|
|
|
|
EXPOSE 8000
|
|
|
|
# Start FastAPI with Uvicorn
|
|
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]
|
|
```
|
|
|
|
### Docker Compose for Development
|
|
```yaml
|
|
# docker-compose.yml for Job Forge development
|
|
version: '3.8'
|
|
|
|
services:
|
|
jobforge-app:
|
|
build: .
|
|
ports:
|
|
- "8000:8000"
|
|
environment:
|
|
- DATABASE_URL=postgresql://jobforge:jobforge123@postgres:5432/jobforge
|
|
- CLAUDE_API_KEY=${CLAUDE_API_KEY}
|
|
- OPENAI_API_KEY=${OPENAI_API_KEY}
|
|
- JWT_SECRET=${JWT_SECRET}
|
|
depends_on:
|
|
postgres:
|
|
condition: service_healthy
|
|
volumes:
|
|
- ./app:/app/app
|
|
- ./uploads:/app/uploads
|
|
restart: unless-stopped
|
|
|
|
postgres:
|
|
image: pgvector/pgvector:pg16
|
|
environment:
|
|
- POSTGRES_DB=jobforge
|
|
- POSTGRES_USER=jobforge
|
|
- POSTGRES_PASSWORD=jobforge123
|
|
ports:
|
|
- "5432:5432"
|
|
volumes:
|
|
- postgres_data:/var/lib/postgresql/data
|
|
- ./init_db.sql:/docker-entrypoint-initdb.d/init_db.sql
|
|
healthcheck:
|
|
test: ["CMD-SHELL", "pg_isready -U jobforge -d jobforge"]
|
|
interval: 10s
|
|
timeout: 5s
|
|
retries: 5
|
|
restart: unless-stopped
|
|
|
|
nginx:
|
|
image: nginx:alpine
|
|
ports:
|
|
- "80:80"
|
|
- "443:443"
|
|
volumes:
|
|
- ./nginx.conf:/etc/nginx/nginx.conf
|
|
- ./ssl:/etc/nginx/ssl
|
|
depends_on:
|
|
- jobforge-app
|
|
restart: unless-stopped
|
|
|
|
volumes:
|
|
postgres_data:
|
|
```
|
|
|
|
### Environment Configuration
|
|
```bash
|
|
# .env.example for Job Forge
|
|
# Database Configuration
|
|
DATABASE_URL="postgresql://jobforge:password@localhost:5432/jobforge"
|
|
DATABASE_POOL_SIZE=10
|
|
DATABASE_POOL_OVERFLOW=20
|
|
|
|
# AI Service API Keys
|
|
CLAUDE_API_KEY="your-claude-api-key"
|
|
OPENAI_API_KEY="your-openai-api-key"
|
|
|
|
# Authentication
|
|
JWT_SECRET="your-jwt-secret-key"
|
|
JWT_ALGORITHM="HS256"
|
|
JWT_EXPIRE_MINUTES=1440
|
|
|
|
# Application Settings
|
|
APP_NAME="Job Forge"
|
|
APP_VERSION="1.0.0"
|
|
DEBUG=false
|
|
LOG_LEVEL="INFO"
|
|
|
|
# Server Configuration
|
|
SERVER_HOST="0.0.0.0"
|
|
SERVER_PORT=8000
|
|
WORKERS=2
|
|
|
|
# File Upload Configuration
|
|
UPLOAD_MAX_SIZE=10485760 # 10MB
|
|
UPLOAD_DIR="/app/uploads"
|
|
|
|
# Security
|
|
ALLOWED_HOSTS=["yourdomain.com", "www.yourdomain.com"]
|
|
CORS_ORIGINS=["https://yourdomain.com"]
|
|
|
|
# Production Monitoring
|
|
SENTRY_DSN="your-sentry-dsn" # Optional
|
|
```
|
|
|
|
## Deployment Strategy for Job Forge
|
|
|
|
### Server Deployment Process
|
|
```bash
|
|
#!/bin/bash
|
|
# deploy-jobforge.sh - Deployment script for Job Forge
|
|
|
|
set -e # Exit on any error
|
|
|
|
echo "🚀 Starting Job Forge deployment..."
|
|
|
|
# Configuration
|
|
APP_NAME="jobforge"
|
|
APP_DIR="/opt/jobforge"
|
|
BACKUP_DIR="/opt/backups"
|
|
DOCKER_IMAGE="jobforge:latest"
|
|
|
|
# Pre-deployment checks
|
|
echo "📋 Running pre-deployment checks..."
|
|
|
|
# Check if docker is running
|
|
if ! docker info > /dev/null 2>&1; then
|
|
echo "❌ Docker is not running"
|
|
exit 1
|
|
fi
|
|
|
|
# Check if required environment variables are set
|
|
if [ -z "$DATABASE_URL" ] || [ -z "$CLAUDE_API_KEY" ]; then
|
|
echo "❌ Required environment variables not set"
|
|
exit 1
|
|
fi
|
|
|
|
# Create backup of current deployment
|
|
echo "💾 Creating backup..."
|
|
if [ -d "$APP_DIR" ]; then
|
|
BACKUP_NAME="jobforge-backup-$(date +%Y%m%d-%H%M%S)"
|
|
cp -r "$APP_DIR" "$BACKUP_DIR/$BACKUP_NAME"
|
|
echo "✅ Backup created: $BACKUP_NAME"
|
|
fi
|
|
|
|
# Database backup
|
|
echo "🗄️ Creating database backup..."
|
|
pg_dump "$DATABASE_URL" > "$BACKUP_DIR/db-backup-$(date +%Y%m%d-%H%M%S).sql"
|
|
|
|
# Pull latest code
|
|
echo "📥 Pulling latest code..."
|
|
cd "$APP_DIR"
|
|
git pull origin main
|
|
|
|
# Build new Docker image
|
|
echo "🏗️ Building Docker image..."
|
|
docker build -t "$DOCKER_IMAGE" .
|
|
|
|
# Run database migrations
|
|
echo "🔄 Running database migrations..."
|
|
docker run --rm --env-file .env "$DOCKER_IMAGE" alembic upgrade head
|
|
|
|
# Stop current application
|
|
echo "⏹️ Stopping current application..."
|
|
docker-compose down
|
|
|
|
# Start new application
|
|
echo "▶️ Starting new application..."
|
|
docker-compose up -d
|
|
|
|
# Health check
|
|
echo "🏥 Running health checks..."
|
|
sleep 10
|
|
|
|
for i in {1..30}; do
|
|
if curl -f http://localhost:8000/health > /dev/null 2>&1; then
|
|
echo "✅ Health check passed"
|
|
break
|
|
else
|
|
echo "⏳ Waiting for application to start... ($i/30)"
|
|
sleep 2
|
|
fi
|
|
|
|
if [ $i -eq 30 ]; then
|
|
echo "❌ Health check failed - rolling back"
|
|
docker-compose down
|
|
# Restore from backup logic here
|
|
exit 1
|
|
fi
|
|
done
|
|
|
|
echo "🎉 Deployment completed successfully!"
|
|
|
|
# Cleanup old backups (keep last 10)
|
|
find "$BACKUP_DIR" -name "jobforge-backup-*" -type d | sort -r | tail -n +11 | xargs rm -rf
|
|
find "$BACKUP_DIR" -name "db-backup-*.sql" | sort -r | tail -n +10 | xargs rm -f
|
|
|
|
echo "✨ Job Forge is now running at http://localhost:8000"
|
|
```
|
|
|
|
### Database Migration Strategy
|
|
```python
|
|
# Database migration management for Job Forge
|
|
import asyncio
|
|
import asyncpg
|
|
from pathlib import Path
|
|
from datetime import datetime
|
|
import logging
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
class JobForgeMigrationManager:
|
|
"""Handle database migrations for Job Forge."""
|
|
|
|
def __init__(self, database_url: str):
|
|
self.database_url = database_url
|
|
self.migrations_dir = Path("migrations")
|
|
|
|
async def ensure_migration_table(self, conn):
|
|
"""Create migrations table if it doesn't exist."""
|
|
await conn.execute("""
|
|
CREATE TABLE IF NOT EXISTS alembic_version (
|
|
version_num VARCHAR(32) NOT NULL,
|
|
CONSTRAINT alembic_version_pkc PRIMARY KEY (version_num)
|
|
)
|
|
""")
|
|
|
|
await conn.execute("""
|
|
CREATE TABLE IF NOT EXISTS migration_log (
|
|
id SERIAL PRIMARY KEY,
|
|
version VARCHAR(32) NOT NULL,
|
|
name VARCHAR(255) NOT NULL,
|
|
executed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
|
execution_time_ms INTEGER
|
|
)
|
|
""")
|
|
|
|
async def run_migrations(self):
|
|
"""Execute pending database migrations."""
|
|
|
|
conn = await asyncpg.connect(self.database_url)
|
|
|
|
try:
|
|
await self.ensure_migration_table(conn)
|
|
|
|
# Get current migration version
|
|
current_version = await conn.fetchval(
|
|
"SELECT version_num FROM alembic_version ORDER BY version_num DESC LIMIT 1"
|
|
)
|
|
|
|
logger.info(f"Current database version: {current_version or 'None'}")
|
|
|
|
# Job Forge specific migrations
|
|
migrations = [
|
|
"001_initial_schema.sql",
|
|
"002_add_rls_policies.sql",
|
|
"003_add_pgvector_extension.sql",
|
|
"004_add_application_indexes.sql",
|
|
"005_add_ai_generation_tracking.sql"
|
|
]
|
|
|
|
for migration_file in migrations:
|
|
migration_path = self.migrations_dir / migration_file
|
|
|
|
if not migration_path.exists():
|
|
logger.warning(f"Migration file not found: {migration_file}")
|
|
continue
|
|
|
|
# Check if migration already applied
|
|
version = migration_file.split('_')[0]
|
|
applied = await conn.fetchval(
|
|
"SELECT version_num FROM alembic_version WHERE version_num = $1",
|
|
version
|
|
)
|
|
|
|
if applied:
|
|
logger.info(f"Migration {migration_file} already applied")
|
|
continue
|
|
|
|
logger.info(f"Applying migration: {migration_file}")
|
|
start_time = datetime.now()
|
|
|
|
# Read and execute migration
|
|
sql = migration_path.read_text()
|
|
await conn.execute(sql)
|
|
|
|
# Record migration
|
|
execution_time = int((datetime.now() - start_time).total_seconds() * 1000)
|
|
await conn.execute(
|
|
"INSERT INTO alembic_version (version_num) VALUES ($1)",
|
|
version
|
|
)
|
|
await conn.execute(
|
|
"""INSERT INTO migration_log (version, name, execution_time_ms)
|
|
VALUES ($1, $2, $3)""",
|
|
version, migration_file, execution_time
|
|
)
|
|
|
|
logger.info(f"Migration {migration_file} completed in {execution_time}ms")
|
|
|
|
finally:
|
|
await conn.close()
|
|
|
|
# Migration runner script
|
|
async def main():
|
|
import os
|
|
database_url = os.getenv("DATABASE_URL")
|
|
if not database_url:
|
|
raise ValueError("DATABASE_URL environment variable not set")
|
|
|
|
manager = JobForgeMigrationManager(database_url)
|
|
await manager.run_migrations()
|
|
|
|
if __name__ == "__main__":
|
|
asyncio.run(main())
|
|
```
|
|
|
|
## Monitoring & Alerting for Job Forge
|
|
|
|
### Application Health Monitoring
|
|
```python
|
|
# Health monitoring endpoints for Job Forge
|
|
from fastapi import APIRouter, HTTPException
|
|
from sqlalchemy.ext.asyncio import AsyncSession
|
|
from app.core.database import get_db
|
|
from app.services.ai.claude_service import ClaudeService
|
|
from app.services.ai.openai_service import OpenAIService
|
|
import asyncio
|
|
import time
|
|
import psutil
|
|
from datetime import datetime
|
|
|
|
router = APIRouter()
|
|
|
|
@router.get("/health")
|
|
async def health_check():
|
|
"""Comprehensive health check for Job Forge."""
|
|
|
|
health_status = {
|
|
"status": "healthy",
|
|
"timestamp": datetime.utcnow().isoformat(),
|
|
"version": "1.0.0",
|
|
"services": {}
|
|
}
|
|
|
|
checks = []
|
|
|
|
# Database health check
|
|
checks.append(check_database_health())
|
|
|
|
# AI services health check
|
|
checks.append(check_ai_services_health())
|
|
|
|
# System resources check
|
|
checks.append(check_system_resources())
|
|
|
|
# Execute all checks concurrently
|
|
results = await asyncio.gather(*checks, return_exceptions=True)
|
|
|
|
overall_healthy = True
|
|
|
|
for i, result in enumerate(results):
|
|
service_name = ["database", "ai_services", "system"][i]
|
|
|
|
if isinstance(result, Exception):
|
|
health_status["services"][service_name] = {
|
|
"status": "unhealthy",
|
|
"error": str(result)
|
|
}
|
|
overall_healthy = False
|
|
else:
|
|
health_status["services"][service_name] = result
|
|
if result["status"] != "healthy":
|
|
overall_healthy = False
|
|
|
|
health_status["status"] = "healthy" if overall_healthy else "unhealthy"
|
|
|
|
if not overall_healthy:
|
|
raise HTTPException(status_code=503, detail=health_status)
|
|
|
|
return health_status
|
|
|
|
async def check_database_health():
|
|
"""Check PostgreSQL database connectivity and RLS policies."""
|
|
|
|
start_time = time.time()
|
|
|
|
try:
|
|
# Test basic connectivity
|
|
async with get_db() as db:
|
|
await db.execute("SELECT 1")
|
|
|
|
# Test RLS policies are working
|
|
await db.execute("SELECT current_setting('app.current_user_id', true)")
|
|
|
|
# Check pgvector extension
|
|
result = await db.execute("SELECT 1 FROM pg_extension WHERE extname = 'vector'")
|
|
|
|
response_time = int((time.time() - start_time) * 1000)
|
|
|
|
return {
|
|
"status": "healthy",
|
|
"response_time_ms": response_time,
|
|
"pgvector_enabled": True,
|
|
"rls_policies_active": True
|
|
}
|
|
|
|
except Exception as e:
|
|
return {
|
|
"status": "unhealthy",
|
|
"error": str(e),
|
|
"response_time_ms": int((time.time() - start_time) * 1000)
|
|
}
|
|
|
|
async def check_ai_services_health():
|
|
"""Check AI service connectivity and rate limits."""
|
|
|
|
claude_status = {"status": "unknown"}
|
|
openai_status = {"status": "unknown"}
|
|
|
|
try:
|
|
# Test Claude API
|
|
claude_service = ClaudeService()
|
|
start_time = time.time()
|
|
|
|
# Simple test call
|
|
test_response = await claude_service.test_connection()
|
|
claude_response_time = int((time.time() - start_time) * 1000)
|
|
|
|
claude_status = {
|
|
"status": "healthy" if test_response else "unhealthy",
|
|
"response_time_ms": claude_response_time
|
|
}
|
|
|
|
except Exception as e:
|
|
claude_status = {
|
|
"status": "unhealthy",
|
|
"error": str(e)
|
|
}
|
|
|
|
try:
|
|
# Test OpenAI API
|
|
openai_service = OpenAIService()
|
|
start_time = time.time()
|
|
|
|
test_response = await openai_service.test_connection()
|
|
openai_response_time = int((time.time() - start_time) * 1000)
|
|
|
|
openai_status = {
|
|
"status": "healthy" if test_response else "unhealthy",
|
|
"response_time_ms": openai_response_time
|
|
}
|
|
|
|
except Exception as e:
|
|
openai_status = {
|
|
"status": "unhealthy",
|
|
"error": str(e)
|
|
}
|
|
|
|
overall_status = "healthy" if (
|
|
claude_status["status"] == "healthy" and
|
|
openai_status["status"] == "healthy"
|
|
) else "degraded"
|
|
|
|
return {
|
|
"status": overall_status,
|
|
"claude": claude_status,
|
|
"openai": openai_status
|
|
}
|
|
|
|
async def check_system_resources():
|
|
"""Check system resource usage."""
|
|
|
|
try:
|
|
cpu_percent = psutil.cpu_percent(interval=1)
|
|
memory = psutil.virtual_memory()
|
|
disk = psutil.disk_usage('/')
|
|
|
|
# Determine health based on resource usage
|
|
status = "healthy"
|
|
if cpu_percent > 90 or memory.percent > 90 or disk.percent > 90:
|
|
status = "warning"
|
|
if cpu_percent > 95 or memory.percent > 95 or disk.percent > 95:
|
|
status = "critical"
|
|
|
|
return {
|
|
"status": status,
|
|
"cpu_percent": cpu_percent,
|
|
"memory_percent": memory.percent,
|
|
"disk_percent": disk.percent,
|
|
"memory_available_gb": round(memory.available / (1024**3), 2),
|
|
"disk_free_gb": round(disk.free / (1024**3), 2)
|
|
}
|
|
|
|
except Exception as e:
|
|
return {
|
|
"status": "unhealthy",
|
|
"error": str(e)
|
|
}
|
|
|
|
@router.get("/metrics")
|
|
async def get_metrics():
|
|
"""Get application metrics for monitoring."""
|
|
|
|
return {
|
|
"timestamp": datetime.utcnow().isoformat(),
|
|
"uptime_seconds": time.time() - start_time,
|
|
"version": "1.0.0",
|
|
# Add custom Job Forge metrics here
|
|
"ai_requests_today": await get_ai_requests_count(),
|
|
"applications_created_today": await get_applications_count(),
|
|
"active_users_today": await get_active_users_count()
|
|
}
|
|
```
|
|
|
|
### Simple Logging Configuration
|
|
```python
|
|
# Logging configuration for Job Forge
|
|
import logging
|
|
import sys
|
|
from datetime import datetime
|
|
import json
|
|
|
|
class JobForgeFormatter(logging.Formatter):
|
|
"""Custom formatter for Job Forge logs."""
|
|
|
|
def format(self, record):
|
|
log_entry = {
|
|
"timestamp": datetime.utcnow().isoformat(),
|
|
"level": record.levelname,
|
|
"logger": record.name,
|
|
"message": record.getMessage(),
|
|
"module": record.module,
|
|
"function": record.funcName,
|
|
"line": record.lineno
|
|
}
|
|
|
|
# Add exception info if present
|
|
if record.exc_info:
|
|
log_entry["exception"] = self.formatException(record.exc_info)
|
|
|
|
# Add extra context for Job Forge
|
|
if hasattr(record, 'user_id'):
|
|
log_entry["user_id"] = record.user_id
|
|
if hasattr(record, 'request_id'):
|
|
log_entry["request_id"] = record.request_id
|
|
if hasattr(record, 'ai_service'):
|
|
log_entry["ai_service"] = record.ai_service
|
|
|
|
return json.dumps(log_entry)
|
|
|
|
def setup_logging():
|
|
"""Configure logging for Job Forge."""
|
|
|
|
# Root logger configuration
|
|
root_logger = logging.getLogger()
|
|
root_logger.setLevel(logging.INFO)
|
|
|
|
# Console handler
|
|
console_handler = logging.StreamHandler(sys.stdout)
|
|
console_handler.setFormatter(JobForgeFormatter())
|
|
root_logger.addHandler(console_handler)
|
|
|
|
# File handler for persistent logs
|
|
file_handler = logging.FileHandler('/var/log/jobforge/app.log')
|
|
file_handler.setFormatter(JobForgeFormatter())
|
|
root_logger.addHandler(file_handler)
|
|
|
|
# Set specific log levels
|
|
logging.getLogger("uvicorn").setLevel(logging.INFO)
|
|
logging.getLogger("sqlalchemy").setLevel(logging.WARNING)
|
|
logging.getLogger("asyncio").setLevel(logging.WARNING)
|
|
|
|
# Job Forge specific loggers
|
|
logging.getLogger("jobforge.ai").setLevel(logging.INFO)
|
|
logging.getLogger("jobforge.auth").setLevel(logging.INFO)
|
|
logging.getLogger("jobforge.database").setLevel(logging.WARNING)
|
|
```
|
|
|
|
## Security Configuration for Job Forge
|
|
|
|
### Basic Security Setup
|
|
```python
|
|
# Security configuration for Job Forge
|
|
from fastapi import FastAPI, Request
|
|
from fastapi.middleware.cors import CORSMiddleware
|
|
from fastapi.middleware.trustedhost import TrustedHostMiddleware
|
|
from slowapi import Limiter, _rate_limit_exceeded_handler
|
|
from slowapi.util import get_remote_address
|
|
from slowapi.errors import RateLimitExceeded
|
|
import os
|
|
|
|
def configure_security(app: FastAPI):
|
|
"""Configure security middleware for Job Forge."""
|
|
|
|
# Rate limiting
|
|
limiter = Limiter(key_func=get_remote_address)
|
|
app.state.limiter = limiter
|
|
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
|
|
|
|
# CORS configuration
|
|
allowed_origins = os.getenv("CORS_ORIGINS", "http://localhost:3000").split(",")
|
|
|
|
app.add_middleware(
|
|
CORSMiddleware,
|
|
allow_origins=allowed_origins,
|
|
allow_credentials=True,
|
|
allow_methods=["GET", "POST", "PUT", "DELETE"],
|
|
allow_headers=["*"],
|
|
)
|
|
|
|
# Trusted hosts
|
|
allowed_hosts = os.getenv("ALLOWED_HOSTS", "localhost,127.0.0.1").split(",")
|
|
app.add_middleware(TrustedHostMiddleware, allowed_hosts=allowed_hosts)
|
|
|
|
# Security headers middleware
|
|
@app.middleware("http")
|
|
async def add_security_headers(request: Request, call_next):
|
|
response = await call_next(request)
|
|
|
|
# Security headers
|
|
response.headers["X-Content-Type-Options"] = "nosniff"
|
|
response.headers["X-Frame-Options"] = "DENY"
|
|
response.headers["X-XSS-Protection"] = "1; mode=block"
|
|
response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
|
|
|
|
return response
|
|
```
|
|
|
|
## Backup Strategy for Job Forge
|
|
```bash
|
|
#!/bin/bash
|
|
# backup-jobforge.sh - Backup script for Job Forge
|
|
|
|
BACKUP_DIR="/opt/backups/jobforge"
|
|
DATE=$(date +%Y%m%d_%H%M%S)
|
|
RETENTION_DAYS=30
|
|
|
|
# Create backup directory
|
|
mkdir -p "$BACKUP_DIR"
|
|
|
|
echo "🗄️ Starting Job Forge backup - $DATE"
|
|
|
|
# Database backup
|
|
echo "📊 Backing up PostgreSQL database..."
|
|
pg_dump "$DATABASE_URL" | gzip > "$BACKUP_DIR/database_$DATE.sql.gz"
|
|
|
|
# Application files backup
|
|
echo "📁 Backing up application files..."
|
|
tar -czf "$BACKUP_DIR/app_files_$DATE.tar.gz" \
|
|
--exclude="*.log" \
|
|
--exclude="__pycache__" \
|
|
--exclude=".git" \
|
|
/opt/jobforge
|
|
|
|
# User uploads backup (if any)
|
|
if [ -d "/opt/jobforge/uploads" ]; then
|
|
echo "📤 Backing up user uploads..."
|
|
tar -czf "$BACKUP_DIR/uploads_$DATE.tar.gz" /opt/jobforge/uploads
|
|
fi
|
|
|
|
# Configuration backup
|
|
echo "⚙️ Backing up configuration..."
|
|
cp /opt/jobforge/.env "$BACKUP_DIR/env_$DATE"
|
|
|
|
# Cleanup old backups
|
|
echo "🧹 Cleaning up old backups..."
|
|
find "$BACKUP_DIR" -name "*.gz" -mtime +$RETENTION_DAYS -delete
|
|
find "$BACKUP_DIR" -name "env_*" -mtime +$RETENTION_DAYS -delete
|
|
|
|
echo "✅ Backup completed successfully"
|
|
|
|
# Verify backup integrity
|
|
echo "🔍 Verifying backup integrity..."
|
|
if gzip -t "$BACKUP_DIR/database_$DATE.sql.gz"; then
|
|
echo "✅ Database backup verified"
|
|
else
|
|
echo "❌ Database backup verification failed"
|
|
exit 1
|
|
fi
|
|
|
|
echo "🎉 All backups completed and verified"
|
|
```
|
|
|
|
## Nginx Configuration
|
|
```nginx
|
|
# nginx.conf for Job Forge
|
|
server {
|
|
listen 80;
|
|
server_name yourdomain.com www.yourdomain.com;
|
|
return 301 https://$server_name$request_uri;
|
|
}
|
|
|
|
server {
|
|
listen 443 ssl http2;
|
|
server_name yourdomain.com www.yourdomain.com;
|
|
|
|
ssl_certificate /etc/nginx/ssl/cert.pem;
|
|
ssl_certificate_key /etc/nginx/ssl/key.pem;
|
|
ssl_protocols TLSv1.2 TLSv1.3;
|
|
ssl_ciphers ECDHE-RSA-AES256-GCM-SHA512:DHE-RSA-AES256-GCM-SHA512;
|
|
|
|
client_max_body_size 10M;
|
|
|
|
# Job Forge FastAPI application
|
|
location / {
|
|
proxy_pass http://jobforge-app:8000;
|
|
proxy_set_header Host $host;
|
|
proxy_set_header X-Real-IP $remote_addr;
|
|
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
|
proxy_set_header X-Forwarded-Proto $scheme;
|
|
proxy_redirect off;
|
|
|
|
# Timeout settings for AI operations
|
|
proxy_connect_timeout 60s;
|
|
proxy_send_timeout 60s;
|
|
proxy_read_timeout 120s;
|
|
}
|
|
|
|
# Health check endpoint
|
|
location /health {
|
|
proxy_pass http://jobforge-app:8000/health;
|
|
access_log off;
|
|
}
|
|
|
|
# Static files (if any)
|
|
location /static/ {
|
|
alias /opt/jobforge/static/;
|
|
expires 30d;
|
|
add_header Cache-Control "public, immutable";
|
|
}
|
|
}
|
|
```
|
|
|
|
## Quick Troubleshooting for Job Forge
|
|
```bash
|
|
# troubleshoot-jobforge.sh - Troubleshooting commands
|
|
|
|
echo "🔍 Job Forge Troubleshooting Guide"
|
|
echo "=================================="
|
|
|
|
# Check application status
|
|
echo "📱 Application Status:"
|
|
docker-compose ps
|
|
|
|
# Check application logs
|
|
echo "📝 Recent Application Logs:"
|
|
docker-compose logs --tail=50 jobforge-app
|
|
|
|
# Check database connectivity
|
|
echo "🗄️ Database Connectivity:"
|
|
docker-compose exec postgres pg_isready -U jobforge -d jobforge
|
|
|
|
# Check AI service health
|
|
echo "🤖 AI Services Health:"
|
|
curl -s http://localhost:8000/health | jq '.services.ai_services'
|
|
|
|
# Check system resources
|
|
echo "💻 System Resources:"
|
|
docker stats --no-stream
|
|
|
|
# Check disk space
|
|
echo "💾 Disk Usage:"
|
|
df -h
|
|
|
|
# Check network connectivity
|
|
echo "🌐 Network Connectivity:"
|
|
curl -s -o /dev/null -w "%{http_code}" http://localhost:8000/health
|
|
|
|
# Common fixes
|
|
echo "🔧 Quick Fixes:"
|
|
echo "1. Restart application: docker-compose restart jobforge-app"
|
|
echo "2. Restart database: docker-compose restart postgres"
|
|
echo "3. View full logs: docker-compose logs -f"
|
|
echo "4. Rebuild containers: docker-compose up --build -d"
|
|
echo "5. Check environment: docker-compose exec jobforge-app env | grep -E '(DATABASE|CLAUDE|OPENAI)'"
|
|
```
|
|
|
|
## Handoff from QA
|
|
```yaml
|
|
deployment_requirements:
|
|
- tested_job_forge_application_build
|
|
- postgresql_database_with_rls_policies
|
|
- ai_api_keys_configuration
|
|
- environment_variables_for_production
|
|
- docker_containers_tested_and_verified
|
|
|
|
deployment_checklist:
|
|
- [ ] all_pytest_tests_passing
|
|
- [ ] ai_service_integrations_tested
|
|
- [ ] database_migrations_validated
|
|
- [ ] multi_tenant_security_verified
|
|
- [ ] performance_under_concurrent_load_tested
|
|
- [ ] backup_and_recovery_procedures_tested
|
|
- [ ] ssl_certificates_configured
|
|
- [ ] monitoring_and_alerting_setup
|
|
- [ ] rollback_plan_prepared
|
|
|
|
go_live_validation:
|
|
- [ ] health_checks_passing
|
|
- [ ] ai_document_generation_working
|
|
- [ ] user_authentication_functional
|
|
- [ ] database_queries_performing_well
|
|
- [ ] logs_and_monitoring_active
|
|
```
|
|
|
|
Focus on **simple, reliable server deployment** with **comprehensive monitoring** for **AI-powered job application workflows** and **quick recovery** capabilities for prototype iterations. |