initial setup

This commit is contained in:
2025-08-01 09:31:37 -04:00
parent 1426497df6
commit 2d1aa8280e
9 changed files with 5109 additions and 835 deletions

421
README.md
View File

@@ -1,3 +1,420 @@
# job-forge # ⚡ JobForge MVP
A tool to help with job applications. **AI-Powered Job Application Management System**
Transform your job search with intelligent document generation and strategic application management. JobForge leverages advanced AI to create tailored resumes and cover letters while streamlining your entire application workflow.
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python](https://img.shields.io/badge/Python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.109+-green.svg)](https://fastapi.tiangolo.com/)
[![PostgreSQL](https://img.shields.io/badge/PostgreSQL-16+-blue.svg)](https://www.postgresql.org/)
[![Docker](https://img.shields.io/badge/Docker-Compose-blue.svg)](https://www.docker.com/)
---
## 🎯 Project Overview
### What is JobForge?
JobForge is an AI-powered job application management system designed to streamline and optimize the job search process. Built for individual job seekers, it combines strategic application management with advanced AI document generation to maximize your chances of landing interviews.
### Key Features (MVP)
**3-Phase AI Workflow**
- **Research Phase:** Automated job description analysis and company research
- **Resume Optimization:** Multi-resume synthesis tailored to specific job requirements
- **Cover Letter Generation:** Personalized cover letters with authentic voice preservation
🎨 **Modern Interface**
- Professional web application built with Dash + Mantine components
- Intuitive sidebar navigation and document management
- Real-time processing status and progress tracking
🔒 **Secure & Private**
- Complete user data isolation with PostgreSQL Row-Level Security
- Local document storage with full user control
- JWT-based authentication system
🤖 **AI-Powered Intelligence**
- Claude Sonnet 4 for document generation and analysis
- OpenAI embeddings for semantic document matching
- Vector database for intelligent insights and recommendations
---
## 🏗️ Architecture Overview
### Technology Stack
| Component | Technology | Purpose |
|-----------|------------|---------|
| **Frontend** | Dash + Mantine | Modern, responsive web interface |
| **Backend** | FastAPI + AsyncIO | High-performance REST API |
| **Database** | PostgreSQL 16 + pgvector | Data persistence with vector search |
| **AI Services** | Claude Sonnet 4, OpenAI | Document generation and analysis |
| **Development** | Docker Compose | Containerized development environment |
| **Authentication** | JWT + bcrypt | Secure user authentication |
### System Architecture
```mermaid
graph TB
subgraph "Frontend Layer"
UI[Dash + Mantine UI]
COMP[Reusable Components]
end
subgraph "Backend API"
API[FastAPI Rest API]
AUTH[JWT Authentication]
SERVICES[Business Services]
end
subgraph "AI Processing"
CLAUDE[Claude Sonnet 4]
OPENAI[OpenAI Embeddings]
AGENTS[AI Agents]
end
subgraph "Data Layer"
PG[(PostgreSQL + pgvector)]
REDIS[(Redis Cache)]
end
UI --> API
API --> AUTH
API --> SERVICES
SERVICES --> AGENTS
AGENTS --> CLAUDE
AGENTS --> OPENAI
SERVICES --> PG
API --> REDIS
```
---
## 🚀 Quick Start
### Prerequisites
- **Docker Desktop** 4.20+ with Docker Compose
- **Git** 2.30+
- **API Keys:** Claude API key, OpenAI API key
### 1. Clone & Setup
```bash
# Clone the repository
git clone https://github.com/your-org/jobforge-mvp.git
cd jobforge-mvp
# Copy environment template
cp .env.example .env
# Add your API keys to .env file
nano .env # Add CLAUDE_API_KEY and OPENAI_API_KEY
```
### 2. Start Development Environment
```bash
# Start all services (PostgreSQL, Backend, Frontend)
docker-compose up -d
# View logs to ensure everything started correctly
docker-compose logs -f
```
### 3. Access the Application
- **Frontend Application:** http://localhost:8501
- **Backend API:** http://localhost:8000
- **API Documentation:** http://localhost:8000/docs
### 4. Create Your First Application
1. Register a new account at http://localhost:8501
2. Upload your resume(s) to the resume library
3. Create a new job application with company details and job description
4. Watch the AI generate your research report, optimized resume, and cover letter
5. Edit and refine the generated documents as needed
---
## 📚 Documentation
### 📖 Core Documentation
| Document | Description | For |
|----------|-------------|-----|
| **[📋 MVP Architecture](docs/mvp-architecture.md)** | High-level system design and component overview | All team members |
| **[🔧 Development Setup](docs/development-setup.md)** | Complete environment setup with troubleshooting | Developers |
| **[🌿 Git Branch Strategy](docs/git-branch-strategy.md)** | Version control workflow and team coordination | All team members |
### 🛠️ Technical Documentation
| Document | Description | For |
|----------|-------------|-----|
| **[🔌 API Specification](docs/api-specification.md)** | Complete REST API documentation with examples | Backend developers |
| **[🗄️ Database Design](docs/database-design.md)** | Schema, security policies, and optimization | Backend developers |
| **[🧪 Testing Strategy](docs/testing-strategy.md)** | Testing guidelines and automation setup | All developers |
### 📝 Additional Resources
- **[📊 Project Roadmap](#roadmap)** - Development timeline and milestones
- **[🤝 Contributing Guidelines](#contributing)** - How to contribute to the project
- **[❓ FAQ](#faq)** - Common questions and answers
---
## 🔄 Development Workflow
### Branch Strategy
We use a **Git Flow** approach with the following branches:
- **`main`** - Production-ready code (protected)
- **`develop`** - Integration branch for completed features
- **`feature/*`** - Individual feature development
- **`hotfix/*`** - Emergency production fixes
- **`release/*`** - Release preparation and testing
**Example feature branch names:**
```bash
feature/backend-user-authentication
feature/frontend-application-sidebar
feature/ai-claude-integration
feature/database-rls-policies
```
See our **[Git Branch Strategy](docs/git-branch-strategy.md)** for detailed workflows.
### Development Process
1. **Start Feature:** Create branch from `develop`
2. **Implement:** Follow coding standards and write tests
3. **Test:** Ensure all tests pass and CI/CD checks succeed
4. **Review:** Submit PR with detailed description
5. **Merge:** Merge to `develop` after approval
6. **Deploy:** Automatic deployment to staging environment
---
## 🏃‍♂️ Project Status
### Current Phase: MVP Development
**Timeline:** 8 weeks (July - August 2025)
**Status:** 🚧 In Development
**Target:** Production-ready MVP for personal use and concept validation
### MVP Milestones
| Week | Milestone | Status |
|------|-----------|--------|
| **1-2** | Foundation & Infrastructure | 🚧 In Progress |
| **3-4** | User Authentication & Application CRUD | ⏳ Planned |
| **5-6** | AI Agents Integration | ⏳ Planned |
| **7-8** | Frontend Polish & Release | ⏳ Planned |
### Feature Completion
- [x] Project setup and documentation
- [x] Docker development environment
- [ ] User authentication system
- [ ] Application creation and management
- [ ] AI-powered research generation
- [ ] Resume optimization engine
- [ ] Cover letter generation
- [ ] Document editing interface
- [ ] Production deployment
---
## 🧪 Testing
### Testing Strategy
We maintain high code quality through comprehensive testing:
- **Unit Tests:** Business logic and services (80%+ coverage)
- **Integration Tests:** API endpoints and database interactions
- **Manual Testing:** Complete user workflows and edge cases
- **AI Mocking:** Reliable testing without external API dependencies
### Running Tests
```bash
# Run all tests
docker-compose exec backend pytest
# Run with coverage report
docker-compose exec backend pytest --cov=src --cov-report=html
# Run specific test file
docker-compose exec backend pytest tests/unit/services/test_auth_service.py
```
See **[Testing Strategy](docs/testing-strategy.md)** for detailed testing guidelines.
---
## 🚀 Deployment
### Development Environment
```bash
# Start development environment
docker-compose up -d
# View service logs
docker-compose logs -f [service_name]
# Stop environment
docker-compose down
```
### Production Deployment
Production deployment instructions will be added as we approach MVP completion. The current focus is on local development and testing.
---
## 🤝 Contributing
### Getting Started
1. **Read the Documentation:** Start with [Development Setup](docs/development-setup.md)
2. **Set Up Environment:** Follow the quick start guide above
3. **Choose a Task:** Check open issues or discuss new features
4. **Create Feature Branch:** Follow our [Git Branch Strategy](docs/git-branch-strategy.md)
5. **Submit Pull Request:** Include tests and documentation updates
### Development Standards
- **Code Style:** Black formatter, isort imports, type hints required
- **Testing:** Write tests for new functionality, maintain coverage
- **Documentation:** Update relevant docs for user-facing changes
- **Security:** Never commit API keys or sensitive data
### Pull Request Process
1. Create feature branch from `develop`
2. Implement changes with tests
3. Ensure all CI/CD checks pass
4. Submit PR with detailed description
5. Address code review feedback
6. Merge after approval
---
## 📊 Roadmap {#roadmap}
### Phase 1: MVP (Current - August 2025)
**Goal:** Production-ready job application management tool for personal use
**Key Features:**
- Complete 3-phase AI workflow
- Professional web interface
- Secure user authentication
- Document management and editing
### Phase 2: SaaS Platform (September 2025+)
**Goal:** Multi-tenant SaaS platform with subscription billing
**Planned Features:**
- Subscription management and billing
- Usage analytics and insights
- Advanced AI features and learning
- Post-application tracking (interviews, responses)
- Mobile application
### Phase 3: Advanced Features (Future)
**Goal:** Enterprise-grade job application platform
**Planned Features:**
- Multi-language support
- Integration with job boards and ATS systems
- Advanced analytics and success prediction
- White-label solutions for career coaches
---
## ❓ FAQ {#faq}
### General Questions
**Q: What makes JobForge different from other job application tools?**
A: JobForge combines AI-powered document generation with strategic application management. Unlike simple trackers, it actively helps create better applications using advanced AI analysis and multi-resume optimization.
**Q: Is JobForge free to use?**
A: The MVP is designed for personal use and concept validation. Future SaaS plans will include both free and paid tiers with different feature sets.
**Q: What AI models does JobForge use?**
A: We use Claude Sonnet 4 for document generation and analysis, plus OpenAI embeddings for semantic search and document matching.
### Technical Questions
**Q: Can I run JobForge without Docker?**
A: While possible, Docker is strongly recommended for consistent development environments. Manual setup instructions may be added in the future.
**Q: How secure is my job application data?**
A: Very secure. We use PostgreSQL Row-Level Security for complete user data isolation, JWT authentication, and all sensitive data is encrypted at rest.
**Q: Can I contribute to JobForge development?**
A: Yes! Check our [Contributing Guidelines](#contributing) and [Development Setup](docs/development-setup.md) to get started.
### Development Questions
**Q: What's the recommended development workflow?**
A: Follow our [Git Branch Strategy](docs/git-branch-strategy.md) - create feature branches from `develop`, implement with tests, submit PRs for review.
**Q: How do I add a new API endpoint?**
A: See our [API Specification](docs/api-specification.md) for examples and patterns, then follow the testing guidelines in [Testing Strategy](docs/testing-strategy.md).
**Q: Where can I find the database schema?**
A: Complete schema documentation is in [Database Design](docs/database-design.md) including security policies and performance optimization.
---
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
## 🙏 Acknowledgments
- **Claude Sonnet 4** by Anthropic for advanced AI document generation
- **OpenAI** for embedding models and semantic search capabilities
- **FastAPI** community for the excellent async web framework
- **Dash** and **Plotly** teams for the modern Python web framework
- **PostgreSQL** and **pgvector** for robust data storage and vector search
---
## 📞 Support & Contact
### Development Team
For development questions, bug reports, or feature requests:
- **Issues:** Use GitHub/Gitea issues for bug reports and feature requests
- **Discussions:** Use GitHub/Gitea discussions for general questions
- **Documentation:** Check the [docs](docs/) folder for detailed guides
### Getting Help
1. **Check Documentation:** Most questions are answered in our comprehensive docs
2. **Search Issues:** Look for existing issues or discussions
3. **Ask Questions:** Create new discussions for general questions
4. **Report Bugs:** Use issue templates for bug reports
---
**Made with ❤️ for job seekers everywhere**
*Transform your job search. Forge your path to success.*

597
docs/api_specification.md Normal file
View File

@@ -0,0 +1,597 @@
# JobForge MVP - API Specification
**Version:** 1.0.0 MVP
**Base URL:** `http://localhost:8000`
**Target Audience:** Backend Developers
**Last Updated:** July 2025
---
## 🔐 Authentication
### Overview
- **Method:** JWT Bearer tokens
- **Token Expiry:** 24 hours
- **Refresh:** Not implemented in MVP (re-login required)
- **Header Format:** `Authorization: Bearer <jwt_token>`
### Authentication Endpoints
#### POST /api/v1/auth/register
Register new user account.
**Request:**
```json
{
"email": "user@example.com",
"password": "SecurePass123!",
"full_name": "John Doe"
}
```
**Response (201):**
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"email": "user@example.com",
"full_name": "John Doe",
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer"
}
```
**Errors:**
- `400` - Invalid email format or weak password
- `409` - Email already registered
#### POST /api/v1/auth/login
Authenticate user and return JWT token.
**Request:**
```json
{
"email": "user@example.com",
"password": "SecurePass123!"
}
```
**Response (200):**
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"email": "user@example.com",
"full_name": "John Doe",
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer"
}
```
**Errors:**
- `401` - Invalid credentials
- `400` - Missing email or password
#### GET /api/v1/auth/me
Get current user profile (requires authentication).
**Headers:** `Authorization: Bearer <token>`
**Response (200):**
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"email": "user@example.com",
"full_name": "John Doe",
"created_at": "2025-07-01T10:00:00Z"
}
```
**Errors:**
- `401` - Invalid or expired token
---
## 📋 Applications API
### Application Model
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"name": "google_senior_developer_2025_07_01",
"company_name": "Google",
"role_title": "Senior Developer",
"job_url": "https://careers.google.com/jobs/123",
"job_description": "We are looking for...",
"location": "Toronto, ON",
"priority_level": "high",
"status": "draft",
"research_completed": false,
"resume_optimized": false,
"cover_letter_generated": false,
"created_at": "2025-07-01T10:00:00Z",
"updated_at": "2025-07-01T10:00:00Z"
}
```
### Application Endpoints
#### POST /api/v1/applications
Create new job application.
**Headers:** `Authorization: Bearer <token>`
**Request:**
```json
{
"company_name": "Google",
"role_title": "Senior Developer",
"job_description": "We are looking for an experienced developer...",
"job_url": "https://careers.google.com/jobs/123",
"location": "Toronto, ON",
"priority_level": "high",
"additional_context": "Found through LinkedIn, know someone there"
}
```
**Response (201):**
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"name": "google_senior_developer_2025_07_01",
"company_name": "Google",
"role_title": "Senior Developer",
"job_url": "https://careers.google.com/jobs/123",
"job_description": "We are looking for an experienced developer...",
"location": "Toronto, ON",
"priority_level": "high",
"status": "draft",
"research_completed": false,
"resume_optimized": false,
"cover_letter_generated": false,
"created_at": "2025-07-01T10:00:00Z",
"updated_at": "2025-07-01T10:00:00Z"
}
```
**Validation Rules:**
- `company_name`: Required, 1-255 characters
- `role_title`: Required, 1-255 characters
- `job_description`: Required, minimum 50 characters
- `job_url`: Optional, valid URL format
- `priority_level`: Optional, enum: `low|medium|high`
**Errors:**
- `400` - Validation errors
- `401` - Unauthorized
#### GET /api/v1/applications
List user's applications.
**Headers:** `Authorization: Bearer <token>`
**Query Parameters:**
- `status`: Filter by status (optional)
- `priority`: Filter by priority level (optional)
- `limit`: Number of results (default: 50, max: 100)
- `offset`: Pagination offset (default: 0)
**Response (200):**
```json
{
"applications": [
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"name": "google_senior_developer_2025_07_01",
"company_name": "Google",
"role_title": "Senior Developer",
"status": "research_complete",
"priority_level": "high",
"research_completed": true,
"resume_optimized": false,
"cover_letter_generated": false,
"created_at": "2025-07-01T10:00:00Z",
"updated_at": "2025-07-01T11:30:00Z"
}
],
"total": 1,
"limit": 50,
"offset": 0
}
```
#### GET /api/v1/applications/{application_id}
Get specific application details.
**Headers:** `Authorization: Bearer <token>`
**Response (200):** Full application object (see Application Model above)
**Errors:**
- `404` - Application not found or not owned by user
- `401` - Unauthorized
#### PUT /api/v1/applications/{application_id}
Update application details.
**Headers:** `Authorization: Bearer <token>`
**Request:**
```json
{
"company_name": "Google Inc.",
"location": "Toronto, ON, Canada",
"priority_level": "medium"
}
```
**Response (200):** Updated application object
**Errors:**
- `404` - Application not found
- `400` - Validation errors
- `401` - Unauthorized
#### DELETE /api/v1/applications/{application_id}
Delete application and all associated documents.
**Headers:** `Authorization: Bearer <token>`
**Response (204):** No content
**Errors:**
- `404` - Application not found
- `401` - Unauthorized
---
## 📄 Documents API
### Document Model
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"application_id": "456e7890-e89b-12d3-a456-426614174000",
"document_type": "research_report",
"content": "# Research Report\n\n## Job Analysis\n...",
"created_at": "2025-07-01T10:30:00Z",
"updated_at": "2025-07-01T10:30:00Z"
}
```
### Document Endpoints
#### GET /api/v1/applications/{application_id}/documents
Get all documents for an application.
**Headers:** `Authorization: Bearer <token>`
**Response (200):**
```json
{
"research_report": {
"id": "123e4567-e89b-12d3-a456-426614174000",
"content": "# Research Report\n\n## Job Analysis\n...",
"created_at": "2025-07-01T10:30:00Z",
"updated_at": "2025-07-01T10:30:00Z"
},
"optimized_resume": {
"id": "234e5678-e89b-12d3-a456-426614174000",
"content": "# John Doe\n\n## Experience\n...",
"created_at": "2025-07-01T11:00:00Z",
"updated_at": "2025-07-01T11:00:00Z"
},
"cover_letter": null
}
```
#### GET /api/v1/applications/{application_id}/documents/{document_type}
Get specific document.
**Headers:** `Authorization: Bearer <token>`
**URL Parameters:**
- `document_type`: enum: `research_report|optimized_resume|cover_letter`
**Response (200):**
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"application_id": "456e7890-e89b-12d3-a456-426614174000",
"document_type": "research_report",
"content": "# Research Report\n\n## Job Analysis\n...",
"created_at": "2025-07-01T10:30:00Z",
"updated_at": "2025-07-01T10:30:00Z"
}
```
**Errors:**
- `404` - Document not found or application not owned by user
#### PUT /api/v1/applications/{application_id}/documents/{document_type}
Update document content (user editing).
**Headers:** `Authorization: Bearer <token>`
**Request:**
```json
{
"content": "# Updated Research Report\n\n## Job Analysis\nUpdated content..."
}
```
**Response (200):** Updated document object
**Validation:**
- `content`: Required, minimum 10 characters
**Errors:**
- `404` - Document or application not found
- `400` - Validation errors
---
## 🤖 AI Processing API
### Processing Status Model
```json
{
"application_id": "123e4567-e89b-12d3-a456-426614174000",
"current_phase": "research",
"status": "processing",
"progress": 0.6,
"estimated_completion": "2025-07-01T10:35:00Z",
"error_message": null
}
```
### Processing Endpoints
#### POST /api/v1/processing/applications/{application_id}/research
Start research phase processing.
**Headers:** `Authorization: Bearer <token>`
**Response (202):**
```json
{
"message": "Research phase started",
"application_id": "123e4567-e89b-12d3-a456-426614174000",
"estimated_completion": "2025-07-01T10:35:00Z"
}
```
**Errors:**
- `404` - Application not found
- `409` - Research already completed
- `400` - Application not in correct state
#### POST /api/v1/processing/applications/{application_id}/resume
Start resume optimization phase.
**Headers:** `Authorization: Bearer <token>`
**Requirements:** Research phase must be completed
**Response (202):**
```json
{
"message": "Resume optimization started",
"application_id": "123e4567-e89b-12d3-a456-426614174000",
"estimated_completion": "2025-07-01T11:05:00Z"
}
```
**Errors:**
- `404` - Application not found
- `409` - Resume already optimized
- `412` - Research phase not completed
#### POST /api/v1/processing/applications/{application_id}/cover-letter
Start cover letter generation phase.
**Headers:** `Authorization: Bearer <token>`
**Request:**
```json
{
"additional_context": "I'm particularly interested in their AI/ML projects. I have experience with TensorFlow and PyTorch."
}
```
**Requirements:** Resume optimization must be completed
**Response (202):**
```json
{
"message": "Cover letter generation started",
"application_id": "123e4567-e89b-12d3-a456-426614174000",
"estimated_completion": "2025-07-01T11:15:00Z"
}
```
**Errors:**
- `404` - Application not found
- `409` - Cover letter already generated
- `412` - Resume optimization not completed
#### GET /api/v1/processing/applications/{application_id}/status
Get current processing status.
**Headers:** `Authorization: Bearer <token>`
**Response (200):**
```json
{
"application_id": "123e4567-e89b-12d3-a456-426614174000",
"current_phase": "resume",
"status": "completed",
"progress": 1.0,
"completed_at": "2025-07-01T11:05:00Z",
"error_message": null
}
```
**Status Values:**
- `idle` - No processing active
- `processing` - AI generation in progress
- `completed` - Phase completed successfully
- `failed` - Processing failed with error
---
## 👤 User Resumes API
### Resume Model
```json
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"name": "Technical Resume",
"content": "# John Doe\n\n## Technical Skills\n...",
"focus_area": "software_development",
"is_primary": true,
"created_at": "2025-07-01T09:00:00Z",
"updated_at": "2025-07-01T09:00:00Z"
}
```
### Resume Endpoints
#### GET /api/v1/resumes
Get user's resume library.
**Headers:** `Authorization: Bearer <token>`
**Response (200):**
```json
{
"resumes": [
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"name": "Technical Resume",
"focus_area": "software_development",
"is_primary": true,
"created_at": "2025-07-01T09:00:00Z",
"updated_at": "2025-07-01T09:00:00Z"
}
]
}
```
#### POST /api/v1/resumes
Upload new resume to library.
**Headers:** `Authorization: Bearer <token>`
**Request:**
```json
{
"name": "Management Resume",
"content": "# John Doe\n\n## Leadership Experience\n...",
"focus_area": "management",
"is_primary": false
}
```
**Response (201):** Created resume object
**Validation:**
- `name`: Required, 1-255 characters
- `content`: Required, minimum 100 characters
- `focus_area`: Optional, enum: `software_development|management|data_science|consulting|other`
#### GET /api/v1/resumes/{resume_id}
Get specific resume details.
**Headers:** `Authorization: Bearer <token>`
**Response (200):** Full resume object
#### PUT /api/v1/resumes/{resume_id}
Update resume content.
**Headers:** `Authorization: Bearer <token>`
**Request:** Same as POST
**Response (200):** Updated resume object
#### DELETE /api/v1/resumes/{resume_id}
Delete resume from library.
**Headers:** `Authorization: Bearer <token>`
**Response (204):** No content
**Errors:**
- `409` - Cannot delete primary resume if it's the only one
---
## 🚨 Error Handling
### Standard Error Response
```json
{
"error": {
"code": "VALIDATION_ERROR",
"message": "Invalid input data",
"details": {
"company_name": ["This field is required"],
"job_description": ["Must be at least 50 characters"]
}
},
"timestamp": "2025-07-01T10:00:00Z",
"path": "/api/v1/applications"
}
```
### HTTP Status Codes
- `200` - Success
- `201` - Created successfully
- `202` - Accepted (async processing started)
- `204` - No content (successful deletion)
- `400` - Bad request (validation errors)
- `401` - Unauthorized (invalid/missing token)
- `403` - Forbidden (valid token, insufficient permissions)
- `404` - Not found
- `409` - Conflict (duplicate email, invalid state transition)
- `412` - Precondition failed (phase not completed)
- `422` - Unprocessable entity (semantic errors)
- `500` - Internal server error
### Error Codes
- `VALIDATION_ERROR` - Input validation failed
- `AUTHENTICATION_ERROR` - Invalid credentials
- `AUTHORIZATION_ERROR` - Insufficient permissions
- `NOT_FOUND` - Resource not found
- `DUPLICATE_RESOURCE` - Resource already exists
- `INVALID_STATE` - Operation not valid for current state
- `EXTERNAL_API_ERROR` - Claude/OpenAI API error
- `PROCESSING_ERROR` - AI processing failed
---
## 🔧 Development Notes
### Rate Limiting (Future)
- Not implemented in MVP
- Will be added in Phase 2 for SaaS
### Pagination
- Default limit: 50
- Maximum limit: 100
- Use `offset` for pagination
### Content Validation
- Job description: 50-10000 characters
- Resume content: 100-50000 characters
- Names: 1-255 characters
- URLs: Valid HTTP/HTTPS format
### Background Processing
- AI operations run asynchronously
- Use `/processing/applications/{id}/status` to check progress
- Frontend should poll every 2-3 seconds during processing
---
*This API specification covers all endpoints required for MVP implementation. Use the OpenAPI documentation at `/docs` for interactive testing during development.*

651
docs/database_design.md Normal file
View File

@@ -0,0 +1,651 @@
# JobForge MVP - Database Design & Schema
**Version:** 1.0.0 MVP
**Database:** PostgreSQL 16 with pgvector
**Target Audience:** Backend Developers
**Last Updated:** July 2025
---
## 🎯 Database Overview
### Technology Stack
- **Database:** PostgreSQL 16
- **Extensions:** pgvector (for AI embeddings)
- **Security:** Row Level Security (RLS) for multi-tenancy
- **Connection:** AsyncPG with SQLAlchemy 2.0
- **Migrations:** Direct SQL for MVP (Alembic in Phase 2)
### Design Principles
- **User Isolation:** Complete data separation between users
- **Data Integrity:** Foreign key constraints and validation
- **Performance:** Optimized indexes for common queries
- **Security:** RLS policies prevent cross-user data access
- **Scalability:** Schema designed for future SaaS features
---
## 📊 Entity Relationship Diagram
```mermaid
erDiagram
USERS ||--o{ APPLICATIONS : creates
USERS ||--o{ USER_RESUMES : owns
APPLICATIONS ||--o{ DOCUMENTS : contains
DOCUMENTS ||--o| DOCUMENT_EMBEDDINGS : has_embedding
USERS {
uuid id PK
varchar email UK
varchar password_hash
varchar full_name
timestamp created_at
timestamp updated_at
}
APPLICATIONS {
uuid id PK
uuid user_id FK
varchar name
varchar company_name
varchar role_title
text job_url
text job_description
varchar location
varchar priority_level
varchar status
boolean research_completed
boolean resume_optimized
boolean cover_letter_generated
timestamp created_at
timestamp updated_at
}
DOCUMENTS {
uuid id PK
uuid application_id FK
varchar document_type
text content
timestamp created_at
timestamp updated_at
}
USER_RESUMES {
uuid id PK
uuid user_id FK
varchar name
text content
varchar focus_area
boolean is_primary
timestamp created_at
timestamp updated_at
}
DOCUMENT_EMBEDDINGS {
uuid id PK
uuid document_id FK
vector embedding
timestamp created_at
}
```
---
## 🗄️ Complete Database Schema
### Database Initialization
```sql
-- Enable required extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS vector;
-- Create custom types
CREATE TYPE priority_level_type AS ENUM ('low', 'medium', 'high');
CREATE TYPE application_status_type AS ENUM (
'draft',
'research_complete',
'resume_ready',
'cover_letter_ready'
);
CREATE TYPE document_type_enum AS ENUM (
'research_report',
'optimized_resume',
'cover_letter'
);
CREATE TYPE focus_area_type AS ENUM (
'software_development',
'data_science',
'management',
'consulting',
'other'
);
```
### Core Tables
#### Users Table
```sql
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
full_name VARCHAR(255) NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
-- Constraints
CONSTRAINT email_format CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'),
CONSTRAINT name_not_empty CHECK (LENGTH(TRIM(full_name)) > 0)
);
-- Indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_created_at ON users(created_at);
-- Row Level Security
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
-- Users can only see their own record
CREATE POLICY users_own_data ON users
FOR ALL
USING (id = current_setting('app.current_user_id')::UUID);
```
#### Applications Table
```sql
CREATE TABLE applications (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
company_name VARCHAR(255) NOT NULL,
role_title VARCHAR(255) NOT NULL,
job_url TEXT,
job_description TEXT NOT NULL,
location VARCHAR(255),
priority_level priority_level_type DEFAULT 'medium',
status application_status_type DEFAULT 'draft',
-- Phase tracking
research_completed BOOLEAN DEFAULT FALSE,
resume_optimized BOOLEAN DEFAULT FALSE,
cover_letter_generated BOOLEAN DEFAULT FALSE,
-- Timestamps
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
-- Constraints
CONSTRAINT job_description_min_length CHECK (LENGTH(job_description) >= 50),
CONSTRAINT company_name_not_empty CHECK (LENGTH(TRIM(company_name)) > 0),
CONSTRAINT role_title_not_empty CHECK (LENGTH(TRIM(role_title)) > 0),
CONSTRAINT valid_job_url CHECK (
job_url IS NULL OR
job_url ~* '^https?://[^\s/$.?#].[^\s]*$'
),
-- Business logic constraints
CONSTRAINT resume_requires_research CHECK (
NOT resume_optimized OR research_completed
),
CONSTRAINT cover_letter_requires_resume CHECK (
NOT cover_letter_generated OR resume_optimized
)
);
-- Indexes
CREATE INDEX idx_applications_user_id ON applications(user_id);
CREATE INDEX idx_applications_status ON applications(status);
CREATE INDEX idx_applications_priority ON applications(priority_level);
CREATE INDEX idx_applications_created_at ON applications(created_at);
CREATE INDEX idx_applications_company_name ON applications(company_name);
-- Full text search index for job descriptions
CREATE INDEX idx_applications_job_description_fts
ON applications USING gin(to_tsvector('english', job_description));
-- Row Level Security
ALTER TABLE applications ENABLE ROW LEVEL SECURITY;
CREATE POLICY applications_user_access ON applications
FOR ALL
USING (user_id = current_setting('app.current_user_id')::UUID);
```
#### Documents Table
```sql
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
application_id UUID NOT NULL REFERENCES applications(id) ON DELETE CASCADE,
document_type document_type_enum NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
-- Constraints
CONSTRAINT content_min_length CHECK (LENGTH(content) >= 10),
CONSTRAINT unique_document_per_application UNIQUE (application_id, document_type)
);
-- Indexes
CREATE INDEX idx_documents_application_id ON documents(application_id);
CREATE INDEX idx_documents_type ON documents(document_type);
CREATE INDEX idx_documents_updated_at ON documents(updated_at);
-- Full text search index for document content
CREATE INDEX idx_documents_content_fts
ON documents USING gin(to_tsvector('english', content));
-- Row Level Security
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
CREATE POLICY documents_user_access ON documents
FOR ALL
USING (
application_id IN (
SELECT id FROM applications
WHERE user_id = current_setting('app.current_user_id')::UUID
)
);
```
#### User Resumes Table
```sql
CREATE TABLE user_resumes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
focus_area focus_area_type DEFAULT 'other',
is_primary BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
-- Constraints
CONSTRAINT resume_name_not_empty CHECK (LENGTH(TRIM(name)) > 0),
CONSTRAINT resume_content_min_length CHECK (LENGTH(content) >= 100),
-- Only one primary resume per user
CONSTRAINT unique_primary_resume UNIQUE (user_id, is_primary)
DEFERRABLE INITIALLY DEFERRED
);
-- Indexes
CREATE INDEX idx_user_resumes_user_id ON user_resumes(user_id);
CREATE INDEX idx_user_resumes_focus_area ON user_resumes(focus_area);
CREATE INDEX idx_user_resumes_is_primary ON user_resumes(is_primary);
-- Full text search index for resume content
CREATE INDEX idx_user_resumes_content_fts
ON user_resumes USING gin(to_tsvector('english', content));
-- Row Level Security
ALTER TABLE user_resumes ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_resumes_access ON user_resumes
FOR ALL
USING (user_id = current_setting('app.current_user_id')::UUID);
```
#### Document Embeddings Table (AI Features)
```sql
CREATE TABLE document_embeddings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
document_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
embedding vector(1536), -- OpenAI text-embedding-3-large dimension
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
-- Constraints
CONSTRAINT unique_embedding_per_document UNIQUE (document_id)
);
-- Vector similarity index
CREATE INDEX idx_document_embeddings_vector
ON document_embeddings USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- Regular indexes
CREATE INDEX idx_document_embeddings_document_id ON document_embeddings(document_id);
-- Row Level Security
ALTER TABLE document_embeddings ENABLE ROW LEVEL SECURITY;
CREATE POLICY document_embeddings_access ON document_embeddings
FOR ALL
USING (
document_id IN (
SELECT d.id FROM documents d
JOIN applications a ON d.application_id = a.id
WHERE a.user_id = current_setting('app.current_user_id')::UUID
)
);
```
---
## 🔒 Security Policies
### Row Level Security Overview
All tables with user data have RLS enabled to ensure complete data isolation:
```sql
-- Function to get current user ID from session
CREATE OR REPLACE FUNCTION get_current_user_id()
RETURNS UUID AS $$
BEGIN
RETURN current_setting('app.current_user_id')::UUID;
EXCEPTION
WHEN others THEN
RETURN NULL;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
-- Helper function to check if user owns application
CREATE OR REPLACE FUNCTION user_owns_application(app_id UUID)
RETURNS BOOLEAN AS $$
BEGIN
RETURN EXISTS (
SELECT 1 FROM applications
WHERE id = app_id
AND user_id = get_current_user_id()
);
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
```
### Setting User Context
Backend must set user context for each request:
```python
# In FastAPI dependency
async def set_user_context(user: User = Depends(get_current_user)):
async with get_db_connection() as conn:
await conn.execute(
"SET LOCAL app.current_user_id = %s",
str(user.id)
)
return user
```
---
## 🚀 Database Functions
### Trigger Functions
```sql
-- Update timestamp trigger function
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Apply to all tables with updated_at
CREATE TRIGGER update_users_updated_at
BEFORE UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
CREATE TRIGGER update_applications_updated_at
BEFORE UPDATE ON applications
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
CREATE TRIGGER update_documents_updated_at
BEFORE UPDATE ON documents
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
CREATE TRIGGER update_user_resumes_updated_at
BEFORE UPDATE ON user_resumes
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
```
### Business Logic Functions
```sql
-- Generate application name
CREATE OR REPLACE FUNCTION generate_application_name(
p_company_name VARCHAR,
p_role_title VARCHAR
) RETURNS VARCHAR AS $$
DECLARE
clean_company VARCHAR;
clean_role VARCHAR;
date_suffix VARCHAR;
BEGIN
-- Clean and normalize names
clean_company := LOWER(REGEXP_REPLACE(p_company_name, '[^a-zA-Z0-9]', '_', 'g'));
clean_role := LOWER(REGEXP_REPLACE(p_role_title, '[^a-zA-Z0-9]', '_', 'g'));
date_suffix := TO_CHAR(NOW(), 'YYYY_MM_DD');
RETURN clean_company || '_' || clean_role || '_' || date_suffix;
END;
$$ LANGUAGE plpgsql;
-- Update application phases trigger
CREATE OR REPLACE FUNCTION update_application_phases()
RETURNS TRIGGER AS $$
BEGIN
-- Auto-update phase completion based on document existence
IF TG_OP = 'INSERT' OR TG_OP = 'UPDATE' THEN
UPDATE applications SET
research_completed = EXISTS (
SELECT 1 FROM documents
WHERE application_id = NEW.application_id
AND document_type = 'research_report'
),
resume_optimized = EXISTS (
SELECT 1 FROM documents
WHERE application_id = NEW.application_id
AND document_type = 'optimized_resume'
),
cover_letter_generated = EXISTS (
SELECT 1 FROM documents
WHERE application_id = NEW.application_id
AND document_type = 'cover_letter'
)
WHERE id = NEW.application_id;
-- Update status based on completion
UPDATE applications SET
status = CASE
WHEN cover_letter_generated THEN 'cover_letter_ready'
WHEN resume_optimized THEN 'resume_ready'
WHEN research_completed THEN 'research_complete'
ELSE 'draft'
END
WHERE id = NEW.application_id;
END IF;
RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER documents_update_phases
AFTER INSERT OR UPDATE OR DELETE ON documents
FOR EACH ROW EXECUTE FUNCTION update_application_phases();
```
---
## 📈 Performance Optimization
### Query Optimization
```sql
-- Most common query patterns with optimized indexes
-- 1. Get user applications (paginated)
-- Index: idx_applications_user_id, idx_applications_created_at
SELECT * FROM applications
WHERE user_id = $1
ORDER BY created_at DESC
LIMIT $2 OFFSET $3;
-- 2. Get application with documents
-- Index: idx_documents_application_id
SELECT a.*, d.document_type, d.content
FROM applications a
LEFT JOIN documents d ON a.id = d.application_id
WHERE a.id = $1 AND a.user_id = $2;
-- 3. Search applications by company/role
-- Index: idx_applications_company_name, full-text search
SELECT * FROM applications
WHERE user_id = $1
AND (
company_name ILIKE $2
OR role_title ILIKE $3
OR to_tsvector('english', job_description) @@ plainto_tsquery('english', $4)
)
ORDER BY created_at DESC;
```
### Connection Pooling
```python
# SQLAlchemy async engine configuration
engine = create_async_engine(
DATABASE_URL,
pool_size=20, # Connection pool size
max_overflow=30, # Additional connections beyond pool_size
pool_pre_ping=True, # Validate connections before use
pool_recycle=3600, # Recycle connections every hour
echo=False # Disable SQL logging in production
)
```
---
## 🧪 Test Data Setup
### Development Seed Data
```sql
-- Insert test user (password: "testpass123")
INSERT INTO users (id, email, password_hash, full_name) VALUES (
'123e4567-e89b-12d3-a456-426614174000',
'test@example.com',
'$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/LewgdyN8yF5V4M2kq',
'Test User'
);
-- Insert test resume
INSERT INTO user_resumes (user_id, name, content, focus_area, is_primary) VALUES (
'123e4567-e89b-12d3-a456-426614174000',
'Software Developer Resume',
'# Test User\n\n## Experience\n\nSoftware Developer at Tech Corp...',
'software_development',
true
);
-- Insert test application
INSERT INTO applications (
user_id, name, company_name, role_title,
job_description, status, research_completed
) VALUES (
'123e4567-e89b-12d3-a456-426614174000',
'google_senior_developer_2025_07_01',
'Google',
'Senior Developer',
'We are seeking an experienced software developer to join our team...',
'research_complete',
true
);
```
---
## 🔄 Database Migrations (Future)
### Migration Strategy for Phase 2
When adding Alembic migrations:
```python
# alembic/env.py configuration for RLS
from sqlalchemy import text
def run_migrations_online():
# Set up RLS context for migrations
with engine.connect() as connection:
connection.execute(text("SET row_security = off"))
context.configure(
connection=connection,
target_metadata=target_metadata,
compare_type=True,
compare_server_default=True
)
with context.begin_transaction():
context.run_migrations()
```
### Planned Schema Changes
- **Usage tracking tables** for SaaS billing
- **Subscription management** tables
- **Audit log** tables for compliance
- **Performance metrics** tables
- **Additional indexes** based on production usage
---
## 🛠️ Database Maintenance
### Regular Maintenance Tasks
```sql
-- Vacuum and analyze (run weekly)
VACUUM ANALYZE;
-- Update table statistics
ANALYZE applications;
ANALYZE documents;
ANALYZE user_resumes;
-- Check index usage
SELECT schemaname, tablename, indexname, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_tup_read DESC;
-- Monitor vector index performance
SELECT * FROM pg_stat_user_indexes
WHERE indexname LIKE '%vector%';
```
### Backup Strategy
```bash
# Daily backup script
pg_dump -h localhost -U jobforge_user -d jobforge_mvp \
--clean --if-exists --verbose \
> backup_$(date +%Y%m%d).sql
# Restore from backup
psql -h localhost -U jobforge_user -d jobforge_mvp < backup_20250701.sql
```
---
## 📊 Monitoring Queries
### Performance Monitoring
```sql
-- Slow queries
SELECT query, mean_time, calls, total_time
FROM pg_stat_statements
WHERE mean_time > 100 -- queries slower than 100ms
ORDER BY mean_time DESC;
-- Table sizes
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size,
pg_total_relation_size(schemaname||'.'||tablename) as size_bytes
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY size_bytes DESC;
-- Connection counts
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;
```
---
*This database design provides a solid foundation for the MVP while being prepared for future SaaS features. The RLS policies ensure complete user data isolation, and the schema is optimized for the expected query patterns.*

446
docs/development_setup.md Normal file
View File

@@ -0,0 +1,446 @@
# JobForge MVP - Development Setup Guide
**Version:** 1.0.0 MVP
**Target Audience:** Developers
**Last Updated:** July 2025
---
## 🎯 Prerequisites
### Required Software
- **Docker Desktop** 4.20+ with Docker Compose
- **Git** 2.30+
- **Text Editor** (VS Code recommended)
- **API Keys** (Claude, OpenAI)
### System Requirements
- **RAM:** 8GB minimum (Docker containers + database)
- **Storage:** 10GB free space
- **OS:** Windows 10+, macOS 12+, or Linux
---
## 🚀 Quick Start (5 Minutes)
### 1. Clone Repository
```bash
git clone https://github.com/your-org/jobforge-mvp.git
cd jobforge-mvp
```
### 2. Environment Configuration
```bash
# Copy environment template
cp .env.example .env
# Edit .env file with your API keys
nano .env # or use your preferred editor
```
**Required Environment Variables:**
```bash
# API Keys (REQUIRED)
CLAUDE_API_KEY=your_claude_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
# Database (Auto-configured for local development)
DATABASE_URL=postgresql+asyncpg://jobforge_user:jobforge_password@postgres:5432/jobforge_mvp
# JWT Secret (Generate random string)
JWT_SECRET_KEY=your-super-secret-jwt-key-change-this-in-production
# Development Settings
DEBUG=true
LOG_LEVEL=INFO
```
### 3. Start Development Environment
```bash
# Start all services (PostgreSQL, Backend, Frontend)
docker-compose up -d
# View logs to ensure everything started correctly
docker-compose logs -f
```
### 4. Verify Installation
- **Frontend:** http://localhost:8501
- **Backend API:** http://localhost:8000
- **API Documentation:** http://localhost:8000/docs
- **Database:** localhost:5432
---
## 🔧 Detailed Setup Instructions
### Getting API Keys
#### Claude API Key
1. Visit https://console.anthropic.com/
2. Create account or log in
3. Go to "API Keys" section
4. Create new key with name "JobForge Development"
5. Copy key to `.env` file
#### OpenAI API Key
1. Visit https://platform.openai.com/api-keys
2. Create account or log in
3. Click "Create new secret key"
4. Name it "JobForge Development"
5. Copy key to `.env` file
### Environment File Setup
```bash
# .env file (copy from .env.example)
# =============================================================================
# API KEYS - REQUIRED FOR DEVELOPMENT
# =============================================================================
CLAUDE_API_KEY=sk-ant-api03-xxx...
OPENAI_API_KEY=sk-xxx...
# =============================================================================
# DATABASE CONFIGURATION
# =============================================================================
DATABASE_URL=postgresql+asyncpg://jobforge_user:jobforge_password@postgres:5432/jobforge_mvp
POSTGRES_DB=jobforge_mvp
POSTGRES_USER=jobforge_user
POSTGRES_PASSWORD=jobforge_password
# =============================================================================
# AUTHENTICATION
# =============================================================================
JWT_SECRET_KEY=super-secret-jwt-key-minimum-32-characters-long
JWT_ALGORITHM=HS256
JWT_EXPIRE_HOURS=24
# =============================================================================
# APPLICATION SETTINGS
# =============================================================================
DEBUG=true
LOG_LEVEL=INFO
BACKEND_URL=http://backend:8000
# =============================================================================
# AI PROCESSING SETTINGS
# =============================================================================
CLAUDE_MODEL=claude-sonnet-4-20250514
OPENAI_EMBEDDING_MODEL=text-embedding-3-large
MAX_PROCESSING_TIME_SECONDS=120
```
---
## 🐳 Docker Setup
### Docker Compose Configuration
The `docker-compose.yml` file configures three main services:
#### PostgreSQL Database
- **Port:** 5432
- **Database:** jobforge_mvp
- **Extensions:** pgvector for AI embeddings
- **Data:** Persisted in Docker volume
#### Backend API (FastAPI)
- **Port:** 8000
- **Auto-reload:** Enabled for development
- **API Docs:** http://localhost:8000/docs
#### Frontend App (Dash)
- **Port:** 8501
- **Auto-reload:** Enabled for development
### Docker Commands
#### Start Services
```bash
# Start all services in background
docker-compose up -d
# Start with logs visible
docker-compose up
# Start specific service
docker-compose up backend
```
#### View Logs
```bash
# All services logs
docker-compose logs -f
# Specific service logs
docker-compose logs -f backend
docker-compose logs -f frontend
docker-compose logs -f postgres
```
#### Stop Services
```bash
# Stop all services
docker-compose down
# Stop and remove volumes (WARNING: Deletes database data)
docker-compose down -v
```
#### Rebuild Services
```bash
# Rebuild after code changes
docker-compose build
# Rebuild specific service
docker-compose build backend
# Rebuild and restart
docker-compose up --build
```
---
## 🗄️ Database Setup
### Automatic Database Initialization
The database is automatically set up when you first run `docker-compose up`:
1. **PostgreSQL starts** with pgvector extension
2. **Database created** with name `jobforge_mvp`
3. **Tables created** from `database/init.sql`
4. **Row Level Security** policies applied
5. **Sample data** inserted (optional)
### Manual Database Operations
#### Connect to Database
```bash
# Connect via Docker
docker-compose exec postgres psql -U jobforge_user -d jobforge_mvp
# Connect from host (if PostgreSQL client installed)
psql -h localhost -p 5432 -U jobforge_user -d jobforge_mvp
```
#### Reset Database
```bash
# WARNING: This deletes all data
docker-compose down -v
docker-compose up -d postgres
```
#### Database Migrations (Future)
```bash
# When Alembic is added
docker-compose exec backend alembic upgrade head
```
### Database Schema Verification
After startup, verify tables exist:
```sql
-- Connect to database and run:
\dt
-- Expected tables:
-- users
-- applications
-- documents
-- user_resumes
-- document_embeddings
```
---
## 🔍 Development Workflow
### Code Organization
```
src/
├── backend/ # FastAPI backend code
│ ├── main.py # FastAPI app entry point
│ ├── api/ # API route handlers
│ ├── services/ # Business logic
│ ├── database/ # Database models and connection
│ └── models/ # Pydantic request/response models
├── frontend/ # Dash frontend code
│ ├── main.py # Dash app entry point
│ ├── components/ # Reusable UI components
│ ├── pages/ # Page components
│ └── api_client/ # Backend API client
├── agents/ # AI processing agents
└── helpers/ # Shared utilities
```
### Making Code Changes
#### Backend Changes
1. **Modify code** in `src/backend/`
2. **FastAPI auto-reloads** automatically
3. **Test changes** at http://localhost:8000/docs
#### Frontend Changes
1. **Modify code** in `src/frontend/`
2. **Dash auto-reloads** automatically
3. **Test changes** at http://localhost:8501
#### Database Changes
1. **Modify** `database/init.sql`
2. **Reset database:** `docker-compose down -v && docker-compose up -d`
### Testing Your Setup
#### 1. Backend Health Check
```bash
curl http://localhost:8000/health
# Expected: {"status": "healthy", "service": "jobforge-backend"}
```
#### 2. Database Connection
```bash
curl http://localhost:8000/api/v1/auth/me
# Expected: {"detail": "Not authenticated"} (this is correct - no token)
```
#### 3. Frontend Access
Visit http://localhost:8501 - should see login page
#### 4. API Documentation
Visit http://localhost:8000/docs - should see Swagger UI
---
## 🐛 Troubleshooting
### Common Issues
#### "Port already in use"
```bash
# Check what's using the port
lsof -i :8501 # or :8000, :5432
# Kill the process or change ports in docker-compose.yml
```
#### "API Key Invalid"
```bash
# Verify API key format
echo $CLAUDE_API_KEY # Should start with "sk-ant-api03-"
echo $OPENAI_API_KEY # Should start with "sk-"
# Test API key manually
curl -H "Authorization: Bearer $CLAUDE_API_KEY" https://api.anthropic.com/v1/messages
```
#### "Database Connection Failed"
```bash
# Check if PostgreSQL is running
docker-compose ps postgres
# Check database logs
docker-compose logs postgres
# Try connecting manually
docker-compose exec postgres psql -U jobforge_user -d jobforge_mvp
```
#### "Frontend Won't Load"
```bash
# Check frontend logs
docker-compose logs frontend
# Common issue: Backend not ready
curl http://localhost:8000/health
# Restart frontend
docker-compose restart frontend
```
#### "AI Processing Fails"
```bash
# Check backend logs for API errors
docker-compose logs backend | grep -i error
# Verify API keys are loaded
docker-compose exec backend env | grep API_KEY
```
### Development Tips
#### Hot Reloading
- Both backend and frontend support hot reloading
- Database schema changes require full restart
- Environment variable changes require restart
#### Debugging
```bash
# Backend debugging with detailed logs
DEBUG=true LOG_LEVEL=DEBUG docker-compose up backend
# Frontend debugging
docker-compose exec frontend python src/frontend/main.py --debug
```
#### Performance Monitoring
```bash
# View container resource usage
docker stats
# View database performance
docker-compose exec postgres pg_stat_activity
```
---
## 📊 Development Tools
### Recommended VS Code Extensions
- **Python** (Microsoft)
- **Docker** (Microsoft)
- **PostgreSQL** (Chris Kolkman)
- **REST Client** (Huachao Mao)
- **GitLens** (GitKraken)
### API Testing Tools
- **Built-in Swagger UI:** http://localhost:8000/docs
- **curl commands** (see API specification)
- **Postman** (import OpenAPI spec from `/openapi.json`)
### Database Tools
- **pgAdmin** (web-based PostgreSQL admin)
- **DBeaver** (database IDE)
- **psql** (command line client)
---
## 🚀 Next Steps
Once your environment is running:
1. **Create test account** at http://localhost:8501
2. **Review API documentation** at http://localhost:8000/docs
3. **Follow Database Design** document for schema details
4. **Check Testing Strategy** document for testing approach
5. **Start development** following the 8-week roadmap
---
## 📞 Getting Help
### Development Issues
- Check this troubleshooting section first
- Review Docker logs: `docker-compose logs`
- Verify environment variables: `docker-compose exec backend env`
### API Issues
- Use Swagger UI for interactive testing
- Check API specification document
- Verify authentication headers
### Database Issues
- Connect directly: `docker-compose exec postgres psql -U jobforge_user -d jobforge_mvp`
- Check database logs: `docker-compose logs postgres`
- Review database design document
---
*This setup guide should get you from zero to a working development environment in under 10 minutes. If you encounter issues not covered here, please update this document for future developers.*

631
docs/git_branch_strategy.md Normal file
View File

@@ -0,0 +1,631 @@
# JobForge MVP - Git Branch Management Strategy
**Version:** 1.0.0 MVP
**Repository:** Single monorepo approach
**Platform:** Gitea
**Target Audience:** Development Team
**Last Updated:** July 2025
---
## 🎯 Branching Strategy Overview
### Repository Structure
**Single Monorepo** containing:
- Frontend (Dash + Mantine)
- Backend (FastAPI)
- Database schemas and migrations
- Docker configuration
- Documentation
- Tests
### Core Branching Model
```
main (production-ready)
├── develop (integration branch)
│ ├── feature/user-authentication
│ ├── feature/job-application-crud
│ ├── feature/ai-research-agent
│ ├── feature/resume-optimization
│ └── feature/cover-letter-generator
├── hotfix/critical-security-patch
└── release/v1.0.0-mvp
```
---
## 🌿 Branch Types & Purposes
### 1. **main** (Production Branch)
- **Purpose:** Production-ready code only
- **Protection:** Fully protected, requires PR approval
- **Deployment:** Auto-deploys to production environment
- **Merge Strategy:** Squash and merge from release branches only
**Rules:**
- No direct commits allowed
- All changes via Pull Request from `develop` or `hotfix/*`
- Must pass all CI/CD checks
- Requires at least 1 code review approval
- Must be deployable at any time
### 2. **develop** (Integration Branch)
- **Purpose:** Integration of completed features
- **Protection:** Protected, requires PR approval
- **Deployment:** Auto-deploys to staging environment
- **Merge Strategy:** Merge commits to preserve feature history
**Rules:**
- All feature branches merge here first
- Continuous integration testing
- Regular merges to `main` for releases
- Should be stable enough for testing
### 3. **feature/** (Feature Development)
- **Purpose:** Individual feature development
- **Naming:** `feature/[component]-[description]`
- **Lifecycle:** Created from `develop`, merged back to `develop`
- **Protection:** Optional, team discretion
**Examples:**
```
feature/backend-user-authentication
feature/frontend-application-sidebar
feature/ai-claude-integration
feature/database-rls-policies
feature/docker-development-setup
```
### 4. **hotfix/** (Emergency Fixes)
- **Purpose:** Critical production issues
- **Naming:** `hotfix/[issue-description]`
- **Lifecycle:** Created from `main`, merged to both `main` and `develop`
- **Priority:** Highest priority, fast-track review
### 5. **release/** (Release Preparation)
- **Purpose:** Prepare releases, final testing
- **Naming:** `release/v[version]`
- **Lifecycle:** Created from `develop`, merged to `main` when ready
- **Activities:** Bug fixes, documentation updates, version bumps
---
## 🔄 Development Workflow
### Standard Feature Development Flow
#### 1. Start New Feature
```bash
# Ensure develop is up to date
git checkout develop
git pull origin develop
# Create feature branch
git checkout -b feature/backend-application-service
git push -u origin feature/backend-application-service
```
#### 2. Development Cycle
```bash
# Regular commits with descriptive messages
git add .
git commit -m "feat(backend): implement application creation endpoint
- Add POST /api/v1/applications endpoint
- Implement application validation
- Add database integration
- Include unit tests
Closes #23"
# Push regularly to backup work
git push origin feature/backend-application-service
```
#### 3. Feature Completion
```bash
# Update with latest develop changes
git checkout develop
git pull origin develop
git checkout feature/backend-application-service
git rebase develop
# Push updated branch
git push origin feature/backend-application-service --force-with-lease
# Create Pull Request via Gitea UI
```
#### 4. Code Review & Merge
- **Create PR** from feature branch to `develop`
- **Code review** by at least 1 team member
- **CI/CD checks** must pass (tests, linting, etc.)
- **Merge** using "Merge commit" strategy
- **Delete** feature branch after merge
---
## 📋 Pull Request Guidelines
### PR Title Format
```
[type](scope): brief description
Examples:
feat(backend): add user authentication endpoints
fix(frontend): resolve sidebar navigation bug
docs(api): update endpoint documentation
test(database): add RLS policy tests
```
### PR Description Template
```markdown
## 🎯 Purpose
Brief description of what this PR accomplishes.
## 🔧 Changes Made
- [ ] Add new API endpoint for application creation
- [ ] Implement database integration
- [ ] Add unit tests with 90% coverage
- [ ] Update API documentation
## 🧪 Testing
- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing completed
- [ ] Database migrations tested
## 📚 Documentation
- [ ] API documentation updated
- [ ] README updated if needed
- [ ] Code comments added for complex logic
## 🔍 Review Checklist
- [ ] Code follows project style guidelines
- [ ] No hardcoded secrets or credentials
- [ ] Error handling implemented
- [ ] Security considerations addressed
## 🔗 Related Issues
Closes #123
Relates to #456
```
### Review Criteria
**Mandatory Checks:**
- [ ] All CI/CD pipeline checks pass
- [ ] No merge conflicts with target branch
- [ ] At least 1 peer code review approval
- [ ] Tests cover new functionality
- [ ] Documentation updated
**Code Quality Checks:**
- [ ] Follows established coding standards
- [ ] No security vulnerabilities introduced
- [ ] Performance considerations addressed
- [ ] Error handling implemented properly
---
## 🚀 Release Management
### MVP Release Strategy
#### Phase 1 Releases (Weeks 1-8)
```
v0.1.0 - Week 2: Basic infrastructure
v0.2.0 - Week 4: User auth + application CRUD
v0.3.0 - Week 6: AI agents integration
v1.0.0 - Week 8: Complete MVP
```
#### Release Process
1. **Create Release Branch**
```bash
git checkout develop
git pull origin develop
git checkout -b release/v1.0.0-mvp
git push -u origin release/v1.0.0-mvp
```
2. **Prepare Release**
- Update version numbers in package files
- Update CHANGELOG.md
- Final testing and bug fixes
- Documentation review
3. **Release to Production**
```bash
# Create PR: release/v1.0.0-mvp → main
# After approval and merge:
git checkout main
git pull origin main
git tag -a v1.0.0 -m "MVP Release v1.0.0"
git push origin v1.0.0
```
4. **Post-Release Cleanup**
```bash
# Merge changes back to develop
git checkout develop
git merge main
git push origin develop
# Delete release branch
git branch -d release/v1.0.0-mvp
git push origin --delete release/v1.0.0-mvp
```
---
## 🔒 Branch Protection Rules
### main Branch Protection
```yaml
Protection Rules:
- Require pull request reviews: true
- Required approving reviews: 1
- Dismiss stale reviews: true
- Require review from code owners: true
- Restrict pushes to admins only: true
- Require status checks: true
- Required status checks:
- ci/backend-tests
- ci/frontend-tests
- ci/integration-tests
- ci/security-scan
- Require branches to be up to date: true
- Include administrators: true
```
### develop Branch Protection
```yaml
Protection Rules:
- Require pull request reviews: true
- Required approving reviews: 1
- Require status checks: true
- Required status checks:
- ci/backend-tests
- ci/frontend-tests
- ci/lint-check
- Require branches to be up to date: true
```
---
## 🤖 CI/CD Integration
### Automated Workflows by Branch
#### feature/* branches
```yaml
# .gitea/workflows/feature.yml
on:
push:
branches:
- 'feature/*'
pull_request:
branches:
- develop
jobs:
- lint-and-format
- unit-tests
- security-scan
- build-check
```
#### develop branch
```yaml
# .gitea/workflows/develop.yml
on:
push:
branches:
- develop
jobs:
- lint-and-format
- unit-tests
- integration-tests
- security-scan
- build-and-deploy-staging
```
#### main branch
```yaml
# .gitea/workflows/production.yml
on:
push:
branches:
- main
tags:
- 'v*'
jobs:
- full-test-suite
- security-scan
- build-and-deploy-production
- create-release-notes
```
---
## 📊 Development Team Workflow
### Team Roles & Responsibilities
#### **Backend Developer**
```bash
# Typical feature branches:
feature/backend-auth-service
feature/backend-application-api
feature/backend-ai-integration
feature/database-schema-updates
```
#### **Frontend Developer**
```bash
# Typical feature branches:
feature/frontend-auth-components
feature/frontend-application-dashboard
feature/frontend-document-editor
feature/ui-component-library
```
#### **Full-Stack Developer**
```bash
# End-to-end feature branches:
feature/complete-user-registration
feature/complete-application-workflow
feature/complete-document-management
```
#### **DevOps/Infrastructure**
```bash
# Infrastructure branches:
feature/docker-optimization
feature/ci-cd-pipeline
feature/monitoring-setup
feature/deployment-automation
```
### Daily Development Routine
#### Morning Sync
```bash
# Start each day with latest changes
git checkout develop
git pull origin develop
# Update feature branch
git checkout feature/your-current-feature
git rebase develop
# Resolve any conflicts and continue work
```
#### End of Day
```bash
# Commit and push daily progress
git add .
git commit -m "wip: progress on application service implementation"
git push origin feature/your-current-feature
```
---
## 🐛 Handling Common Scenarios
### Scenario 1: Feature Branch Behind develop
```bash
# Update feature branch with latest develop
git checkout feature/your-feature
git rebase develop
# If conflicts occur:
git status # See conflicted files
# Edit files to resolve conflicts
git add .
git rebase --continue
# Force push with lease (safer than --force)
git push origin feature/your-feature --force-with-lease
```
### Scenario 2: Emergency Hotfix
```bash
# Create hotfix from main
git checkout main
git pull origin main
git checkout -b hotfix/critical-security-fix
# Make fix
git add .
git commit -m "fix: resolve critical authentication vulnerability
- Patch JWT token validation
- Update security tests
- Add rate limiting
Fixes #emergency-issue"
# Push and create PRs to both main and develop
git push -u origin hotfix/critical-security-fix
# Create PR: hotfix/critical-security-fix → main (priority)
# Create PR: hotfix/critical-security-fix → develop
```
### Scenario 3: Large Feature Coordination
```bash
# For complex features requiring multiple developers:
# Main feature branch
feature/ai-agents-integration
# Sub-feature branches
feature/ai-agents-integration/research-agent
feature/ai-agents-integration/resume-optimizer
feature/ai-agents-integration/cover-letter-generator
# Merge sub-features to main feature branch first
# Then merge main feature branch to develop
```
---
## 📈 Branch Management Best Practices
### DO's ✅
- **Keep branches focused** - One feature per branch
- **Use descriptive names** - Clear what the branch does
- **Regular commits** - Small, focused commits with good messages
- **Rebase before merge** - Keep history clean
- **Delete merged branches** - Avoid branch pollution
- **Test before merge** - Ensure CI/CD passes
- **Review code thoroughly** - Maintain code quality
### DON'Ts ❌
- **Don't commit directly to main** - Always use PR workflow
- **Don't use generic branch names** - Avoid "fix", "update", "changes"
- **Don't let branches go stale** - Merge or close unused branches
- **Don't ignore conflicts** - Resolve properly, don't force
- **Don't skip testing** - Every merge should be tested
- **Don't merge your own PRs** - Always get peer review
### Naming Conventions
```bash
# Good branch names:
feature/backend-user-authentication
feature/frontend-application-sidebar
feature/ai-claude-integration
feature/database-migration-v2
hotfix/security-jwt-validation
release/v1.0.0-mvp
# Bad branch names:
fix-stuff
john-updates
temporary
new-feature
test-branch
```
---
## 🔧 Git Configuration
### Recommended Git Settings
```bash
# Set up git aliases for common operations
git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.st status
git config --global alias.unstage 'reset HEAD --'
git config --global alias.last 'log -1 HEAD'
git config --global alias.visual '!gitk'
# Better log formatting
git config --global alias.lg "log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit"
# Set up pull to rebase by default (cleaner history)
git config --global pull.rebase true
# Set up push to current branch only
git config --global push.default current
```
### Team .gitignore
```gitignore
# Environment files
.env
.env.local
.env.*.local
# IDE files
.vscode/
.idea/
*.swp
*.swo
# OS files
.DS_Store
Thumbs.db
# Docker
.dockerignore
# Python
__pycache__/
*.pyc
*.pyo
*.pyd
.Python
env/
venv/
ENV/
# Node modules (if using)
node_modules/
# Logs
*.log
logs/
# Database
*.db
*.sqlite
# Temporary files
*.tmp
*.temp
temp/
tmp/
# Coverage reports
htmlcov/
.coverage
.coverage.*
# Test outputs
.pytest_cache/
.tox/
```
---
## 📚 Integration with Development Documents
### Relationship to Other Documents
- **Development Setup:** Clone from correct branch, set up environment
- **API Specification:** Feature branches should implement specific endpoints
- **Database Design:** Schema changes require migration planning
- **Testing Strategy:** All branches must pass defined test suites
### 8-Week MVP Timeline Integration
```
Week 1-2: Foundation
├── feature/docker-development-setup
├── feature/database-initial-schema
└── feature/backend-project-structure
Week 3-4: Core Features
├── feature/backend-user-authentication
├── feature/frontend-auth-components
└── feature/backend-application-crud
Week 5-6: AI Integration
├── feature/ai-claude-integration
├── feature/ai-research-agent
└── feature/ai-resume-optimizer
Week 7-8: Polish & Release
├── feature/frontend-document-editor
├── feature/error-handling-improvements
└── release/v1.0.0-mvp
```
---
*This Git branching strategy ensures clean, maintainable code while supporting parallel development and safe deployments for the JobForge MVP. The strategy scales from MVP development to future SaaS features.*

View File

@@ -1,833 +0,0 @@
# JobForge - Architecture Guide
**Version:** 1.0.0
**Status:** Production-Ready Implementation
**Date:** July 2025
**Target Market:** Canadian Job Market Applications
**Tagline:** "Forge Your Path to Success"
---
## 📋 Executive Summary
### Project Vision
JobForge is a comprehensive, AI-powered job application management system that streamlines the entire application process through intelligent automation, multi-resume optimization, and authentic voice preservation for professional job seekers in the Canadian market.
### Core Objectives
- **Workflow Automation**: 3-phase intelligent application pipeline (Research → Resume → Cover Letter)
- **Multi-Resume Intelligence**: Leverage multiple resume versions as focused expertise lenses
- **Authentic Voice Preservation**: Maintain candidate's proven successful writing patterns
- **Canadian Market Focus**: Optimize for Canadian business culture and application standards
- **Local File Management**: Complete control over sensitive career documents
- **Scalable Architecture**: Support for high-volume job application campaigns
### Business Value Proposition
- **40% Time Reduction**: Automated research and document generation
- **Higher Success Rates**: Strategic positioning based on comprehensive analysis
- **Consistent Quality**: Standardized excellence across all applications
- **Document Security**: Local storage with full user control
- **Career Intelligence**: Build knowledge base from successful applications
---
## 🏗️ High-Level Architecture
### System Overview
```mermaid
graph TB
subgraph "User Interface Layer"
A[Streamlit Web UI]
B[Configuration Panel]
C[File Management UI]
D[Workflow Interface]
end
subgraph "Application Core"
E[Application Engine]
F[Phase Orchestrator]
G[State Manager]
H[File Controller]
end
subgraph "AI Processing Layer"
I[Research Agent]
J[Resume Optimizer]
K[Cover Letter Generator]
L[Claude API Client]
end
subgraph "Data Management"
M[Resume Repository]
N[Reference Database]
O[Application Store]
P[Status Tracker]
end
subgraph "Storage Layer"
Q[Local File System]
R[Project Structure]
S[Document Templates]
end
subgraph "External Services"
T[Claude AI API]
U[Web Search APIs]
V[Company Intelligence]
end
A --> E
B --> H
C --> H
D --> F
E --> I
E --> J
E --> K
F --> G
I --> L
J --> L
K --> L
L --> T
M --> Q
N --> Q
O --> Q
P --> Q
H --> R
I --> U
I --> V
```
### Architecture Principles
#### **1. Domain-Driven Design**
- Clear separation between job application domain logic and technical infrastructure
- Rich domain models representing real-world career management concepts
- Business rules encapsulated within domain entities
#### **2. Event-Driven Workflow**
- Each phase triggers the next through well-defined events
- State transitions logged for auditability and recovery
- Asynchronous processing with real-time UI updates
#### **3. Multi-Source Intelligence**
- Resume portfolio treated as complementary expertise views
- Reference database provides voice pattern templates
- Company research aggregated from multiple sources
#### **4. Security-First Design**
- All sensitive career data stored locally
- No cloud storage of personal information
- API keys managed through secure environment variables
---
## 🔧 Core Components
### **Application Engine**
```python
class JobApplicationEngine:
"""Central orchestrator for the entire application workflow"""
def __init__(self, config: EngineConfig, file_manager: FileManager):
self.config = config
self.file_manager = file_manager
self.phase_orchestrator = PhaseOrchestrator()
self.state_manager = StateManager()
# Core workflow methods
def create_application(self, job_data: JobData) -> Application
def execute_research_phase(self, app_id: str) -> ResearchReport
def optimize_resume(self, app_id: str, research: ResearchReport) -> OptimizedResume
def generate_cover_letter(self, app_id: str, context: ApplicationContext) -> CoverLetter
# Management operations
def list_applications(self, status_filter: str = None) -> List[Application]
def update_application_status(self, app_id: str, status: ApplicationStatus) -> None
def export_application(self, app_id: str, format: ExportFormat) -> str
```
**Responsibilities:**
- Coordinate all application lifecycle operations
- Manage state transitions between phases
- Integrate with AI processing agents
- Handle file system operations through delegates
### **Phase Orchestrator**
```python
class PhaseOrchestrator:
"""Manages the 3-phase workflow execution and state transitions"""
class Phases(Enum):
INPUT = "input"
RESEARCH = "research"
RESUME = "resume"
COVER_LETTER = "cover_letter"
COMPLETE = "complete"
def execute_phase(self, phase: Phases, context: PhaseContext) -> PhaseResult
def can_advance_to(self, target_phase: Phases, current_state: ApplicationState) -> bool
def get_phase_requirements(self, phase: Phases) -> List[Requirement]
# Phase-specific execution
async def execute_research(self, job_data: JobData, resume_portfolio: List[Resume]) -> ResearchReport
async def execute_resume_optimization(self, research: ResearchReport, portfolio: ResumePortfolio) -> OptimizedResume
async def execute_cover_letter_generation(self, context: ApplicationContext) -> CoverLetter
```
**Design Features:**
- State machine implementation for workflow control
- Async execution with progress callbacks
- Dependency validation between phases
- Rollback capability for failed phases
### **AI Processing Agents**
#### **Research Agent**
```python
class ResearchAgent:
"""Phase 1: Comprehensive job description analysis and strategic positioning"""
def __init__(self, claude_client: ClaudeAPIClient, web_search: WebSearchClient):
self.claude = claude_client
self.web_search = web_search
async def analyze_job_description(self, job_desc: str) -> JobAnalysis:
"""Extract and categorize job requirements, company info, and keywords"""
async def assess_candidate_fit(self, job_analysis: JobAnalysis, resume_portfolio: ResumePortfolio) -> FitAssessment:
"""Multi-resume skills assessment with transferability analysis"""
async def research_company_intelligence(self, company_name: str) -> CompanyIntelligence:
"""Gather company culture, recent news, and strategic insights"""
async def generate_strategic_positioning(self, context: ResearchContext) -> StrategicPositioning:
"""Determine optimal candidate positioning and competitive advantages"""
```
#### **Resume Optimizer**
```python
class ResumeOptimizer:
"""Phase 2: Multi-resume synthesis and strategic optimization"""
def __init__(self, claude_client: ClaudeAPIClient, config: OptimizationConfig):
self.claude = claude_client
self.config = config # 600-word limit, formatting rules, etc.
async def synthesize_resume_portfolio(self, portfolio: ResumePortfolio, research: ResearchReport) -> SynthesizedContent:
"""Merge insights from multiple resume versions"""
async def optimize_for_job(self, content: SynthesizedContent, positioning: StrategicPositioning) -> OptimizedResume:
"""Create targeted resume within word limits"""
def validate_optimization(self, resume: OptimizedResume) -> OptimizationReport:
"""Ensure word count, keyword density, and strategic alignment"""
```
#### **Cover Letter Generator**
```python
class CoverLetterGenerator:
"""Phase 3: Authentic voice preservation and company-specific customization"""
def __init__(self, claude_client: ClaudeAPIClient, reference_db: ReferenceDatabase):
self.claude = claude_client
self.reference_db = reference_db
async def analyze_voice_patterns(self, selected_references: List[CoverLetterReference]) -> VoiceProfile:
"""Extract authentic writing style, tone, and structural patterns"""
async def generate_cover_letter(self, context: CoverLetterContext, voice_profile: VoiceProfile) -> CoverLetter:
"""Create authentic cover letter using proven voice patterns"""
def validate_authenticity(self, cover_letter: CoverLetter, voice_profile: VoiceProfile) -> AuthenticityScore:
"""Ensure generated content matches authentic voice patterns"""
```
### **Data Models**
```python
class Application(BaseModel):
"""Core application entity with full lifecycle management"""
id: str
name: str # company_role_YYYY_MM_DD format
status: ApplicationStatus
created_at: datetime
updated_at: datetime
# Job information
job_data: JobData
company_info: CompanyInfo
# Phase results
research_report: Optional[ResearchReport] = None
optimized_resume: Optional[OptimizedResume] = None
cover_letter: Optional[CoverLetter] = None
# Metadata
priority_level: PriorityLevel
application_deadline: Optional[date] = None
# Business logic
@property
def completion_percentage(self) -> float
def can_advance_to_phase(self, phase: PhaseOrchestrator.Phases) -> bool
def export_to_format(self, format: ExportFormat) -> str
class ResumePortfolio(BaseModel):
"""Collection of focused resume versions representing different expertise areas"""
resumes: List[Resume]
def get_technical_focused(self) -> List[Resume]
def get_management_focused(self) -> List[Resume]
def get_industry_specific(self, industry: str) -> List[Resume]
def synthesize_skills(self) -> SkillMatrix
class JobData(BaseModel):
"""Comprehensive job posting information"""
job_url: Optional[str] = None
job_description: str
company_name: str
role_title: str
location: str
priority_level: PriorityLevel
how_found: str
application_deadline: Optional[date] = None
# Additional context
specific_aspects: Optional[str] = None
company_insights: Optional[str] = None
special_considerations: Optional[str] = None
```
---
## 📊 Data Flow Architecture
### Application Creation Flow
```mermaid
sequenceDiagram
participant UI as Streamlit UI
participant Engine as Application Engine
participant FileManager as File Manager
participant Storage as Local Storage
UI->>Engine: create_application(job_data)
Engine->>Engine: validate_job_data()
Engine->>Engine: generate_application_name()
Engine->>FileManager: create_application_folder()
FileManager->>Storage: mkdir(company_role_date)
FileManager->>Storage: save(user_inputs.json)
FileManager->>Storage: save(original_job_description.md)
FileManager->>Storage: save(application_status.json)
Engine-->>UI: Application(id, status=created)
```
### 3-Phase Workflow Execution
```mermaid
flowchart TD
A[Application Created] --> B[Phase 1: Research]
B --> C{Research Complete?}
C -->|Yes| D[Phase 2: Resume]
C -->|No| E[Research Error]
D --> F{Resume Complete?}
F -->|Yes| G[Phase 3: Cover Letter]
F -->|No| H[Resume Error]
G --> I{Cover Letter Complete?}
I -->|Yes| J[Application Complete]
I -->|No| K[Cover Letter Error]
E --> L[Log Error & Retry]
H --> L
K --> L
L --> M[Manual Intervention]
subgraph "Phase 1 Details"
B1[Job Analysis]
B2[Multi-Resume Assessment]
B3[Company Research]
B4[Strategic Positioning]
B --> B1 --> B2 --> B3 --> B4 --> C
end
subgraph "Phase 2 Details"
D1[Portfolio Synthesis]
D2[Content Optimization]
D3[Word Count Management]
D4[Strategic Alignment]
D --> D1 --> D2 --> D3 --> D4 --> F
end
subgraph "Phase 3 Details"
G1[Voice Analysis]
G2[Content Generation]
G3[Authenticity Validation]
G4[Company Customization]
G --> G1 --> G2 --> G3 --> G4 --> I
end
```
### File Management Architecture
```mermaid
graph TB
subgraph "Project Root"
A[job-application-engine/]
end
subgraph "User Data"
B[user_data/resumes/]
C[user_data/cover_letter_references/selected/]
D[user_data/cover_letter_references/other/]
end
subgraph "Applications"
E[applications/company_role_date/]
F[├── original_job_description.md]
G[├── research_report.md]
H[├── optimized_resume.md]
I[├── cover_letter.md]
J[├── user_inputs.json]
K[└── application_status.json]
end
subgraph "Configuration"
L[config/]
M[├── engine_config.yaml]
N[├── claude_api_config.json]
O[└── templates/]
end
A --> B
A --> C
A --> D
A --> E
A --> L
E --> F
E --> G
E --> H
E --> I
E --> J
E --> K
```
---
## 🗂️ Project Structure
### Directory Layout
```
job-application-engine/
├── app.py # Streamlit main application
├── requirements.txt # Python dependencies
├── config/
│ ├── engine_config.yaml # Engine configuration
│ ├── claude_api_config.json # API configuration
│ └── templates/ # Document templates
│ ├── research_template.md
│ ├── resume_template.md
│ └── cover_letter_template.md
├── src/ # Source code
│ ├── __init__.py
│ ├── engine/ # Core engine
│ │ ├── __init__.py
│ │ ├── application_engine.py # Main engine class
│ │ ├── phase_orchestrator.py # Workflow management
│ │ └── state_manager.py # State tracking
│ ├── agents/ # AI processing agents
│ │ ├── __init__.py
│ │ ├── research_agent.py # Phase 1: Research
│ │ ├── resume_optimizer.py # Phase 2: Resume
│ │ ├── cover_letter_generator.py # Phase 3: Cover Letter
│ │ └── claude_client.py # Claude API integration
│ ├── models/ # Data models
│ │ ├── __init__.py
│ │ ├── application.py # Application entity
│ │ ├── job_data.py # Job information
│ │ ├── resume.py # Resume models
│ │ └── results.py # Phase results
│ ├── storage/ # Storage management
│ │ ├── __init__.py
│ │ ├── file_manager.py # File operations
│ │ ├── application_store.py # Application persistence
│ │ └── reference_database.py # Cover letter references
│ ├── ui/ # User interface
│ │ ├── __init__.py
│ │ ├── streamlit_app.py # Streamlit components
│ │ ├── workflow_ui.py # Workflow interface
│ │ └── file_management_ui.py # File management
│ └── utils/ # Utilities
│ ├── __init__.py
│ ├── validators.py # Input validation
│ ├── formatters.py # Output formatting
│ └── helpers.py # Helper functions
├── user_data/ # User's career documents
│ ├── resumes/
│ │ ├── resume_complete.md
│ │ ├── resume_technical.md
│ │ └── resume_management.md
│ └── cover_letter_references/
│ ├── selected/ # Tagged as references
│ │ ├── cover_letter_tech.md
│ │ └── cover_letter_consulting.md
│ └── other/ # Available references
│ └── cover_letter_finance.md
├── applications/ # Generated applications
│ ├── dillon_consulting_data_analyst_2025_07_22/
│ └── shopify_senior_developer_2025_07_23/
├── tests/ # Test suite
│ ├── unit/
│ ├── integration/
│ └── fixtures/
├── docs/ # Documentation
│ ├── architecture.md
│ ├── user_guide.md
│ └── api_reference.md
└── scripts/ # Utility scripts
├── setup_project.py
└── backup_applications.py
```
### Module Responsibilities
| Module | Purpose | Key Classes | Dependencies |
|--------|---------|-------------|--------------|
| `engine/` | Core workflow orchestration | `ApplicationEngine`, `PhaseOrchestrator` | `agents/`, `models/` |
| `agents/` | AI processing logic | `ResearchAgent`, `ResumeOptimizer`, `CoverLetterGenerator` | `models/`, `utils/` |
| `models/` | Data structures and business logic | `Application`, `JobData`, `Resume`, `ResumePortfolio` | `pydantic` |
| `storage/` | File system operations | `FileManager`, `ApplicationStore`, `ReferenceDatabase` | `pathlib`, `json` |
| `ui/` | User interface components | `StreamlitApp`, `WorkflowUI`, `FileManagementUI` | `streamlit` |
| `utils/` | Cross-cutting concerns | `Validators`, `Formatters`, `Helpers` | Various |
---
## 🔌 Extensibility Architecture
### Plugin System Design
```python
class EnginePlugin(ABC):
"""Base plugin interface for extending engine functionality"""
def before_phase_execution(self, phase: PhaseOrchestrator.Phases, context: PhaseContext) -> PhaseContext:
"""Modify context before phase execution"""
return context
def after_phase_completion(self, phase: PhaseOrchestrator.Phases, result: PhaseResult) -> PhaseResult:
"""Process result after phase completion"""
return result
def on_application_created(self, application: Application) -> None:
"""React to new application creation"""
pass
class MetricsPlugin(EnginePlugin):
"""Collect application performance metrics"""
def after_phase_completion(self, phase: PhaseOrchestrator.Phases, result: PhaseResult) -> PhaseResult:
self.record_phase_metrics(phase, result.execution_time, result.success)
return result
class BackupPlugin(EnginePlugin):
"""Automatic backup of application data"""
def on_application_created(self, application: Application) -> None:
self.backup_application(application)
```
### Configuration System
```python
@dataclass
class EngineConfig:
# Core settings
claude_api_key: str
base_output_directory: str = "./applications"
max_concurrent_phases: int = 1
# AI processing
research_model: str = "claude-sonnet-4-20250514"
resume_word_limit: int = 600
cover_letter_word_range: tuple = (350, 450)
# File management
auto_backup_enabled: bool = True
backup_retention_days: int = 30
# UI preferences
streamlit_theme: str = "light"
show_advanced_options: bool = False
# Extensions
enabled_plugins: List[str] = field(default_factory=list)
@classmethod
def from_file(cls, config_path: str) -> 'EngineConfig':
"""Load configuration from YAML file"""
def validate(self) -> List[ValidationError]:
"""Validate configuration completeness and correctness"""
```
### Multi-Resume Strategy Pattern
```python
class ResumeSelectionStrategy(ABC):
"""Strategy for selecting optimal resume content for specific jobs"""
def select_primary_resume(self, portfolio: ResumePortfolio, job_analysis: JobAnalysis) -> Resume:
"""Select the most relevant primary resume"""
def get_supplementary_content(self, portfolio: ResumePortfolio, primary: Resume) -> List[ResumeSection]:
"""Extract additional content from other resume versions"""
class TechnicalRoleStrategy(ResumeSelectionStrategy):
"""Optimize resume selection for technical positions"""
class ManagementRoleStrategy(ResumeSelectionStrategy):
"""Optimize resume selection for management positions"""
class ConsultingRoleStrategy(ResumeSelectionStrategy):
"""Optimize resume selection for consulting positions"""
```
---
## 🚀 Development Phases
### **Phase 1: MVP Foundation (Completed)**
- ✅ Streamlit UI with file management
- ✅ 3-phase workflow execution
- ✅ Claude API integration
- ✅ Local file storage system
- ✅ Multi-resume processing
- ✅ Cover letter reference system
- ✅ Application status tracking
### **Phase 2: Enhanced Intelligence (Next)**
- 🔄 Advanced company research integration
- 🔄 Improved multi-resume synthesis algorithms
- 🔄 Voice pattern analysis enhancement
- 🔄 Strategic positioning optimization
- 🔄 Application performance analytics
- 🔄 Export functionality (PDF, Word, etc.)
### **Phase 3: Automation & Scale (Future)**
- 📋 Batch application processing
- 📋 Template management system
- 📋 Application campaign planning
- 📋 Success rate tracking and optimization
- 📋 Integration with job boards APIs
- 📋 Automated application submission
### **Phase 4: Enterprise Features (Future)**
- 📋 Multi-user support with role-based access
- 📋 Team collaboration features
- 📋 Advanced analytics and reporting
- 📋 Custom workflow templates
- 📋 Integration with HR systems
- 📋 White-label deployment options
---
## 🎯 Technical Specifications
### **Technology Stack**
| Component | Technology | Version | Rationale |
|-----------|------------|---------|-----------|
| **UI Framework** | Streamlit | 1.28.1 | Rapid prototyping, built-in components, Python-native |
| **HTTP Client** | requests | 2.31.0 | Reliable, well-documented, synchronous operations |
| **Data Validation** | Pydantic | 2.0+ | Type safety, automatic validation, great developer experience |
| **File Operations** | pathlib | Built-in | Modern, object-oriented path handling |
| **Configuration** | PyYAML | 6.0+ | Human-readable configuration files |
| **CLI Future** | Click + Rich | Latest | User-friendly CLI with beautiful output |
| **Testing** | pytest | 7.0+ | Comprehensive testing framework |
| **Documentation** | MkDocs | 1.5+ | Beautiful, searchable documentation |
### **Performance Requirements**
| Metric | Target | Measurement Method |
|--------|--------|-------------------|
| **Application Creation** | <2 seconds | Time from form submission to folder creation |
| **Phase 1 Research** | <30 seconds | Claude API response + processing time |
| **Phase 2 Resume** | <20 seconds | Multi-resume synthesis + optimization |
| **Phase 3 Cover Letter** | <15 seconds | Voice analysis + content generation |
| **File Operations** | <1 second | Local file read/write operations |
| **UI Responsiveness** | <500ms | Streamlit component render time |
### **Quality Standards**
#### **Code Quality Metrics**
- **Type Coverage**: 90%+ type hints on all public APIs
- **Test Coverage**: 85%+ line coverage maintained
- **Documentation**: All public methods and classes documented
- **Code Style**: Black formatter + isort + flake8 compliance
- **Complexity**: Max cyclomatic complexity of 10 per function
#### **Security Requirements**
- No API keys hardcoded in source code
- Environment variable management for secrets
- Input sanitization for all user data
- Safe file path handling to prevent directory traversal
- Regular dependency vulnerability scanning
#### **Reliability Standards**
- Graceful handling of API failures with user-friendly messages
- Automatic retry logic for transient failures
- Data integrity validation after file operations
- Rollback capability for failed workflow phases
- Comprehensive error logging with context
---
## 📈 Monitoring & Analytics
### **Application Metrics**
```python
class ApplicationMetrics:
"""Track application performance and success rates"""
def record_application_created(self, app: Application) -> None
def record_phase_completion(self, app_id: str, phase: PhaseOrchestrator.Phases, duration: float) -> None
def record_application_submitted(self, app_id: str) -> None
def record_application_response(self, app_id: str, response_type: ResponseType) -> None
# Analytics queries
def get_success_rate(self, date_range: DateRange) -> float
def get_average_completion_time(self, phase: PhaseOrchestrator.Phases) -> float
def get_most_effective_strategies(self) -> List[StrategyMetric]
```
### **Performance Monitoring**
```python
class PerformanceMonitor:
"""Monitor system performance and resource usage"""
def track_api_response_times(self) -> Dict[str, float]
def monitor_file_system_usage(self) -> StorageMetrics
def track_memory_usage(self) -> MemoryMetrics
def generate_performance_report(self) -> PerformanceReport
```
### **User Experience Analytics**
- Workflow completion rates by phase
- Most common user pain points
- Feature usage statistics
- Error frequency and resolution rates
- Time-to-value metrics
---
## 🔒 Security Architecture
### **Data Protection Strategy**
- **Local-First**: All sensitive career data stored locally
- **API Key Management**: Secure environment variable handling
- **Input Validation**: Comprehensive sanitization of all user inputs
- **File System Security**: Restricted file access patterns
- **Audit Trail**: Complete logging of all file operations
### **Privacy Considerations**
- No personal data transmitted to third parties (except Claude API for processing)
- User control over all data retention and deletion
- Transparent data usage policies
- Optional anonymization for analytics
---
## 🎨 User Experience Design
### **Design Principles**
1. **Simplicity First**: Complex AI power hidden behind simple interfaces
2. **Progress Transparency**: Clear feedback on all processing steps
3. **Error Recovery**: Graceful handling with actionable next steps
4. **Customization**: Flexible configuration without overwhelming options
5. **Mobile Friendly**: Responsive design for various screen sizes
### **User Journey Optimization**
```mermaid
journey
title Job Application Creation Journey
section Setup
Configure folders: 5: User
Upload resumes: 4: User
Tag references: 3: User
section Application
Paste job description: 5: User
Review auto-generated name: 4: User
Start research phase: 5: User
section AI Processing
Wait for research: 3: User, AI
Review research results: 4: User
Approve resume optimization: 5: User, AI
Review cover letter: 5: User, AI
section Completion
Make final edits: 4: User
Export documents: 5: User
Mark as applied: 5: User
```
---
## 📚 Documentation Strategy
### **Documentation Hierarchy**
1. **Architecture Guide** (This Document) - Technical architecture and design decisions
2. **User Guide** - Step-by-step usage instructions with screenshots
3. **API Reference** - Detailed API documentation for extensions
4. **Developer Guide** - Setup, contribution guidelines, and development practices
5. **Troubleshooting Guide** - Common issues and solutions
### **Documentation Standards**
- All public APIs documented with docstrings
- Code examples for all major features
- Screenshots for UI components
- Video tutorials for complex workflows
- Regular documentation updates with each release
---
## 🚀 Deployment & Distribution
### **Distribution Strategy**
- **GitHub Repository**: Open source with comprehensive documentation
- **PyPI Package**: Easy installation via pip
- **Docker Container**: Containerized deployment option
- **Executable Bundle**: Standalone executable for non-technical users
### **Deployment Options**
```python
# Option 1: Direct Python execution
python -m streamlit run app.py
# Option 2: Docker deployment
docker run -p 8501:8501 job-application-engine
# Option 3: Heroku deployment
git push heroku main
# Option 4: Local installation
pip install job-application-engine
job-app-engine --config myconfig.yaml
```
---
## 🔮 Future Enhancements
### **Advanced AI Features**
- **Multi-Model Support**: Integration with multiple AI providers
- **Specialized Models**: Domain-specific fine-tuned models
- **Continuous Learning**: System learns from successful applications
- **Predictive Analytics**: Success probability estimation
### **Integration Ecosystem**
- **LinkedIn Integration**: Auto-import job postings and company data
- **ATS Integration**: Direct submission to Applicant Tracking Systems
- **CRM Integration**: Track application pipeline in existing CRM
- **Calendar Integration**: Application deadline management
### **Enterprise Features**
- **Multi-Tenant Architecture**: Support multiple users/organizations
- **Role-Based Access Control**: Team collaboration with permission levels
- **Workflow Customization**: Industry-specific workflow templates
- **Advanced Analytics**: Success attribution and optimization recommendations
---
*This architecture guide serves as the authoritative reference for the Job Application Engine system design and implementation. For implementation details, see the source code and technical documentation.*
*For questions or contributions, please refer to the project repository and contribution guidelines.*

View File

@@ -0,0 +1,714 @@
# JobForge MVP - Core Job Application Module
**Version:** 1.0.0 MVP
**Status:** Development Phase 1
**Date:** July 2025
**Scope:** Core job application workflow with essential features
**Target:** Personal use for concept validation and testing
---
## 📋 MVP Scope & Objectives
### Core Functionality
- **User Authentication**: Basic login/signup system
- **Job Application Creation**: Add new applications with job description and URL
- **3-Phase AI Workflow**: Research → Resume → Cover Letter generation
- **Document Management**: View and edit generated documents
- **Navigation Interface**: Sidebar + top bar for seamless workflow navigation
### MVP Goals
- Validate core AI workflow effectiveness
- Test user experience with Dash + Mantine interface
- Prove concept with personal job application journey
- Establish foundation for Phase 2 (post-application features)
---
## 🏗️ MVP Architecture
### System Overview
```mermaid
graph TB
subgraph "Frontend (Dash + Mantine)"
UI[Main UI]
SIDEBAR[Application Sidebar]
TOPBAR[Navigation Top Bar]
EDITOR[Document Editor]
end
subgraph "Backend API (FastAPI)"
AUTH[Authentication]
APP[Application Service]
AI[AI Orchestrator]
DOC[Document Service]
end
subgraph "AI Agents"
RESEARCH[Research Agent]
RESUME[Resume Optimizer]
COVER[Cover Letter Generator]
end
subgraph "Data Storage"
PG[(PostgreSQL + pgvector)]
FILES[Document Storage]
end
subgraph "External AI"
CLAUDE[Claude AI]
OPENAI[OpenAI Embeddings]
end
UI --> AUTH
UI --> APP
UI --> DOC
APP --> AI
AI --> RESEARCH
AI --> RESUME
AI --> COVER
APP --> PG
DOC --> FILES
RESEARCH --> CLAUDE
RESUME --> CLAUDE
COVER --> CLAUDE
AI --> OPENAI
```
---
## 🔐 User Authentication (MVP)
### Simple Authentication System
```python
class AuthenticationService:
"""Basic user authentication for MVP"""
async def register_user(self, email: str, password: str, name: str) -> User:
"""Register new user account"""
async def authenticate_user(self, email: str, password: str) -> AuthResult:
"""Login user and return JWT token"""
async def verify_token(self, token: str) -> User:
"""Verify JWT token and return user"""
async def logout_user(self, user_id: str) -> None:
"""Logout user session"""
```
### Database Schema (Users)
```sql
-- Basic user table for MVP
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
full_name VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Enable basic row level security
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
```
---
## 📋 Job Application Module
### Core Application Workflow
```python
class ApplicationService:
"""Core job application management"""
async def create_application(self, user_id: str, job_data: JobApplicationData) -> Application:
"""Create new job application with job description and URL"""
async def get_user_applications(self, user_id: str) -> List[Application]:
"""Get all applications for user"""
async def get_application(self, user_id: str, app_id: str) -> Application:
"""Get specific application with documents"""
async def update_application_status(self, user_id: str, app_id: str, status: str) -> None:
"""Update application status through workflow phases"""
```
### Application Data Model
```python
class JobApplicationData(BaseModel):
"""Input data for creating new application"""
job_url: Optional[str] = None
job_description: str
company_name: str
role_title: str
location: Optional[str] = None
priority_level: str = "medium"
additional_context: Optional[str] = None
class Application(BaseModel):
"""Core application entity"""
id: str
user_id: str
name: str # Auto-generated: company_role_YYYY_MM_DD
company_name: str
role_title: str
job_url: Optional[str]
job_description: str
status: ApplicationStatus # draft, research_complete, resume_ready, cover_letter_ready
# Phase completion tracking
research_completed: bool = False
resume_optimized: bool = False
cover_letter_generated: bool = False
created_at: datetime
updated_at: datetime
```
### Database Schema (Applications)
```sql
CREATE TABLE applications (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
company_name VARCHAR(255) NOT NULL,
role_title VARCHAR(255) NOT NULL,
job_url TEXT,
job_description TEXT NOT NULL,
location VARCHAR(255),
priority_level VARCHAR(20) DEFAULT 'medium',
status VARCHAR(50) DEFAULT 'draft',
-- Phase tracking
research_completed BOOLEAN DEFAULT FALSE,
resume_optimized BOOLEAN DEFAULT FALSE,
cover_letter_generated BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
ALTER TABLE applications ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_applications_policy ON applications
FOR ALL TO application_user
USING (user_id = current_setting('app.current_user_id')::UUID);
```
---
## 🤖 AI Processing Workflow
### 3-Phase AI Orchestrator
```python
class AIOrchestrator:
"""Orchestrates the 3-phase AI workflow"""
def __init__(self, research_agent, resume_optimizer, cover_letter_generator):
self.research_agent = research_agent
self.resume_optimizer = resume_optimizer
self.cover_letter_generator = cover_letter_generator
async def execute_research_phase(self, application_id: str) -> ResearchReport:
"""Phase 1: Job analysis and company research"""
async def execute_resume_optimization(self, application_id: str) -> OptimizedResume:
"""Phase 2: Resume optimization based on research"""
async def execute_cover_letter_generation(self, application_id: str, user_context: str) -> CoverLetter:
"""Phase 3: Cover letter generation with user inputs"""
```
### Phase 1: Research Agent
```python
class ResearchAgent:
"""Job description analysis and company research"""
async def analyze_job_description(self, job_desc: str) -> JobAnalysis:
"""Extract requirements, skills, and key information"""
async def research_company_info(self, company_name: str) -> CompanyIntelligence:
"""Basic company research and insights"""
async def generate_strategic_positioning(self, job_analysis: JobAnalysis) -> StrategicPositioning:
"""Determine optimal candidate positioning"""
async def create_research_report(self, job_desc: str, company_name: str) -> ResearchReport:
"""Complete research phase output"""
```
### Phase 2: Resume Optimizer
```python
class ResumeOptimizer:
"""Resume optimization based on job requirements"""
async def analyze_resume_portfolio(self, user_id: str) -> ResumePortfolio:
"""Load and analyze user's resume library"""
async def optimize_resume_for_job(self, portfolio: ResumePortfolio, research: ResearchReport) -> OptimizedResume:
"""Create job-specific optimized resume"""
async def validate_resume_optimization(self, resume: OptimizedResume) -> ValidationReport:
"""Ensure resume meets requirements and constraints"""
```
### Phase 3: Cover Letter Generator
```python
class CoverLetterGenerator:
"""Cover letter generation with user context"""
async def analyze_writing_style(self, user_id: str) -> WritingStyle:
"""Analyze user's writing patterns from reference documents"""
async def generate_cover_letter(self, research: ResearchReport, resume: OptimizedResume,
user_context: str, writing_style: WritingStyle) -> CoverLetter:
"""Generate personalized cover letter"""
async def validate_cover_letter(self, cover_letter: CoverLetter) -> ValidationReport:
"""Ensure cover letter quality and authenticity"""
```
---
## 📄 Document Management
### Document Storage & Retrieval
```python
class DocumentService:
"""Handle document storage and retrieval"""
async def save_document(self, user_id: str, app_id: str, doc_type: str, content: str) -> None:
"""Save generated document (research, resume, cover letter)"""
async def get_document(self, user_id: str, app_id: str, doc_type: str) -> Document:
"""Retrieve document for viewing/editing"""
async def update_document(self, user_id: str, app_id: str, doc_type: str, content: str) -> None:
"""Update document after user editing"""
async def get_all_documents(self, user_id: str, app_id: str) -> ApplicationDocuments:
"""Get all documents for an application"""
```
### Document Models
```python
class Document(BaseModel):
"""Base document model"""
id: str
application_id: str
document_type: str # research_report, optimized_resume, cover_letter
content: str
created_at: datetime
updated_at: datetime
class ApplicationDocuments(BaseModel):
"""All documents for an application"""
research_report: Optional[Document] = None
optimized_resume: Optional[Document] = None
cover_letter: Optional[Document] = None
```
### Database Schema (Documents)
```sql
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
application_id UUID REFERENCES applications(id) ON DELETE CASCADE,
document_type VARCHAR(50) NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_documents_policy ON documents
FOR ALL TO application_user
USING (
application_id IN (
SELECT id FROM applications
WHERE user_id = current_setting('app.current_user_id')::UUID
)
);
```
---
## 🎨 Frontend Interface (Dash + Mantine)
### Main Application Layout
```python
class JobForgeApp:
"""Main Dash application layout"""
def create_layout(self):
return dmc.MantineProvider([
dmc.AppShell([
dmc.Navbar([
ApplicationSidebar()
], width={"base": 300}),
dmc.Main([
ApplicationTopBar(),
MainContent()
])
])
])
```
### Application Sidebar
```python
class ApplicationSidebar:
"""Sidebar with applications list and navigation"""
def render(self, user_id: str):
return dmc.Stack([
# New Application Button
dmc.Button(
" New Application",
id="new-app-btn",
fullWidth=True,
variant="filled"
),
# Applications List
dmc.Title("Applications", order=4),
dmc.ScrollArea([
ApplicationCard(app) for app in self.get_user_applications(user_id)
]),
# Resume Library Section
dmc.Divider(),
dmc.Title("Resume Library", order=4),
ResumeLibrarySection()
])
class ApplicationCard:
"""Individual application card in sidebar"""
def render(self, application: Application):
return dmc.Card([
dmc.Group([
dmc.Text(application.company_name, weight=600),
StatusBadge(application.status)
]),
dmc.Text(application.role_title, size="sm", color="dimmed"),
dmc.Text(application.created_at.strftime("%Y-%m-%d"), size="xs")
], id=f"app-card-{application.id}")
```
### Application Top Bar Navigation
```python
class ApplicationTopBar:
"""Top navigation bar for application phases"""
def render(self, application: Application):
return dmc.Group([
# Phase Navigation Buttons
PhaseButton("Research", "research", application.research_completed),
PhaseButton("Resume", "resume", application.resume_optimized),
PhaseButton("Cover Letter", "cover_letter", application.cover_letter_generated),
# Application Actions
dmc.Spacer(),
dmc.ActionIcon(
DashIconify(icon="tabler:settings"),
id="app-settings-btn"
)
])
class PhaseButton:
"""Navigation button for each phase"""
def render(self, label: str, phase: str, completed: bool):
icon = "tabler:check" if completed else "tabler:clock"
color = "green" if completed else "gray"
return dmc.Button([
DashIconify(icon=icon),
dmc.Text(label, ml="xs")
],
variant="subtle" if not completed else "filled",
color=color,
id=f"phase-{phase}-btn"
)
```
### Document Editor Interface
```python
class DocumentEditor:
"""Markdown document editor with preview"""
def render(self, document: Document):
return dmc.Container([
dmc.Grid([
# Editor Column
dmc.Col([
dmc.Title(f"Edit {document.document_type.replace('_', ' ').title()}", order=3),
dmc.Textarea(
value=document.content,
placeholder="Document content...",
minRows=20,
autosize=True,
id=f"editor-{document.document_type}"
),
dmc.Group([
dmc.Button("Save Changes", id="save-btn"),
dmc.Button("Cancel", variant="outline", id="cancel-btn")
])
], span=6),
# Preview Column
dmc.Col([
dmc.Title("Preview", order=3),
dmc.Container([
dcc.Markdown(document.content, id="preview-content")
], style={"border": "1px solid #e0e0e0", "padding": "1rem", "minHeight": "500px"})
], span=6)
])
])
```
---
## 🗄️ MVP Database Schema
### Complete Database Setup
```sql
-- Enable required extensions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS vector;
-- Users table
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
full_name VARCHAR(255) NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Applications table
CREATE TABLE applications (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
company_name VARCHAR(255) NOT NULL,
role_title VARCHAR(255) NOT NULL,
job_url TEXT,
job_description TEXT NOT NULL,
location VARCHAR(255),
priority_level VARCHAR(20) DEFAULT 'medium',
status VARCHAR(50) DEFAULT 'draft',
research_completed BOOLEAN DEFAULT FALSE,
resume_optimized BOOLEAN DEFAULT FALSE,
cover_letter_generated BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Documents table
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
application_id UUID REFERENCES applications(id) ON DELETE CASCADE,
document_type VARCHAR(50) NOT NULL,
content TEXT NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW(),
UNIQUE(application_id, document_type)
);
-- Resume library table
CREATE TABLE user_resumes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES users(id) ON DELETE CASCADE,
name VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
focus_area VARCHAR(100),
is_primary BOOLEAN DEFAULT FALSE,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Basic vector embeddings (for future enhancement)
CREATE TABLE document_embeddings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
document_id UUID REFERENCES documents(id) ON DELETE CASCADE,
embedding vector(1536),
created_at TIMESTAMP DEFAULT NOW()
);
-- Row Level Security
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
ALTER TABLE applications ENABLE ROW LEVEL SECURITY;
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
ALTER TABLE user_resumes ENABLE ROW LEVEL SECURITY;
-- Security policies
CREATE POLICY user_own_data ON applications FOR ALL USING (user_id = current_setting('app.current_user_id')::UUID);
CREATE POLICY user_own_documents ON documents FOR ALL USING (
application_id IN (SELECT id FROM applications WHERE user_id = current_setting('app.current_user_id')::UUID)
);
CREATE POLICY user_own_resumes ON user_resumes FOR ALL USING (user_id = current_setting('app.current_user_id')::UUID);
```
---
## 🚀 MVP Development Plan
### Development Phases
#### **Week 1-2: Foundation Setup**
- Docker development environment
- PostgreSQL database with basic schema
- FastAPI backend with authentication endpoints
- Basic Dash + Mantine frontend structure
#### **Week 3-4: Core Application Module**
- Application creation and listing
- Database integration with user isolation
- Basic sidebar and navigation UI
- Application status tracking
#### **Week 5-6: AI Workflow Implementation**
- Research Agent with Claude integration
- Resume Optimizer with portfolio handling
- Cover Letter Generator with user context
- Document storage and retrieval system
#### **Week 7-8: Frontend Polish & Integration**
- Document editor with markdown support
- Real-time status updates during AI processing
- Phase navigation and progress tracking
- Error handling and user feedback
### MVP Success Criteria
- ✅ User can register/login securely
- ✅ User can create job applications with description/URL
- ✅ AI generates research report automatically
- ✅ AI optimizes resume based on job requirements
- ✅ AI generates cover letter with user context
- ✅ User can view and edit all generated documents
- ✅ Smooth navigation between application phases
- ✅ Data persisted securely with user isolation
---
## 🐳 Docker Development Setup
### Development Environment
```yaml
# docker-compose.yml
version: '3.8'
services:
postgres:
image: pgvector/pgvector:pg16
environment:
POSTGRES_DB: jobforge_mvp
POSTGRES_USER: jobforge_user
POSTGRES_PASSWORD: jobforge_password
ports:
- "5432:5432"
volumes:
- postgres_data:/var/lib/postgresql/data
- ./database/init.sql:/docker-entrypoint-initdb.d/init.sql
backend:
build:
context: .
dockerfile: Dockerfile.backend
ports:
- "8000:8000"
environment:
- DATABASE_URL=postgresql+asyncpg://jobforge_user:jobforge_password@postgres:5432/jobforge_mvp
- CLAUDE_API_KEY=${CLAUDE_API_KEY}
- OPENAI_API_KEY=${OPENAI_API_KEY}
volumes:
- ./src:/app/src
depends_on:
- postgres
command: uvicorn src.backend.main:app --host 0.0.0.0 --port 8000 --reload
frontend:
build:
context: .
dockerfile: Dockerfile.frontend
ports:
- "8501:8501"
environment:
- BACKEND_URL=http://backend:8000
volumes:
- ./src/frontend:/app/src/frontend
depends_on:
- backend
command: python src/frontend/main.py
volumes:
postgres_data:
```
---
## 📁 MVP Project Structure
```
jobforge-mvp/
├── docker-compose.yml
├── Dockerfile.backend
├── Dockerfile.frontend
├── requirements-backend.txt
├── requirements-frontend.txt
├── .env.example
├── database/
│ └── init.sql
├── src/
│ ├── backend/
│ │ ├── main.py
│ │ ├── api/
│ │ │ ├── auth.py
│ │ │ ├── applications.py
│ │ │ ├── documents.py
│ │ │ └── processing.py
│ │ ├── services/
│ │ │ ├── auth_service.py
│ │ │ ├── application_service.py
│ │ │ ├── document_service.py
│ │ │ └── ai_orchestrator.py
│ │ ├── database/
│ │ │ ├── connection.py
│ │ │ └── models.py
│ │ └── models/
│ │ ├── requests.py
│ │ └── responses.py
│ ├── frontend/
│ │ ├── main.py
│ │ ├── components/
│ │ │ ├── sidebar.py
│ │ │ ├── topbar.py
│ │ │ └── editor.py
│ │ ├── pages/
│ │ │ ├── login.py
│ │ │ ├── dashboard.py
│ │ │ └── application.py
│ │ └── api_client/
│ │ └── client.py
│ ├── agents/
│ │ ├── research_agent.py
│ │ ├── resume_optimizer.py
│ │ ├── cover_letter_generator.py
│ │ └── claude_client.py
│ └── helpers/
│ ├── validators.py
│ └── formatters.py
└── user_data/
└── resumes/
```
---
*This MVP architecture focuses on delivering the core job application workflow with essential features. It establishes the foundation for Phase 2 development while providing immediate value for personal job application management and concept validation.*

View File

@@ -0,0 +1,951 @@
# JobForge MVP - Team Management Guide
**Version:** 1.0.0
**Project:** JobForge MVP Development
**Timeline:** 8 Weeks (July - August 2025)
**Team Structure:** 6 Specialized Roles
**Last Updated:** July 2025
---
## 📋 Table of Contents
1. [Team Structure & Hierarchy](#team-structure--hierarchy)
2. [Role Definitions](#role-definitions)
3. [Communication Protocols](#communication-protocols)
4. [Development Process](#development-process)
5. [Quality Standards](#quality-standards)
6. [Meeting Framework](#meeting-framework)
7. [Documentation Standards](#documentation-standards)
8. [Escalation Procedures](#escalation-procedures)
9. [Performance Metrics](#performance-metrics)
10. [Standard Templates](#standard-templates)
---
## 🏗️ Team Structure & Hierarchy
### Organizational Chart
```mermaid
graph TD
PM[Project Manager] --> ARCH[Architect & Orchestrator]
PM --> BE[Backend Team]
PM --> FE[Frontend Team]
PM --> AI[AI Engineering Team]
PM --> DO[DevOps & Integration Team]
ARCH --> BE
ARCH --> FE
ARCH --> AI
ARCH --> DO
BE --> DO
FE --> DO
AI --> BE
```
### Reporting Structure
| Level | Role | Reports To | Direct Reports |
|-------|------|------------|----------------|
| **L1** | Project Manager | Stakeholders/CEO | All team leads |
| **L2** | Architect & Orchestrator | Project Manager | Technical guidance to all teams |
| **L2** | Backend Team Lead | Project Manager | Backend developers |
| **L2** | Frontend Team Lead | Project Manager | Frontend developers |
| **L2** | AI Engineering Team Lead | Project Manager | AI engineers |
| **L2** | DevOps & Integration Team Lead | Project Manager | DevOps engineers |
### Authority Matrix
| Decision Type | Project Manager | Architect | Team Leads | Team Members |
|---------------|-----------------|-----------|------------|--------------|
| Project Scope | **Decide** | Consult | Inform | Inform |
| Technical Architecture | Consult | **Decide** | Consult | Inform |
| Implementation Details | Inform | Consult | **Decide** | Responsible |
| Resource Allocation | **Decide** | Consult | Consult | Inform |
| Quality Standards | Consult | **Decide** | Responsible | Responsible |
| Release Decisions | **Decide** | Consult | Inform | Inform |
---
## 👥 Role Definitions
### 1. Project Manager
#### **Core Responsibilities**
- **Project Planning:** Define sprint goals, manage timeline, allocate resources
- **Risk Management:** Identify, assess, and mitigate project risks
- **Stakeholder Communication:** Regular status updates and expectation management
- **Team Coordination:** Facilitate cross-team collaboration and resolve blockers
- **Progress Tracking:** Monitor deliverables, milestones, and budget
#### **Key Deliverables**
- Weekly project status reports
- Sprint planning and retrospective facilitation
- Risk register and mitigation plans
- Resource allocation and capacity planning
- Stakeholder communication and updates
#### **Required Skills**
- Agile/Scrum methodology expertise
- Technical project management experience
- Excellent communication and leadership skills
- Risk assessment and mitigation
- Budget and resource management
#### **Daily Activities**
- **Morning:** Review overnight progress, check for blockers
- **Mid-day:** Attend standups, facilitate cross-team communication
- **Evening:** Update project tracking, prepare status reports
#### **Success Metrics**
- On-time delivery of sprint goals (100% target)
- Team velocity consistency (±15% variance)
- Stakeholder satisfaction scores (>8/10)
- Risk mitigation effectiveness (>90% issues resolved)
---
### 2. Architect & Orchestrator
#### **Core Responsibilities**
- **System Architecture:** Define and maintain overall system design
- **Technical Standards:** Establish coding standards, patterns, and best practices
- **Architecture Reviews:** Conduct regular technical reviews and approvals
- **Cross-Team Alignment:** Ensure technical consistency across all teams
- **Technology Decisions:** Evaluate and approve technology choices
#### **Key Deliverables**
- System architecture documentation and diagrams
- Technical standards and coding guidelines
- Architecture review reports and approvals
- Technology evaluation and recommendation reports
- Technical debt assessment and remediation plans
#### **Required Skills**
- Full-stack architecture experience
- Deep knowledge of Python, PostgreSQL, Docker
- API design and microservices architecture
- Code review and quality assessment
- Technical leadership and mentoring
#### **Daily Activities**
- **Morning:** Review pull requests and architecture compliance
- **Mid-day:** Conduct architecture reviews and technical discussions
- **Evening:** Update architecture documentation and standards
#### **Review Schedule**
- **Daily:** Code review and PR approvals
- **Weekly:** Architecture compliance review
- **Bi-weekly:** Technical debt assessment
- **Sprint End:** Architecture retrospective and improvements
#### **Success Metrics**
- Architecture compliance score (>95%)
- Technical debt ratio (<10%)
- Code review turnaround time (<4 hours)
- Cross-team technical consistency (>90%)
---
### 3. Backend Team
#### **Core Responsibilities**
- **API Development:** Build FastAPI endpoints per specification
- **Database Design:** Implement PostgreSQL schema with RLS policies
- **Business Logic:** Develop core application services and workflows
- **Integration:** Integrate with AI services and external APIs
- **Testing:** Write comprehensive unit and integration tests
#### **Key Deliverables**
- REST API endpoints with full documentation
- Database schema with migrations and seed data
- Business logic services and domain models
- API integration with AI services
- Comprehensive test suites (>80% coverage)
#### **Required Skills**
- Advanced Python (FastAPI, SQLAlchemy, AsyncIO)
- PostgreSQL database design and optimization
- REST API design and development
- Testing frameworks (pytest, mocking)
- Docker and containerization
#### **Team Structure**
- **Backend Team Lead:** Technical leadership, architecture compliance
- **Senior Backend Developer:** Complex features, AI integration
- **Backend Developer:** CRUD operations, testing, documentation
#### **Daily Activities**
- **Morning:** Review API requirements and database design
- **Mid-day:** Implement endpoints and business logic
- **Evening:** Write tests and update documentation
#### **Handoff Requirements**
- **To Frontend:** Complete API documentation with examples
- **To AI Team:** Integration endpoints and data models
- **To DevOps:** Docker configuration and deployment requirements
#### **Success Metrics**
- API endpoint completion rate (100% per sprint)
- Test coverage percentage (>80%)
- API response time (<500ms for CRUD operations)
- Bug density (<2 bugs per 1000 lines of code)
---
### 4. Frontend Team
#### **Core Responsibilities**
- **UI Development:** Build Dash + Mantine user interface components
- **User Experience:** Create intuitive and responsive user interactions
- **API Integration:** Implement frontend API client and data management
- **Visual Design:** Ensure professional and modern visual design
- **Testing:** Develop frontend testing strategies and implementation
#### **Key Deliverables**
- Complete user interface with all MVP features
- Responsive design for desktop and mobile
- API client library with error handling
- Component library and design system
- User experience testing and optimization
#### **Required Skills**
- Advanced Python (Dash, Plotly, component libraries)
- Modern web technologies (HTML5, CSS3, JavaScript)
- UI/UX design principles and responsive design
- API integration and state management
- Frontend testing and debugging
#### **Team Structure**
- **Frontend Team Lead:** UI architecture, UX decisions
- **Senior Frontend Developer:** Complex components, API integration
- **Frontend Developer:** Component development, styling, testing
#### **Daily Activities**
- **Morning:** Review UI requirements and design specifications
- **Mid-day:** Develop components and integrate with backend APIs
- **Evening:** Test user interactions and update documentation
#### **Handoff Requirements**
- **From Backend:** API documentation and endpoint availability
- **To DevOps:** Frontend build configuration and deployment needs
- **To PM:** User acceptance testing and demo preparation
#### **Success Metrics**
- Feature completion rate (100% per sprint)
- UI responsiveness score (>95% across devices)
- User experience satisfaction (>8/10 in testing)
- Frontend error rate (<1% of user interactions)
---
### 5. AI Engineering Team
#### **Core Responsibilities**
- **Prompt Engineering:** Develop and optimize Claude Sonnet 4 prompts
- **AI Integration:** Build AI agents for research, resume, and cover letter generation
- **Vector Operations:** Implement OpenAI embeddings and similarity search
- **Performance Optimization:** Optimize AI response times and accuracy
- **Quality Assurance:** Test AI outputs for consistency and relevance
#### **Key Deliverables**
- Research Agent with job analysis capabilities
- Resume Optimizer with multi-resume synthesis
- Cover Letter Generator with voice preservation
- Vector database integration for semantic search
- AI performance monitoring and optimization tools
#### **Required Skills**
- Advanced prompt engineering and LLM optimization
- Python AI/ML libraries (OpenAI, Anthropic APIs)
- Vector databases and semantic search
- Natural language processing and analysis
- AI testing and quality assurance methods
#### **Team Structure**
- **AI Team Lead:** AI strategy, prompt architecture
- **Senior AI Engineer:** Complex AI workflows, vector integration
- **AI Engineer:** Prompt development, testing, optimization
#### **Daily Activities**
- **Morning:** Review AI requirements and prompt specifications
- **Mid-day:** Develop and test AI agents and prompts
- **Evening:** Optimize performance and document AI behaviors
#### **Handoff Requirements**
- **To Backend:** AI service integration requirements and APIs
- **From Architect:** AI workflow specifications and quality criteria
- **To PM:** AI performance metrics and capability demonstrations
#### **Success Metrics**
- AI response accuracy (>90% relevance score)
- Processing time (<30 seconds per AI operation)
- Prompt effectiveness (>85% user satisfaction)
- AI service uptime (>99.5% availability)
---
### 6. DevOps & Integration Team
#### **Core Responsibilities**
- **Infrastructure Management:** Docker Compose orchestration and optimization
- **CI/CD Pipeline:** Gitea workflows, automated testing, and deployment
- **Integration Testing:** End-to-end system integration and testing
- **Quality Assurance:** Enforce quality standards and code review processes
- **Documentation Management:** Maintain project documentation and knowledge base
#### **Key Deliverables**
- Complete Docker development environment
- Automated CI/CD pipelines with quality gates
- Integration testing suite and monitoring
- Quality standards documentation and enforcement
- Production deployment configuration and monitoring
#### **Required Skills**
- Docker and container orchestration
- CI/CD pipeline design and implementation
- Automated testing and quality assurance
- System integration and monitoring
- Technical documentation and knowledge management
#### **Team Structure**
- **DevOps & Integration Lead:** Infrastructure architecture, quality standards
- **Senior DevOps Engineer:** CI/CD pipelines, production deployment
- **Integration Engineer:** System integration, testing, documentation
#### **Daily Activities**
- **Morning:** Review system health, CI/CD pipeline status
- **Mid-day:** Support integration needs, resolve deployment issues
- **Evening:** Update documentation, monitor quality metrics
#### **Handoff Requirements**
- **From All Teams:** Code, configuration, and deployment requirements
- **To PM:** System status, quality metrics, deployment readiness
- **To Architect:** Infrastructure compliance and optimization recommendations
#### **Success Metrics**
- CI/CD pipeline success rate (>95%)
- Integration test pass rate (>98%)
- System uptime (>99.9% during development)
- Documentation completeness (>90% coverage)
---
## 📞 Communication Protocols
### Communication Matrix
| Communication Type | Frequency | Participants | Duration | Format |
|-------------------|-----------|--------------|----------|--------|
| **Daily Standup** | Daily | All team leads + PM | 15 min | Synchronous |
| **Team Standup** | Daily | Team members + Team lead | 10 min | Synchronous |
| **Architecture Review** | Weekly | Architect + All leads | 30 min | Synchronous |
| **Sprint Planning** | Weekly | All team leads + PM | 60 min | Synchronous |
| **Sprint Retrospective** | Weekly | All team leads + PM | 45 min | Synchronous |
| **Technical Sync** | As needed | Relevant teams | 30 min | Synchronous |
| **Status Updates** | Weekly | PM + Stakeholders | 30 min | Synchronous/Async |
### Communication Guidelines
#### **Daily Standup Format**
Each team lead reports:
1. **Yesterday:** What was completed
2. **Today:** What will be worked on
3. **Blockers:** Any impediments or dependencies
4. **Risks:** Emerging risks or concerns
#### **Cross-Team Communication Rules**
- **Backend ↔ Frontend:** API changes require 24-hour notice
- **AI ↔ Backend:** Integration requirements must be documented
- **DevOps ↔ All Teams:** Infrastructure changes require approval
- **Architect ↔ All Teams:** Technical decisions require consultation
#### **Escalation Matrix**
| Issue Level | Response Time | Escalation Path |
|-------------|---------------|-----------------|
| **Low** | 24 hours | Team Lead → PM |
| **Medium** | 4 hours | Team Lead → PM → Architect |
| **High** | 1 hour | Team Lead → PM + Architect |
| **Critical** | 30 minutes | All leads + PM + Architect |
---
## 🔄 Development Process
### Sprint Structure (1-Week Sprints)
#### **Monday: Sprint Planning**
- **9:00 AM:** Sprint planning meeting (60 min)
- **10:30 AM:** Team breakouts for task estimation
- **11:30 AM:** Cross-team dependency identification
- **12:00 PM:** Sprint commitment and kick-off
#### **Tuesday-Thursday: Development**
- **9:00 AM:** Daily standup (15 min)
- **Development work according to team schedules**
- **4:00 PM:** Daily progress check-in
- **As needed:** Technical sync meetings
#### **Friday: Review & Retrospective**
- **9:00 AM:** Sprint demo preparation
- **10:00 AM:** Sprint review and demo (60 min)
- **11:30 AM:** Sprint retrospective (45 min)
- **1:00 PM:** Next sprint preparation
- **3:00 PM:** Week wrap-up and documentation
### Definition of Done
#### **Backend Features**
- [ ] API endpoints implemented per specification
- [ ] Unit tests written with >80% coverage
- [ ] Integration tests passing
- [ ] API documentation updated
- [ ] Code reviewed and approved by Architect
- [ ] Database migrations tested
- [ ] Error handling implemented
#### **Frontend Features**
- [ ] UI components implemented per design
- [ ] Responsive design tested on multiple devices
- [ ] API integration working correctly
- [ ] User acceptance criteria met
- [ ] Code reviewed and approved
- [ ] Documentation updated
- [ ] Browser compatibility tested
#### **AI Features**
- [ ] Prompts developed and optimized
- [ ] AI agents tested for accuracy and performance
- [ ] Integration with backend services working
- [ ] Performance benchmarks met
- [ ] Error handling and fallbacks implemented
- [ ] Documentation and examples provided
- [ ] Quality assurance testing completed
#### **DevOps Features**
- [ ] Infrastructure changes deployed and tested
- [ ] CI/CD pipelines updated and working
- [ ] Documentation updated
- [ ] Security review completed
- [ ] Performance impact assessed
- [ ] Rollback procedures tested
- [ ] Monitoring and alerting configured
---
## 🎯 Quality Standards
### Code Quality Requirements
#### **General Standards**
- **Type Hints:** Required for all public functions and methods
- **Documentation:** Docstrings for all classes and public methods
- **Testing:** Minimum 80% code coverage for backend, 70% for frontend
- **Code Review:** All changes require approval from team lead + architect
- **Security:** No hardcoded secrets, proper input validation
#### **Backend Standards**
```python
# Example of required code quality
from typing import Optional, List
from pydantic import BaseModel
class ApplicationService:
"""Service for managing job applications with proper error handling."""
async def create_application(
self,
user_id: str,
application_data: CreateApplicationRequest
) -> Application:
"""
Create a new job application for the specified user.
Args:
user_id: UUID of the authenticated user
application_data: Validated application creation data
Returns:
Application: Created application with generated ID
Raises:
ValidationError: If application data is invalid
DatabaseError: If database operation fails
"""
# Implementation with proper error handling
pass
```
#### **Frontend Standards**
- **Component Documentation:** Clear docstrings for all components
- **Props Validation:** Type hints and validation for all component props
- **Error Boundaries:** Proper error handling for API failures
- **Accessibility:** WCAG 2.1 AA compliance for all UI components
- **Performance:** Components should render in <100ms
#### **AI Standards**
- **Prompt Documentation:** Clear documentation of prompt purpose and expected outputs
- **Error Handling:** Graceful degradation when AI services are unavailable
- **Performance Monitoring:** Response time and accuracy tracking
- **Quality Assurance:** Systematic testing of AI outputs for consistency
### Quality Gates
#### **Pre-Commit Checks**
- [ ] Code formatting (Black, isort)
- [ ] Type checking (mypy)
- [ ] Linting (flake8, pylint)
- [ ] Security scanning (bandit)
- [ ] Test execution (pytest)
#### **Pull Request Checks**
- [ ] All CI/CD pipeline checks pass
- [ ] Code coverage requirements met
- [ ] Architecture compliance verified
- [ ] Security review completed
- [ ] Documentation updated
- [ ] Performance impact assessed
#### **Sprint Completion Checks**
- [ ] All features meet Definition of Done
- [ ] Integration testing passes
- [ ] Performance benchmarks met
- [ ] Security review completed
- [ ] Documentation complete and accurate
- [ ] Demo preparation completed
---
## 📅 Meeting Framework
### Meeting Templates
#### **Daily Standup Template**
```
Date: [Date]
Sprint: [Sprint Number]
Facilitator: [Project Manager]
Team Updates:
□ Backend Team - [Lead Name]
- Completed:
- Today:
- Blockers:
□ Frontend Team - [Lead Name]
- Completed:
- Today:
- Blockers:
□ AI Team - [Lead Name]
- Completed:
- Today:
- Blockers:
□ DevOps Team - [Lead Name]
- Completed:
- Today:
- Blockers:
Cross-Team Dependencies:
- [Dependency 1]
- [Dependency 2]
Action Items:
- [Action Item 1] - Owner: [Name] - Due: [Date]
- [Action Item 2] - Owner: [Name] - Due: [Date]
```
#### **Sprint Planning Template**
```
Sprint Planning - Sprint [Number]
Date: [Date]
Duration: [Start Date] to [End Date]
Sprint Goal:
[Clear, concise statement of what will be achieved]
Team Capacity:
- Backend Team: [X] story points
- Frontend Team: [X] story points
- AI Team: [X] story points
- DevOps Team: [X] story points
Selected Stories:
□ [Story 1] - [Team] - [Points] - [Priority]
□ [Story 2] - [Team] - [Points] - [Priority]
□ [Story 3] - [Team] - [Points] - [Priority]
Dependencies Identified:
- [Dependency 1] - Teams: [A] → [B] - Risk: [Low/Medium/High]
- [Dependency 2] - Teams: [A] → [B] - Risk: [Low/Medium/High]
Risks and Mitigation:
- [Risk 1] - Probability: [%] - Impact: [High/Medium/Low] - Mitigation: [Plan]
- [Risk 2] - Probability: [%] - Impact: [High/Medium/Low] - Mitigation: [Plan]
Sprint Commitment:
Team leads confirm commitment to sprint goal and deliverables.
□ Backend Team Lead - [Name]
□ Frontend Team Lead - [Name]
□ AI Team Lead - [Name]
□ DevOps Team Lead - [Name]
```
#### **Architecture Review Template**
```
Architecture Review - Week [Number]
Date: [Date]
Reviewer: [Architect Name]
Components Reviewed:
□ Backend API Design
- Compliance: [Green/Yellow/Red]
- Issues: [List any issues]
- Recommendations: [List recommendations]
□ Frontend Architecture
- Compliance: [Green/Yellow/Red]
- Issues: [List any issues]
- Recommendations: [List recommendations]
□ AI Integration
- Compliance: [Green/Yellow/Red]
- Issues: [List any issues]
- Recommendations: [List recommendations]
□ Infrastructure Design
- Compliance: [Green/Yellow/Red]
- Issues: [List any issues]
- Recommendations: [List recommendations]
Technical Debt Assessment:
- Current Level: [Low/Medium/High]
- Priority Items: [List top 3 items]
- Remediation Plan: [Summary of approach]
Decisions Made:
- [Decision 1] - Rationale: [Explanation]
- [Decision 2] - Rationale: [Explanation]
Action Items:
- [Action 1] - Owner: [Name] - Due: [Date]
- [Action 2] - Owner: [Name] - Due: [Date]
```
---
## 📋 Documentation Standards
### Required Documentation
#### **API Documentation**
- **OpenAPI Specification:** Complete API documentation with examples
- **Integration Guide:** How to integrate with each API endpoint
- **Error Handling:** Comprehensive error codes and responses
- **Authentication:** Security requirements and implementation
#### **Code Documentation**
- **README Files:** Clear setup and usage instructions
- **Inline Comments:** Complex logic explanation and business rules
- **Architecture Decisions:** ADR (Architecture Decision Records)
- **Deployment Guide:** Step-by-step deployment instructions
#### **Process Documentation**
- **Team Onboarding:** New team member setup guide
- **Development Workflow:** Git branching and development process
- **Quality Standards:** Code quality and review requirements
- **Troubleshooting:** Common issues and resolution steps
### Documentation Review Process
#### **Weekly Documentation Review**
- **Owner:** DevOps & Integration Team Lead
- **Participants:** All team leads
- **Duration:** 30 minutes
- **Agenda:** Review documentation completeness and accuracy
#### **Documentation Standards**
- **Format:** Markdown files in `/docs` directory
- **Structure:** Consistent headings, table of contents, examples
- **Updates:** Documentation updated with each feature delivery
- **Review:** All documentation changes require peer review
---
## 🚨 Escalation Procedures
### Issue Classification
#### **Priority Levels**
| Priority | Response Time | Definition | Examples |
|----------|---------------|------------|----------|
| **P0 - Critical** | 30 minutes | System down, security breach | Database crash, API completely down |
| **P1 - High** | 2 hours | Major feature broken, blocking | Authentication broken, AI services down |
| **P2 - Medium** | 8 hours | Minor feature issues, performance | Slow API responses, UI bugs |
| **P3 - Low** | 24 hours | Enhancement requests, documentation | Feature improvements, doc updates |
### Escalation Flow
#### **Technical Issues**
```
Developer → Team Lead → Architect → Project Manager → Stakeholders
```
#### **Resource/Timeline Issues**
```
Team Lead → Project Manager → Stakeholders
```
#### **Quality/Standards Issues**
```
Team Member → Team Lead → Architect → Project Manager
```
#### **Cross-Team Conflicts**
```
Team Leads → Project Manager → Architect (if technical) → Resolution
```
### Crisis Management
#### **Critical Issue Response**
1. **Immediate (0-15 min):**
- Issue reporter creates critical incident ticket
- Notify Project Manager and Architect immediately
- Form incident response team
2. **Short-term (15-60 min):**
- Assess impact and root cause
- Implement temporary workaround if possible
- Communicate status to stakeholders
3. **Resolution (1+ hours):**
- Develop and implement permanent fix
- Test fix thoroughly in staging environment
- Deploy fix and monitor system health
- Conduct post-incident review
---
## 📊 Performance Metrics
### Team Performance Metrics
#### **Delivery Metrics**
| Metric | Target | Measurement | Frequency |
|--------|--------|-------------|-----------|
| **Sprint Goal Achievement** | 100% | Goals completed vs planned | Weekly |
| **Story Point Velocity** | ±15% variance | Points delivered per sprint | Weekly |
| **Feature Delivery** | On schedule | Features completed on time | Weekly |
| **Defect Rate** | <5% | Bugs found post-delivery | Weekly |
#### **Quality Metrics**
| Metric | Target | Measurement | Frequency |
|--------|--------|-------------|-----------|
| **Code Coverage** | >80% | Automated test coverage | Daily |
| **Code Review Time** | <4 hours | Time from PR to approval | Daily |
| **Build Success Rate** | >95% | CI/CD pipeline success | Daily |
| **Documentation Coverage** | >90% | Features documented | Weekly |
#### **Team Health Metrics**
| Metric | Target | Measurement | Frequency |
|--------|--------|-------------|-----------|
| **Team Satisfaction** | >8/10 | Weekly team survey | Weekly |
| **Collaboration Score** | >8/10 | Cross-team effectiveness | Weekly |
| **Knowledge Sharing** | >3 sessions/week | Tech talks, reviews | Weekly |
| **Blockers Resolution** | <24 hours | Time to resolve blockers | Daily |
### Individual Performance Metrics
#### **Backend Team**
- API endpoint delivery rate (100% per sprint)
- Code quality score (>90%)
- Test coverage percentage (>80%)
- Code review participation rate (100%)
#### **Frontend Team**
- UI component completion rate (100% per sprint)
- User experience satisfaction (>8/10)
- Browser compatibility score (>95%)
- Design system compliance (>90%)
#### **AI Team**
- AI model accuracy (>90%)
- Prompt optimization rate (>85% user satisfaction)
- Processing time improvements (weekly optimization)
- AI service uptime (>99.5%)
#### **DevOps Team**
- Infrastructure uptime (>99.9%)
- CI/CD pipeline reliability (>95% success rate)
- Documentation completeness (>90%)
- Security compliance score (100%)
---
## 📝 Standard Templates
### Handoff Document Template
```markdown
# Team Handoff Document
**From Team:** [Source Team]
**To Team:** [Destination Team]
**Date:** [Date]
**Sprint:** [Sprint Number]
## Deliverables
- [Deliverable 1] - Status: [Complete/Partial/Pending]
- [Deliverable 2] - Status: [Complete/Partial/Pending]
## Technical Specifications
- **API Endpoints:** [List with documentation links]
- **Data Models:** [List with schema definitions]
- **Configuration:** [Environment variables, settings]
- **Dependencies:** [External services, libraries]
## Testing Information
- **Test Coverage:** [Percentage]
- **Test Results:** [Link to test report]
- **Known Issues:** [List any known problems]
- **Testing Instructions:** [How to test the deliverables]
## Documentation
- **Technical Docs:** [Links to relevant documentation]
- **API Documentation:** [Link to API docs]
- **Setup Instructions:** [How to run/deploy]
- **Troubleshooting:** [Common issues and solutions]
## Next Steps
- [Action item 1] - Owner: [Name] - Due: [Date]
- [Action item 2] - Owner: [Name] - Due: [Date]
## Contact Information
- **Primary Contact:** [Name] - [Email] - [Slack/Teams]
- **Secondary Contact:** [Name] - [Email] - [Slack/Teams]
## Sign-off
- **Source Team Lead:** [Name] - [Date] - [Signature]
- **Destination Team Lead:** [Name] - [Date] - [Signature]
```
### Status Report Template
```markdown
# Weekly Status Report
**Week of:** [Date Range]
**Sprint:** [Sprint Number]
**Report Date:** [Date]
**Prepared by:** [Project Manager Name]
## Executive Summary
[2-3 sentence summary of week's progress and status]
## Sprint Progress
- **Sprint Goal:** [Goal statement]
- **Completion Rate:** [X]% ([Y] of [Z] story points completed)
- **On Track for Sprint Goal:** [Yes/No/At Risk]
## Team Status
### Backend Team
- **Completed:** [List major accomplishments]
- **In Progress:** [Current work]
- **Planned:** [Next week's priorities]
- **Blockers:** [Any impediments]
- **Health:** [Green/Yellow/Red]
### Frontend Team
- **Completed:** [List major accomplishments]
- **In Progress:** [Current work]
- **Planned:** [Next week's priorities]
- **Blockers:** [Any impediments]
- **Health:** [Green/Yellow/Red]
### AI Engineering Team
- **Completed:** [List major accomplishments]
- **In Progress:** [Current work]
- **Planned:** [Next week's priorities]
- **Blockers:** [Any impediments]
- **Health:** [Green/Yellow/Red]
### DevOps & Integration Team
- **Completed:** [List major accomplishments]
- **In Progress:** [Current work]
- **Planned:** [Next week's priorities]
- **Blockers:** [Any impediments]
- **Health:** [Green/Yellow/Red]
## Key Metrics
| Metric | Target | Actual | Status |
|--------|--------|---------|--------|
| Sprint Velocity | [X] points | [Y] points | [Green/Yellow/Red] |
| Code Coverage | >80% | [X]% | [Green/Yellow/Red] |
| Build Success Rate | >95% | [X]% | [Green/Yellow/Red] |
| Team Satisfaction | >8/10 | [X]/10 | [Green/Yellow/Red] |
## Risks and Issues
| Risk/Issue | Impact | Probability | Mitigation | Owner | Due Date |
|------------|--------|-------------|------------|--------|----------|
| [Risk 1] | [High/Med/Low] | [High/Med/Low] | [Plan] | [Name] | [Date] |
| [Risk 2] | [High/Med/Low] | [High/Med/Low] | [Plan] | [Name] | [Date] |
## Decisions Made
- [Decision 1] - Rationale: [Explanation] - Impact: [Description]
- [Decision 2] - Rationale: [Explanation] - Impact: [Description]
## Next Week Focus
- [Priority 1]
- [Priority 2]
- [Priority 3]
## Action Items
- [Action 1] - Owner: [Name] - Due: [Date]
- [Action 2] - Owner: [Name] - Due: [Date]
## Attachments
- [Link to detailed metrics dashboard]
- [Link to sprint burndown chart]
- [Link to risk register]
```
---
## 🎯 Implementation Checklist
### Week 1: Team Formation
- [ ] All team members hired and onboarded
- [ ] Role responsibilities communicated and accepted
- [ ] Communication tools set up (Slack, Gitea, etc.)
- [ ] Development environment access provided
- [ ] First sprint planning meeting scheduled
### Week 2: Process Implementation
- [ ] Daily standup schedule established
- [ ] Sprint planning process implemented
- [ ] Architecture review process started
- [ ] Quality standards documented and communicated
- [ ] Documentation standards established
### Week 3: Team Optimization
- [ ] First retrospective completed with improvements
- [ ] Cross-team communication protocols refined
- [ ] Performance metrics baseline established
- [ ] Escalation procedures tested and refined
- [ ] Team health survey implemented
### Ongoing: Continuous Improvement
- [ ] Weekly retrospectives with action items
- [ ] Monthly team satisfaction surveys
- [ ] Quarterly process review and optimization
- [ ] Continuous metrics monitoring and improvement
- [ ] Regular team building and knowledge sharing
---
*This team management guide provides the foundation for successful JobForge MVP development with clear roles, processes, and standards for professional team coordination and delivery.*

700
docs/testing_strategy.md Normal file
View File

@@ -0,0 +1,700 @@
# JobForge MVP - Testing Strategy & Guidelines
**Version:** 1.0.0 MVP
**Target Audience:** Development Team
**Testing Framework:** pytest + manual testing
**Last Updated:** July 2025
---
## 🎯 Testing Philosophy
### MVP Testing Approach
- **Pragmatic over Perfect:** Focus on critical path testing rather than 100% coverage
- **Backend Heavy:** Comprehensive API testing, lighter frontend testing for MVP
- **Manual Integration:** Manual testing of full user workflows
- **AI Mocking:** Mock external AI services for reliable testing
- **Database Testing:** Test data isolation and security policies
### Testing Pyramid for MVP
```
┌─────────────────┐
│ Manual E2E │ ← Full user workflows
│ Testing │
├─────────────────┤
│ Integration │ ← API endpoints + database
│ Tests │
├─────────────────┤
│ Unit Tests │ ← Business logic + utilities
│ │
└─────────────────┘
```
---
## 🧪 Unit Testing (Backend)
### Test Structure
```
tests/
├── unit/
│ ├── services/
│ │ ├── test_auth_service.py
│ │ ├── test_application_service.py
│ │ └── test_document_service.py
│ ├── agents/
│ │ ├── test_research_agent.py
│ │ ├── test_resume_optimizer.py
│ │ └── test_cover_letter_generator.py
│ └── helpers/
│ ├── test_validators.py
│ └── test_formatters.py
├── integration/
│ ├── test_api_auth.py
│ ├── test_api_applications.py
│ ├── test_api_documents.py
│ └── test_database_policies.py
├── fixtures/
│ ├── test_data.py
│ └── mock_responses.py
├── conftest.py
└── pytest.ini
```
### Sample Unit Tests
#### Authentication Service Test
```python
# tests/unit/services/test_auth_service.py
import pytest
from unittest.mock import AsyncMock, patch
from src.backend.services.auth_service import AuthenticationService
from src.backend.models.requests import RegisterRequest
class TestAuthenticationService:
@pytest.fixture
def auth_service(self, mock_db):
return AuthenticationService(mock_db)
@pytest.mark.asyncio
async def test_register_user_success(self, auth_service):
# Arrange
register_data = RegisterRequest(
email="test@example.com",
password="SecurePass123!",
full_name="Test User"
)
# Act
user = await auth_service.register_user(register_data)
# Assert
assert user.email == "test@example.com"
assert user.full_name == "Test User"
assert user.id is not None
# Password should be hashed
assert user.password_hash != "SecurePass123!"
assert user.password_hash.startswith("$2b$")
@pytest.mark.asyncio
async def test_register_user_duplicate_email(self, auth_service, existing_user):
# Arrange
register_data = RegisterRequest(
email=existing_user.email, # Same email as existing user
password="SecurePass123!",
full_name="Another User"
)
# Act & Assert
with pytest.raises(DuplicateEmailError):
await auth_service.register_user(register_data)
@pytest.mark.asyncio
async def test_authenticate_user_success(self, auth_service, existing_user):
# Act
auth_result = await auth_service.authenticate_user(
existing_user.email,
"correct_password"
)
# Assert
assert auth_result.success is True
assert auth_result.user.id == existing_user.id
assert auth_result.access_token is not None
assert auth_result.token_type == "bearer"
@pytest.mark.asyncio
async def test_authenticate_user_wrong_password(self, auth_service, existing_user):
# Act
auth_result = await auth_service.authenticate_user(
existing_user.email,
"wrong_password"
)
# Assert
assert auth_result.success is False
assert auth_result.user is None
assert auth_result.access_token is None
```
#### AI Agent Test with Mocking
```python
# tests/unit/agents/test_research_agent.py
import pytest
from unittest.mock import AsyncMock, patch
from src.agents.research_agent import ResearchAgent
class TestResearchAgent:
@pytest.fixture
def research_agent(self, mock_claude_client):
return ResearchAgent(mock_claude_client)
@pytest.mark.asyncio
@patch('src.agents.research_agent.web_search')
async def test_analyze_job_description(self, mock_web_search, research_agent):
# Arrange
job_description = """
We are seeking a Senior Python Developer with 5+ years experience.
Must have Django, PostgreSQL, and AWS experience.
"""
mock_claude_response = {
"content": [{
"text": """
{
"required_skills": ["Python", "Django", "PostgreSQL", "AWS"],
"experience_level": "Senior (5+ years)",
"key_requirements": ["Backend development", "Database design"],
"nice_to_have": ["Docker", "Kubernetes"]
}
"""
}]
}
research_agent.claude_client.messages.create.return_value = mock_claude_response
# Act
analysis = await research_agent.analyze_job_description(job_description)
# Assert
assert "Python" in analysis.required_skills
assert "Django" in analysis.required_skills
assert analysis.experience_level == "Senior (5+ years)"
assert len(analysis.key_requirements) > 0
@pytest.mark.asyncio
async def test_research_company_info(self, research_agent):
# Test company research with mocked web search
company_name = "Google"
# Mock web search results
with patch('src.agents.research_agent.web_search') as mock_search:
mock_search.return_value = {
"results": [
{
"title": "Google - About",
"content": "Google is a multinational technology company...",
"url": "https://about.google.com"
}
]
}
company_info = await research_agent.research_company_info(company_name)
assert company_info.company_name == "Google"
assert len(company_info.recent_news) >= 0
assert company_info.company_description is not None
```
---
## 🔗 Integration Testing
### API Integration Tests
```python
# tests/integration/test_api_applications.py
import pytest
from httpx import AsyncClient
from src.backend.main import app
class TestApplicationsAPI:
@pytest.mark.asyncio
async def test_create_application_success(self, auth_headers):
async with AsyncClient(app=app, base_url="http://test") as client:
# Arrange
application_data = {
"company_name": "Google",
"role_title": "Senior Developer",
"job_description": "We are seeking an experienced developer with Python skills...",
"location": "Toronto, ON",
"priority_level": "high"
}
# Act
response = await client.post(
"/api/v1/applications",
json=application_data,
headers=auth_headers
)
# Assert
assert response.status_code == 201
data = response.json()
assert data["company_name"] == "Google"
assert data["role_title"] == "Senior Developer"
assert data["status"] == "draft"
assert data["name"] == "google_senior_developer_2025_07_01" # Auto-generated
@pytest.mark.asyncio
async def test_create_application_validation_error(self, auth_headers):
async with AsyncClient(app=app, base_url="http://test") as client:
# Arrange - missing required fields
application_data = {
"company_name": "", # Empty company name
"job_description": "Short" # Too short (min 50 chars)
}
# Act
response = await client.post(
"/api/v1/applications",
json=application_data,
headers=auth_headers
)
# Assert
assert response.status_code == 400
error = response.json()
assert "company_name" in error["error"]["details"]
assert "job_description" in error["error"]["details"]
@pytest.mark.asyncio
async def test_list_applications_user_isolation(self, auth_headers, other_user_auth_headers):
async with AsyncClient(app=app, base_url="http://test") as client:
# Create application as user 1
await client.post(
"/api/v1/applications",
json={
"company_name": "User1 Company",
"role_title": "Developer",
"job_description": "Job description for user 1 application..."
},
headers=auth_headers
)
# List applications as user 2
response = await client.get(
"/api/v1/applications",
headers=other_user_auth_headers
)
# Assert user 2 cannot see user 1's applications
assert response.status_code == 200
data = response.json()
assert len(data["applications"]) == 0 # Should be empty for user 2
```
### Database Policy Tests
```python
# tests/integration/test_database_policies.py
import pytest
from src.backend.database.connection import get_db_connection
class TestDatabasePolicies:
@pytest.mark.asyncio
async def test_rls_user_isolation(self, test_user_1, test_user_2):
async with get_db_connection() as conn:
# Set context as user 1
await conn.execute(
"SET LOCAL app.current_user_id = %s",
str(test_user_1.id)
)
# Create application as user 1
result = await conn.execute("""
INSERT INTO applications (user_id, name, company_name, role_title, job_description)
VALUES (%s, 'test_app', 'Test Co', 'Developer', 'Test job description...')
RETURNING id
""", str(test_user_1.id))
app_id = result.fetchone()[0]
# Switch context to user 2
await conn.execute(
"SET LOCAL app.current_user_id = %s",
str(test_user_2.id)
)
# Try to access user 1's application as user 2
result = await conn.execute(
"SELECT * FROM applications WHERE id = %s",
str(app_id)
)
# Assert user 2 cannot see user 1's application
assert len(result.fetchall()) == 0
@pytest.mark.asyncio
async def test_document_cascade_delete(self, test_user, test_application):
async with get_db_connection() as conn:
# Set user context
await conn.execute(
"SET LOCAL app.current_user_id = %s",
str(test_user.id)
)
# Create document
await conn.execute("""
INSERT INTO documents (application_id, document_type, content)
VALUES (%s, 'research_report', 'Test research content')
""", str(test_application.id))
# Delete application
await conn.execute(
"DELETE FROM applications WHERE id = %s",
str(test_application.id)
)
# Verify documents were cascaded deleted
result = await conn.execute(
"SELECT COUNT(*) FROM documents WHERE application_id = %s",
str(test_application.id)
)
assert result.fetchone()[0] == 0
```
---
## 🎭 Test Fixtures & Mocking
### Pytest Configuration
```python
# conftest.py
import pytest
import asyncio
from unittest.mock import AsyncMock
from src.backend.database.connection import get_db_connection
from src.backend.models.requests import RegisterRequest
@pytest.fixture(scope="session")
def event_loop():
"""Create an instance of the default event loop for the test session."""
loop = asyncio.get_event_loop_policy().new_event_loop()
yield loop
loop.close()
@pytest.fixture
async def test_db():
"""Provide test database connection with cleanup."""
async with get_db_connection() as conn:
# Start transaction
trans = await conn.begin()
yield conn
# Rollback transaction (cleanup)
await trans.rollback()
@pytest.fixture
async def test_user(test_db):
"""Create test user."""
user_data = {
"id": "123e4567-e89b-12d3-a456-426614174000",
"email": "test@example.com",
"password_hash": "$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8",
"full_name": "Test User"
}
await test_db.execute("""
INSERT INTO users (id, email, password_hash, full_name)
VALUES (%(id)s, %(email)s, %(password_hash)s, %(full_name)s)
""", user_data)
return User(**user_data)
@pytest.fixture
def auth_headers(test_user):
"""Generate authentication headers for test user."""
token = generate_jwt_token(test_user.id)
return {"Authorization": f"Bearer {token}"}
@pytest.fixture
def mock_claude_client():
"""Mock Claude API client."""
mock = AsyncMock()
mock.messages.create.return_value = {
"content": [{
"text": "Mocked Claude response"
}]
}
return mock
@pytest.fixture
def mock_openai_client():
"""Mock OpenAI API client."""
mock = AsyncMock()
mock.embeddings.create.return_value = {
"data": [{
"embedding": [0.1] * 1536 # Mock 1536-dimensional embedding
}]
}
return mock
```
### Test Data Factory
```python
# tests/fixtures/test_data.py
from datetime import datetime
import uuid
class TestDataFactory:
"""Factory for creating test data objects."""
@staticmethod
def create_user_data(**overrides):
defaults = {
"id": str(uuid.uuid4()),
"email": "user@example.com",
"password_hash": "$2b$12$test_hash",
"full_name": "Test User",
"created_at": datetime.now(),
"updated_at": datetime.now()
}
return {**defaults, **overrides}
@staticmethod
def create_application_data(user_id, **overrides):
defaults = {
"id": str(uuid.uuid4()),
"user_id": user_id,
"name": "test_company_developer_2025_07_01",
"company_name": "Test Company",
"role_title": "Software Developer",
"job_description": "We are seeking a software developer with Python experience...",
"location": "Toronto, ON",
"priority_level": "medium",
"status": "draft",
"research_completed": False,
"resume_optimized": False,
"cover_letter_generated": False,
"created_at": datetime.now(),
"updated_at": datetime.now()
}
return {**defaults, **overrides}
@staticmethod
def create_document_data(application_id, **overrides):
defaults = {
"id": str(uuid.uuid4()),
"application_id": application_id,
"document_type": "research_report",
"content": "# Test Research Report\n\nThis is test content...",
"created_at": datetime.now(),
"updated_at": datetime.now()
}
return {**defaults, **overrides}
```
---
## 🎯 Manual Testing Guidelines
### Critical User Workflows
#### Workflow 1: Complete Application Creation
**Goal:** Test full 3-phase workflow from start to finish
**Steps:**
1. **Registration & Login**
- [ ] Register new account with valid email/password
- [ ] Login with created credentials
- [ ] Verify JWT token is received and stored
2. **Application Creation**
- [ ] Create new application with job description
- [ ] Verify application appears in sidebar
- [ ] Check application status is "draft"
3. **Research Phase**
- [ ] Click "Research" tab
- [ ] Verify research processing starts automatically
- [ ] Wait for completion (check processing status)
- [ ] Review generated research report
- [ ] Verify application status updates to "research_complete"
4. **Resume Optimization**
- [ ] Upload at least one resume to library
- [ ] Click "Resume" tab
- [ ] Start resume optimization
- [ ] Verify processing completes successfully
- [ ] Review optimized resume content
- [ ] Test editing resume content
- [ ] Verify application status updates to "resume_ready"
5. **Cover Letter Generation**
- [ ] Click "Cover Letter" tab
- [ ] Add additional context in text box
- [ ] Generate cover letter
- [ ] Review generated content
- [ ] Test editing cover letter
- [ ] Verify application status updates to "cover_letter_ready"
**Expected Results:**
- All phases complete without errors
- Documents are editable and changes persist
- Status updates correctly through workflow
- Navigation works smoothly between phases
#### Workflow 2: Data Isolation Testing
**Goal:** Verify users cannot access each other's data
**Steps:**
1. **Create two test accounts**
- Account A: user1@test.com
- Account B: user2@test.com
2. **Create applications in both accounts**
- Login as User A, create "Google Developer" application
- Login as User B, create "Microsoft Engineer" application
3. **Verify isolation**
- [ ] User A cannot see User B's applications in sidebar
- [ ] User A cannot access User B's application URLs directly
- [ ] Document URLs return 404 for wrong user
#### Workflow 3: Error Handling
**Goal:** Test system behavior with invalid inputs and failures
**Steps:**
1. **Invalid Application Data**
- [ ] Submit empty company name (should show validation error)
- [ ] Submit job description under 50 characters (should fail)
- [ ] Submit invalid URL format (should fail or ignore)
2. **Network/API Failures**
- [ ] Temporarily block Claude API access (mock network failure)
- [ ] Verify user gets meaningful error message
- [ ] Verify system doesn't crash or corrupt data
3. **Authentication Failures**
- [ ] Try accessing API without token (should get 401)
- [ ] Try with expired token (should redirect to login)
- [ ] Try with malformed token (should get error)
---
## 📊 Test Coverage Goals
### MVP Coverage Targets
- **Backend Services:** 80%+ line coverage
- **API Endpoints:** 100% endpoint coverage (at least smoke tests)
- **Database Models:** 90%+ coverage of business logic
- **Critical Paths:** 100% coverage of main user workflows
- **Error Handling:** 70%+ coverage of error scenarios
### Coverage Exclusions (MVP)
- Frontend components (manual testing only)
- External API integrations (mocked)
- Database migration scripts
- Development utilities
- Logging and monitoring code
---
## 🚀 Testing Commands
### Running Tests
```bash
# Run all tests
pytest
# Run with coverage report
pytest --cov=src --cov-report=html
# Run specific test file
pytest tests/unit/services/test_auth_service.py
# Run tests with specific marker
pytest -m "not slow"
# Run integration tests only
pytest tests/integration/
# Verbose output for debugging
pytest -v -s tests/unit/services/test_auth_service.py::TestAuthenticationService::test_register_user_success
```
### Test Database Setup
```bash
# Reset test database
docker-compose exec postgres psql -U jobforge_user -d jobforge_mvp_test -c "DROP SCHEMA public CASCADE; CREATE SCHEMA public;"
# Run database init for tests
docker-compose exec postgres psql -U jobforge_user -d jobforge_mvp_test -f /docker-entrypoint-initdb.d/init.sql
```
---
## 🐛 Testing Best Practices
### DO's
- ✅ **Test business logic thoroughly** - Focus on services and agents
- ✅ **Mock external dependencies** - Claude API, OpenAI, web scraping
- ✅ **Test user data isolation** - Critical for multi-tenant security
- ✅ **Use descriptive test names** - Should explain what is being tested
- ✅ **Test error conditions** - Not just happy paths
- ✅ **Clean up test data** - Use fixtures and database transactions
### DON'Ts
- ❌ **Don't test external APIs directly** - Too unreliable for CI/CD
- ❌ **Don't ignore database constraints** - Test them explicitly
- ❌ **Don't hardcode test data** - Use factories and fixtures
- ❌ **Don't skip cleanup** - Polluted test data affects other tests
- ❌ **Don't test implementation details** - Test behavior, not internals
### Test Organization
```python
# Good test structure
class TestApplicationService:
"""Test class for ApplicationService business logic."""
def test_create_application_with_valid_data_returns_application(self):
"""Should create and return application when given valid data."""
# Arrange
# Act
# Assert
def test_create_application_with_duplicate_name_raises_error(self):
"""Should raise DuplicateNameError when application name already exists."""
# Arrange
# Act
# Assert
```
---
## 📈 Testing Metrics
### Key Testing Metrics
- **Test Execution Time:** Target < 30 seconds for full suite
- **Test Reliability:** 95%+ pass rate on repeated runs
- **Code Coverage:** 80%+ overall, 90%+ for critical paths
- **Bug Detection:** Tests should catch regressions before deployment
### Performance Testing (Basic)
```python
# Basic performance test example
@pytest.mark.asyncio
async def test_application_creation_performance():
"""Application creation should complete within 2 seconds."""
start_time = time.time()
# Create application
result = await application_service.create_application(test_data)
execution_time = time.time() - start_time
assert execution_time < 2.0, f"Application creation took {execution_time:.2f}s"
```
---
*This testing strategy provides comprehensive coverage for the MVP while remaining practical and maintainable. Focus on backend testing for Phase 1, with more sophisticated frontend testing to be added in Phase 2.*