JobForge - Architecture Guide
Version: 1.0.0
Status: Production-Ready Implementation
Date: July 2025
Target Market: Canadian Job Market Applications
Tagline: "Forge Your Path to Success"
📋 Executive Summary
Project Vision
JobForge is an AI-powered job application management system for professional job seekers in the Canadian market. It streamlines the application process through intelligent automation, multi-resume optimization, and preservation of the candidate's authentic writing voice.
Core Objectives
- Workflow Automation: 3-phase intelligent application pipeline (Research → Resume → Cover Letter)
- Multi-Resume Intelligence: Leverage multiple resume versions as focused expertise lenses
- Authentic Voice Preservation: Maintain candidate's proven successful writing patterns
- Canadian Market Focus: Optimize for Canadian business culture and application standards
- Local File Management: Complete control over sensitive career documents
- Scalable Architecture: Support for high-volume job application campaigns
Business Value Proposition
- 40% Time Reduction: Automated research and document generation
- Higher Success Rates: Strategic positioning based on comprehensive analysis
- Consistent Quality: Standardized excellence across all applications
- Document Security: Local storage with full user control
- Career Intelligence: Build knowledge base from successful applications
🏗️ High-Level Architecture
System Overview
graph TB
subgraph "User Interface Layer"
A[Streamlit Web UI]
B[Configuration Panel]
C[File Management UI]
D[Workflow Interface]
end
subgraph "Application Core"
E[Application Engine]
F[Phase Orchestrator]
G[State Manager]
H[File Controller]
end
subgraph "AI Processing Layer"
I[Research Agent]
J[Resume Optimizer]
K[Cover Letter Generator]
L[Claude API Client]
end
subgraph "Data Management"
M[Resume Repository]
N[Reference Database]
O[Application Store]
P[Status Tracker]
end
subgraph "Storage Layer"
Q[Local File System]
R[Project Structure]
S[Document Templates]
end
subgraph "External Services"
T[Claude AI API]
U[Web Search APIs]
V[Company Intelligence]
end
A --> E
B --> H
C --> H
D --> F
E --> I
E --> J
E --> K
F --> G
I --> L
J --> L
K --> L
L --> T
M --> Q
N --> Q
O --> Q
P --> Q
H --> R
I --> U
I --> V
Architecture Principles
1. Domain-Driven Design
- Clear separation between job application domain logic and technical infrastructure
- Rich domain models representing real-world career management concepts
- Business rules encapsulated within domain entities
2. Event-Driven Workflow
- Each phase triggers the next through well-defined events
- State transitions logged for auditability and recovery
- Asynchronous processing with real-time UI updates
3. Multi-Source Intelligence
- Resume portfolio treated as complementary expertise views
- Reference database provides voice pattern templates
- Company research aggregated from multiple sources
4. Security-First Design
- All sensitive career data stored locally
- No cloud storage of personal information
- API keys managed through secure environment variables
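Principle 3 treats the resume portfolio as a set of complementary views rather than competing documents. A minimal sketch of that idea follows, assuming each resume can be reduced to a set of skill keywords. `ResumeSketch`, `KeywordOverlapStrategy`, and the overlap scoring are hypothetical illustrations, not the project's actual models or selection logic.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class ResumeSketch:
    """Hypothetical stand-in for a resume, reduced to a name and skill keywords."""
    name: str
    skills: FrozenSet[str]


class KeywordOverlapStrategy:
    """Assumed scoring rule: the resume sharing the most keywords with the job wins."""

    def select_primary(self, portfolio: List[ResumeSketch], job_keywords: Set[str]) -> ResumeSketch:
        if not portfolio:
            raise ValueError("portfolio is empty")
        # max() keeps the first resume on ties, so portfolio order acts as a tiebreaker
        return max(portfolio, key=lambda r: len(r.skills & job_keywords))

    def supplementary_skills(self, portfolio: List[ResumeSketch], primary: ResumeSketch,
                             job_keywords: Set[str]) -> Set[str]:
        # Job-relevant skills that appear only in the other resume versions
        extra: Set[str] = set()
        for resume in portfolio:
            if resume is not primary:
                extra |= (resume.skills & job_keywords) - primary.skills
        return extra


portfolio = [
    ResumeSketch("technical", frozenset({"python", "sql", "etl"})),
    ResumeSketch("management", frozenset({"budgeting", "stakeholders", "sql"})),
]
job_keywords = {"python", "sql", "etl", "stakeholders"}
strategy = KeywordOverlapStrategy()
primary = strategy.select_primary(portfolio, job_keywords)
extras = strategy.supplementary_skills(portfolio, primary, job_keywords)
```

Here the technical resume becomes the primary source, while the management resume contributes the one job-relevant skill the primary lacks.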
🔧 Core Components
Application Engine
from typing import List, Optional

class JobApplicationEngine:
    """Central orchestrator for the entire application workflow"""

    def __init__(self, config: EngineConfig, file_manager: FileManager):
        self.config = config
        self.file_manager = file_manager
        self.phase_orchestrator = PhaseOrchestrator()
        self.state_manager = StateManager()

    # Core workflow methods (interface stubs)
    def create_application(self, job_data: JobData) -> Application: ...
    def execute_research_phase(self, app_id: str) -> ResearchReport: ...
    def optimize_resume(self, app_id: str, research: ResearchReport) -> OptimizedResume: ...
    def generate_cover_letter(self, app_id: str, context: ApplicationContext) -> CoverLetter: ...

    # Management operations (interface stubs)
    def list_applications(self, status_filter: Optional[str] = None) -> List[Application]: ...
    def update_application_status(self, app_id: str, status: ApplicationStatus) -> None: ...
    def export_application(self, app_id: str, format: ExportFormat) -> str: ...
Responsibilities:
- Coordinate all application lifecycle operations
- Manage state transitions between phases
- Integrate with AI processing agents
- Handle file system operations through delegates
Phase Orchestrator
from enum import Enum
from typing import List

class PhaseOrchestrator:
    """Manages the 3-phase workflow execution and state transitions"""

    class Phases(Enum):
        INPUT = "input"
        RESEARCH = "research"
        RESUME = "resume"
        COVER_LETTER = "cover_letter"
        COMPLETE = "complete"

    # Generic phase control (interface stubs)
    def execute_phase(self, phase: Phases, context: PhaseContext) -> PhaseResult: ...
    def can_advance_to(self, target_phase: Phases, current_state: ApplicationState) -> bool: ...
    def get_phase_requirements(self, phase: Phases) -> List[Requirement]: ...

    # Phase-specific execution
    async def execute_research(self, job_data: JobData, resume_portfolio: List[Resume]) -> ResearchReport: ...
    async def execute_resume_optimization(self, research: ResearchReport, portfolio: ResumePortfolio) -> OptimizedResume: ...
    async def execute_cover_letter_generation(self, context: ApplicationContext) -> CoverLetter: ...
Design Features:
- State machine implementation for workflow control
- Async execution with progress callbacks
- Dependency validation between phases
- Rollback capability for failed phases
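The state machine and dependency validation listed above can be sketched roughly as follows. This assumes a strictly linear phase order in which a phase may start only once every earlier phase has completed. The module-level `Phase` enum mirrors `PhaseOrchestrator.Phases`, and `next_phase` is a hypothetical helper that does not appear in the interface above.

```python
from enum import Enum
from typing import Optional, Set


class Phase(Enum):
    INPUT = "input"
    RESEARCH = "research"
    RESUME = "resume"
    COVER_LETTER = "cover_letter"
    COMPLETE = "complete"


# Enum members iterate in definition order, which doubles as the workflow order
PHASE_ORDER = list(Phase)


def can_advance_to(target: Phase, completed: Set[Phase]) -> bool:
    """Allow a phase only when all phases before it are in the completed set."""
    index = PHASE_ORDER.index(target)
    return all(phase in completed for phase in PHASE_ORDER[:index])


def next_phase(completed: Set[Phase]) -> Optional[Phase]:
    """Return the first phase that has not completed, or None when all are done."""
    for phase in PHASE_ORDER:
        if phase not in completed:
            return phase
    return None
```

Keeping the order in one list means a rollback only has to remove the failed phase from the completed set; the same two functions then report the correct resume point.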
AI Processing Agents
Research Agent
class ResearchAgent:
    """Phase 1: Comprehensive job description analysis and strategic positioning"""

    def __init__(self, claude_client: ClaudeAPIClient, web_search: WebSearchClient):
        self.claude = claude_client
        self.web_search = web_search

    async def analyze_job_description(self, job_desc: str) -> JobAnalysis:
        """Extract and categorize job requirements, company info, and keywords"""

    async def assess_candidate_fit(self, job_analysis: JobAnalysis, resume_portfolio: ResumePortfolio) -> FitAssessment:
        """Multi-resume skills assessment with transferability analysis"""

    async def research_company_intelligence(self, company_name: str) -> CompanyIntelligence:
        """Gather company culture, recent news, and strategic insights"""

    async def generate_strategic_positioning(self, context: ResearchContext) -> StrategicPositioning:
        """Determine optimal candidate positioning and competitive advantages"""
Resume Optimizer
class ResumeOptimizer:
    """Phase 2: Multi-resume synthesis and strategic optimization"""

    def __init__(self, claude_client: ClaudeAPIClient, config: OptimizationConfig):
        self.claude = claude_client
        self.config = config  # 600-word limit, formatting rules, etc.

    async def synthesize_resume_portfolio(self, portfolio: ResumePortfolio, research: ResearchReport) -> SynthesizedContent:
        """Merge insights from multiple resume versions"""

    async def optimize_for_job(self, content: SynthesizedContent, positioning: StrategicPositioning) -> OptimizedResume:
        """Create targeted resume within word limits"""

    def validate_optimization(self, resume: OptimizedResume) -> OptimizationReport:
        """Ensure word count, keyword density, and strategic alignment"""
Cover Letter Generator
class CoverLetterGenerator:
    """Phase 3: Authentic voice preservation and company-specific customization"""

    def __init__(self, claude_client: ClaudeAPIClient, reference_db: ReferenceDatabase):
        self.claude = claude_client
        self.reference_db = reference_db

    async def analyze_voice_patterns(self, selected_references: List[CoverLetterReference]) -> VoiceProfile:
        """Extract authentic writing style, tone, and structural patterns"""

    async def generate_cover_letter(self, context: CoverLetterContext, voice_profile: VoiceProfile) -> CoverLetter:
        """Create authentic cover letter using proven voice patterns"""

    def validate_authenticity(self, cover_letter: CoverLetter, voice_profile: VoiceProfile) -> AuthenticityScore:
        """Ensure generated content matches authentic voice patterns"""
Data Models
from datetime import date, datetime
from typing import List, Optional

from pydantic import BaseModel

class Application(BaseModel):
    """Core application entity with full lifecycle management"""
    id: str
    name: str  # company_role_YYYY_MM_DD format
    status: ApplicationStatus
    created_at: datetime
    updated_at: datetime

    # Job information
    job_data: JobData
    company_info: CompanyInfo

    # Phase results
    research_report: Optional[ResearchReport] = None
    optimized_resume: Optional[OptimizedResume] = None
    cover_letter: Optional[CoverLetter] = None

    # Metadata
    priority_level: PriorityLevel
    application_deadline: Optional[date] = None

    # Business logic (interface stubs)
    @property
    def completion_percentage(self) -> float: ...
    def can_advance_to_phase(self, phase: PhaseOrchestrator.Phases) -> bool: ...
    def export_to_format(self, format: ExportFormat) -> str: ...

class ResumePortfolio(BaseModel):
    """Collection of focused resume versions representing different expertise areas"""
    resumes: List[Resume]

    def get_technical_focused(self) -> List[Resume]: ...
    def get_management_focused(self) -> List[Resume]: ...
    def get_industry_specific(self, industry: str) -> List[Resume]: ...
    def synthesize_skills(self) -> SkillMatrix: ...

class JobData(BaseModel):
    """Comprehensive job posting information"""
    job_url: Optional[str] = None
    job_description: str
    company_name: str
    role_title: str
    location: str
    priority_level: PriorityLevel
    how_found: str
    application_deadline: Optional[date] = None

    # Additional context
    specific_aspects: Optional[str] = None
    company_insights: Optional[str] = None
    special_considerations: Optional[str] = None
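The `completion_percentage` property on `Application` is declared without a body. One plausible reading, shown here as an assumption rather than the actual rule, is the share of the three phase results that are present. `ApplicationSketch` is a simplified, hypothetical stand-in that uses plain strings in place of the result models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApplicationSketch:
    """Hypothetical stand-in holding only the three optional phase results."""
    research_report: Optional[str] = None
    optimized_resume: Optional[str] = None
    cover_letter: Optional[str] = None

    @property
    def completion_percentage(self) -> float:
        # Assumed rule: each finished phase counts equally toward 100%
        results = (self.research_report, self.optimized_resume, self.cover_letter)
        finished = sum(result is not None for result in results)
        return round(100.0 * finished / len(results), 1)


fresh = ApplicationSketch()
researched = ApplicationSketch(research_report="summary of the posting")
finished_app = ApplicationSketch("report", "resume", "letter")
```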
📊 Data Flow Architecture
Application Creation Flow
sequenceDiagram
participant UI as Streamlit UI
participant Engine as Application Engine
participant FileManager as File Manager
participant Storage as Local Storage
UI->>Engine: create_application(job_data)
Engine->>Engine: validate_job_data()
Engine->>Engine: generate_application_name()
Engine->>FileManager: create_application_folder()
FileManager->>Storage: mkdir(company_role_date)
FileManager->>Storage: save(user_inputs.json)
FileManager->>Storage: save(original_job_description.md)
FileManager->>Storage: save(application_status.json)
Engine-->>UI: Application(id, status=created)
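The creation flow above can be approximated with the standard library alone. This is a sketch under assumptions: the helper names `slugify`, `generate_application_name`, and `create_application_folder` are illustrative, and the contents of `application_status.json` are a guess at a minimal shape. The folder naming follows the `company_role_YYYY_MM_DD` convention and the three file names come from the diagram.

```python
import json
import re
from datetime import date
from pathlib import Path


def slugify(value: str) -> str:
    """Lowercase the text and collapse anything non-alphanumeric into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", value.lower()).strip("_")


def generate_application_name(company: str, role: str, created: date) -> str:
    return f"{slugify(company)}_{slugify(role)}_{created:%Y_%m_%d}"


def create_application_folder(base_dir: Path, name: str, job_description: str,
                              user_inputs: dict) -> Path:
    folder = base_dir / name
    # exist_ok=False surfaces duplicate applications instead of overwriting them
    folder.mkdir(parents=True, exist_ok=False)
    (folder / "user_inputs.json").write_text(json.dumps(user_inputs, indent=2), encoding="utf-8")
    (folder / "original_job_description.md").write_text(job_description, encoding="utf-8")
    (folder / "application_status.json").write_text(
        json.dumps({"status": "created"}, indent=2), encoding="utf-8"
    )
    return folder
```

For example, `generate_application_name("Dillon Consulting", "Data Analyst", date(2025, 7, 22))` yields `dillon_consulting_data_analyst_2025_07_22`, matching the sample folder in the directory layout.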
3-Phase Workflow Execution
flowchart TD
A[Application Created] --> B[Phase 1: Research]
B --> C{Research Complete?}
C -->|Yes| D[Phase 2: Resume]
C -->|No| E[Research Error]
D --> F{Resume Complete?}
F -->|Yes| G[Phase 3: Cover Letter]
F -->|No| H[Resume Error]
G --> I{Cover Letter Complete?}
I -->|Yes| J[Application Complete]
I -->|No| K[Cover Letter Error]
E --> L[Log Error & Retry]
H --> L
K --> L
L --> M[Manual Intervention]
subgraph "Phase 1 Details"
B1[Job Analysis]
B2[Multi-Resume Assessment]
B3[Company Research]
B4[Strategic Positioning]
B --> B1 --> B2 --> B3 --> B4 --> C
end
subgraph "Phase 2 Details"
D1[Portfolio Synthesis]
D2[Content Optimization]
D3[Word Count Management]
D4[Strategic Alignment]
D --> D1 --> D2 --> D3 --> D4 --> F
end
subgraph "Phase 3 Details"
G1[Voice Analysis]
G2[Content Generation]
G3[Authenticity Validation]
G4[Company Customization]
G --> G1 --> G2 --> G3 --> G4 --> I
end
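The error branch of the flowchart (log, retry, then hand off to manual intervention) might look like the sketch below. The attempt count, the exponential backoff, and the names `run_with_retry` and `ManualInterventionRequired` are assumptions made for illustration; the diagram does not specify a retry policy.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("jobforge.retry")  # logger name is illustrative


class ManualInterventionRequired(Exception):
    """Raised once automatic retries are exhausted."""


def run_with_retry(phase_name: str, action: Callable[[], T], max_attempts: int = 3,
                   base_delay: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as exc:
            logger.warning("%s failed (attempt %d/%d): %s", phase_name, attempt, max_attempts, exc)
            if attempt == max_attempts:
                raise ManualInterventionRequired(
                    f"{phase_name} failed after {max_attempts} attempts"
                ) from exc
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")  # the loop always returns or raises
```

The `sleep` parameter is injectable so tests can record delays without waiting. A production version would likely retry only on errors known to be transient rather than on every exception.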
File Management Architecture
graph TB
subgraph "Project Root"
A[job-application-engine/]
end
subgraph "User Data"
B[user_data/resumes/]
C[user_data/cover_letter_references/selected/]
D[user_data/cover_letter_references/other/]
end
subgraph "Applications"
E[applications/company_role_date/]
F[├── original_job_description.md]
G[├── research_report.md]
H[├── optimized_resume.md]
I[├── cover_letter.md]
J[├── user_inputs.json]
K[└── application_status.json]
end
subgraph "Configuration"
L[config/]
M[├── engine_config.yaml]
N[├── claude_api_config.json]
O[└── templates/]
end
A --> B
A --> C
A --> D
A --> E
A --> L
E --> F
E --> G
E --> H
E --> I
E --> J
E --> K
🗂️ Project Structure
Directory Layout
job-application-engine/
├── app.py  # Streamlit main application
├── requirements.txt  # Python dependencies
├── config/
│   ├── engine_config.yaml  # Engine configuration
│   ├── claude_api_config.json  # API configuration
│   └── templates/  # Document templates
│       ├── research_template.md
│       ├── resume_template.md
│       └── cover_letter_template.md
├── src/  # Source code
│   ├── __init__.py
│   ├── engine/  # Core engine
│   │   ├── __init__.py
│   │   ├── application_engine.py  # Main engine class
│   │   ├── phase_orchestrator.py  # Workflow management
│   │   └── state_manager.py  # State tracking
│   ├── agents/  # AI processing agents
│   │   ├── __init__.py
│   │   ├── research_agent.py  # Phase 1: Research
│   │   ├── resume_optimizer.py  # Phase 2: Resume
│   │   ├── cover_letter_generator.py  # Phase 3: Cover Letter
│   │   └── claude_client.py  # Claude API integration
│   ├── models/  # Data models
│   │   ├── __init__.py
│   │   ├── application.py  # Application entity
│   │   ├── job_data.py  # Job information
│   │   ├── resume.py  # Resume models
│   │   └── results.py  # Phase results
│   ├── storage/  # Storage management
│   │   ├── __init__.py
│   │   ├── file_manager.py  # File operations
│   │   ├── application_store.py  # Application persistence
│   │   └── reference_database.py  # Cover letter references
│   ├── ui/  # User interface
│   │   ├── __init__.py
│   │   ├── streamlit_app.py  # Streamlit components
│   │   ├── workflow_ui.py  # Workflow interface
│   │   └── file_management_ui.py  # File management
│   └── utils/  # Utilities
│       ├── __init__.py
│       ├── validators.py  # Input validation
│       ├── formatters.py  # Output formatting
│       └── helpers.py  # Helper functions
├── user_data/  # User's career documents
│   ├── resumes/
│   │   ├── resume_complete.md
│   │   ├── resume_technical.md
│   │   └── resume_management.md
│   └── cover_letter_references/
│       ├── selected/  # Tagged as references
│       │   ├── cover_letter_tech.md
│       │   └── cover_letter_consulting.md
│       └── other/  # Available references
│           └── cover_letter_finance.md
├── applications/  # Generated applications
│   ├── dillon_consulting_data_analyst_2025_07_22/
│   └── shopify_senior_developer_2025_07_23/
├── tests/  # Test suite
│   ├── unit/
│   ├── integration/
│   └── fixtures/
├── docs/  # Documentation
│   ├── architecture.md
│   ├── user_guide.md
│   └── api_reference.md
└── scripts/  # Utility scripts
    ├── setup_project.py
    └── backup_applications.py
Module Responsibilities
| Module | Purpose | Key Classes | Dependencies |
|---|---|---|---|
| engine/ | Core workflow orchestration | ApplicationEngine, PhaseOrchestrator | agents/, models/ |
| agents/ | AI processing logic | ResearchAgent, ResumeOptimizer, CoverLetterGenerator | models/, utils/ |
| models/ | Data structures and business logic | Application, JobData, Resume, ResumePortfolio | pydantic |
| storage/ | File system operations | FileManager, ApplicationStore, ReferenceDatabase | pathlib, json |
| ui/ | User interface components | StreamlitApp, WorkflowUI, FileManagementUI | streamlit |
| utils/ | Cross-cutting concerns | Validators, Formatters, Helpers | Various |
🔌 Extensibility Architecture
Plugin System Design
from abc import ABC

class EnginePlugin(ABC):
    """Base plugin interface for extending engine functionality"""

    def before_phase_execution(self, phase: PhaseOrchestrator.Phases, context: PhaseContext) -> PhaseContext:
        """Modify context before phase execution"""
        return context

    def after_phase_completion(self, phase: PhaseOrchestrator.Phases, result: PhaseResult) -> PhaseResult:
        """Process result after phase completion"""
        return result

    def on_application_created(self, application: Application) -> None:
        """React to new application creation"""
        pass

class MetricsPlugin(EnginePlugin):
    """Collect application performance metrics"""

    def after_phase_completion(self, phase: PhaseOrchestrator.Phases, result: PhaseResult) -> PhaseResult:
        self.record_phase_metrics(phase, result.execution_time, result.success)
        return result

class BackupPlugin(EnginePlugin):
    """Automatic backup of application data"""

    def on_application_created(self, application: Application) -> None:
        self.backup_application(application)
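How the engine invokes these hooks is not shown. A minimal sketch of one possible dispatch loop follows, assuming plugins run in registration order and each hook's return value feeds the next plugin. `run_phase_with_plugins`, `TagPlugin`, and `AuditPlugin` are hypothetical names, and plain dicts and strings stand in for `PhaseContext`, `PhaseResult`, and the phase enum.

```python
from typing import Callable, Dict, List


class EnginePlugin:
    """Simplified stand-in for the plugin base class; both hooks default to no-ops."""

    def before_phase_execution(self, phase: str, context: Dict) -> Dict:
        return context

    def after_phase_completion(self, phase: str, result: Dict) -> Dict:
        return result


class TagPlugin(EnginePlugin):
    """Hypothetical plugin that annotates the context before a phase runs."""

    def before_phase_execution(self, phase: str, context: Dict) -> Dict:
        return {**context, "tagged_by": "TagPlugin"}


class AuditPlugin(EnginePlugin):
    """Hypothetical plugin that records which phases completed."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def after_phase_completion(self, phase: str, result: Dict) -> Dict:
        self.seen.append(phase)
        return result


def run_phase_with_plugins(phase: str, context: Dict, execute: Callable[[Dict], Dict],
                           plugins: List[EnginePlugin]) -> Dict:
    # Each plugin may replace the context before the phase body runs
    for plugin in plugins:
        context = plugin.before_phase_execution(phase, context)
    result = execute(context)
    # Each plugin may replace the result after the phase body finishes
    for plugin in plugins:
        result = plugin.after_phase_completion(phase, result)
    return result
```

Chaining return values keeps plugins composable, at the cost that a misbehaving plugin can corrupt the context for every plugin after it.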
Configuration System
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class EngineConfig:
    # Core settings
    claude_api_key: str
    base_output_directory: str = "./applications"
    max_concurrent_phases: int = 1

    # AI processing
    research_model: str = "claude-sonnet-4-20250514"
    resume_word_limit: int = 600
    cover_letter_word_range: Tuple[int, int] = (350, 450)  # (min, max) words

    # File management
    auto_backup_enabled: bool = True
    backup_retention_days: int = 30

    # UI preferences
    streamlit_theme: str = "light"
    show_advanced_options: bool = False

    # Extensions
    enabled_plugins: List[str] = field(default_factory=list)

    @classmethod
    def from_file(cls, config_path: str) -> 'EngineConfig':
        """Load configuration from YAML file"""

    def validate(self) -> List[ValidationError]:
        """Validate configuration completeness and correctness"""
Multi-Resume Strategy Pattern
from abc import ABC, abstractmethod

class ResumeSelectionStrategy(ABC):
    """Strategy for selecting optimal resume content for specific jobs"""

    @abstractmethod
    def select_primary_resume(self, portfolio: ResumePortfolio, job_analysis: JobAnalysis) -> Resume:
        """Select the most relevant primary resume"""

    @abstractmethod
    def get_supplementary_content(self, portfolio: ResumePortfolio, primary: Resume) -> List[ResumeSection]:
        """Extract additional content from other resume versions"""

# Concrete strategies must implement both abstract methods above
class TechnicalRoleStrategy(ResumeSelectionStrategy):
    """Optimize resume selection for technical positions"""

class ManagementRoleStrategy(ResumeSelectionStrategy):
    """Optimize resume selection for management positions"""

class ConsultingRoleStrategy(ResumeSelectionStrategy):
    """Optimize resume selection for consulting positions"""
🚀 Development Phases
Phase 1: MVP Foundation (Completed)
- ✅ Streamlit UI with file management
- ✅ 3-phase workflow execution
- ✅ Claude API integration
- ✅ Local file storage system
- ✅ Multi-resume processing
- ✅ Cover letter reference system
- ✅ Application status tracking
Phase 2: Enhanced Intelligence (Next)
- 🔄 Advanced company research integration
- 🔄 Improved multi-resume synthesis algorithms
- 🔄 Voice pattern analysis enhancement
- 🔄 Strategic positioning optimization
- 🔄 Application performance analytics
- 🔄 Export functionality (PDF, Word, etc.)
Phase 3: Automation & Scale (Future)
- 📋 Batch application processing
- 📋 Template management system
- 📋 Application campaign planning
- 📋 Success rate tracking and optimization
- 📋 Integration with job board APIs
- 📋 Automated application submission
Phase 4: Enterprise Features (Future)
- 📋 Multi-user support with role-based access
- 📋 Team collaboration features
- 📋 Advanced analytics and reporting
- 📋 Custom workflow templates
- 📋 Integration with HR systems
- 📋 White-label deployment options
🎯 Technical Specifications
Technology Stack
| Component | Technology | Version | Rationale |
|---|---|---|---|
| UI Framework | Streamlit | 1.28.1 | Rapid prototyping, built-in components, Python-native |
| HTTP Client | requests | 2.31.0 | Reliable, well-documented, synchronous operations |
| Data Validation | Pydantic | 2.0+ | Type safety, automatic validation, great developer experience |
| File Operations | pathlib | Built-in | Modern, object-oriented path handling |
| Configuration | PyYAML | 6.0+ | Human-readable configuration files |
| CLI (Future) | Click + Rich | Latest | User-friendly CLI with beautiful output |
| Testing | pytest | 7.0+ | Comprehensive testing framework |
| Documentation | MkDocs | 1.5+ | Beautiful, searchable documentation |
Performance Requirements
| Metric | Target | Measurement Method |
|---|---|---|
| Application Creation | <2 seconds | Time from form submission to folder creation |
| Phase 1 Research | <30 seconds | Claude API response + processing time |
| Phase 2 Resume | <20 seconds | Multi-resume synthesis + optimization |
| Phase 3 Cover Letter | <15 seconds | Voice analysis + content generation |
| File Operations | <1 second | Local file read/write operations |
| UI Responsiveness | <500ms | Streamlit component render time |
Quality Standards
Code Quality Metrics
- Type Coverage: Type hints on at least 90% of public APIs
- Test Coverage: 85%+ line coverage maintained
- Documentation: All public methods and classes documented
- Code Style: Black formatter + isort + flake8 compliance
- Complexity: Max cyclomatic complexity of 10 per function
Security Requirements
- No API keys hardcoded in source code
- Environment variable management for secrets
- Input sanitization for all user data
- Safe file path handling to prevent directory traversal
- Regular dependency vulnerability scanning
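Safe file path handling is the requirement above that is easiest to get wrong. One common approach, sketched here as an assumption about how it could be done, resolves the candidate path and confirms it still sits under the intended base directory. `resolve_inside` and `UnsafePathError` are hypothetical names.

```python
from pathlib import Path


class UnsafePathError(ValueError):
    """Raised when a user-supplied path would escape the base directory."""


def resolve_inside(base_dir: Path, user_supplied: str) -> Path:
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check
    candidate = (base / user_supplied).resolve()
    if candidate != base and base not in candidate.parents:
        raise UnsafePathError(f"{user_supplied!r} resolves outside {base}")
    return candidate
```

Absolute inputs such as `/etc/passwd` are rejected as well, because joining an absolute path onto `base` discards `base` entirely and the resolved result no longer has it as a parent.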
Reliability Standards
- Graceful handling of API failures with user-friendly messages
- Automatic retry logic for transient failures
- Data integrity validation after file operations
- Rollback capability for failed workflow phases
- Comprehensive error logging with context
📈 Monitoring & Analytics
Application Metrics
from typing import List

class ApplicationMetrics:
    """Track application performance and success rates"""

    # Event recording (interface stubs)
    def record_application_created(self, app: Application) -> None: ...
    def record_phase_completion(self, app_id: str, phase: PhaseOrchestrator.Phases, duration: float) -> None: ...
    def record_application_submitted(self, app_id: str) -> None: ...
    def record_application_response(self, app_id: str, response_type: ResponseType) -> None: ...

    # Analytics queries (interface stubs)
    def get_success_rate(self, date_range: DateRange) -> float: ...
    def get_average_completion_time(self, phase: PhaseOrchestrator.Phases) -> float: ...
    def get_most_effective_strategies(self) -> List[StrategyMetric]: ...
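The guide does not define what counts as success for `get_success_rate`. The sketch below assumes one possible definition: the fraction of applications submitted within a date range that received a positive response. `ApplicationRecord`, the response labels, and the choice of "interview" and "offer" as positive outcomes are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class ApplicationRecord:
    """Hypothetical record of one submitted application."""
    app_id: str
    submitted_on: date
    response: Optional[str] = None  # None means no response yet


def success_rate(records: List[ApplicationRecord], start: date, end: date,
                 positive: Tuple[str, ...] = ("interview", "offer")) -> float:
    """Share of applications submitted in [start, end] with a positive response."""
    in_range = [r for r in records if start <= r.submitted_on <= end]
    if not in_range:
        return 0.0  # avoid dividing by zero when nothing was submitted
    return sum(r.response in positive for r in in_range) / len(in_range)


records = [
    ApplicationRecord("a", date(2025, 7, 1), "interview"),
    ApplicationRecord("b", date(2025, 7, 5), "rejection"),
    ApplicationRecord("c", date(2025, 7, 9)),
    ApplicationRecord("d", date(2025, 7, 20), "offer"),
    ApplicationRecord("e", date(2025, 8, 2), "interview"),
]
july_rate = success_rate(records, date(2025, 7, 1), date(2025, 7, 31))
```

Note that applications still awaiting a response count against the rate under this definition; excluding them would be an equally defensible choice.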
Performance Monitoring
from typing import Dict

class PerformanceMonitor:
    """Monitor system performance and resource usage"""

    # Interface stubs
    def track_api_response_times(self) -> Dict[str, float]: ...
    def monitor_file_system_usage(self) -> StorageMetrics: ...
    def track_memory_usage(self) -> MemoryMetrics: ...
    def generate_performance_report(self) -> PerformanceReport: ...
User Experience Analytics
- Workflow completion rates by phase
- Most common user pain points
- Feature usage statistics
- Error frequency and resolution rates
- Time-to-value metrics
🔒 Security Architecture
Data Protection Strategy
- Local-First: All sensitive career data stored locally
- API Key Management: Secure environment variable handling
- Input Validation: Comprehensive sanitization of all user inputs
- File System Security: Restricted file access patterns
- Audit Trail: Complete logging of all file operations
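For the API key management point, a minimal sketch of reading the key from the environment and masking it for logs could look like this. The variable name `CLAUDE_API_KEY` and both helper names are assumptions; the guide does not state which variable the application reads.

```python
import os


class MissingApiKeyError(RuntimeError):
    """Raised when the expected environment variable is unset or empty."""


def load_api_key(env_var: str = "CLAUDE_API_KEY") -> str:
    # The default variable name is an assumption made for this sketch
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise MissingApiKeyError(f"Set the {env_var} environment variable before starting the app")
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Return a log-safe form of the key that shows only its last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Failing fast at startup keeps a missing key from surfacing later as an opaque API error in the middle of a phase.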
Privacy Considerations
- No personal data is transmitted to third parties, with one exception: content sent to the Claude API for processing
- User control over all data retention and deletion
- Transparent data usage policies
- Optional anonymization for analytics
🎨 User Experience Design
Design Principles
- Simplicity First: Complex AI power hidden behind simple interfaces
- Progress Transparency: Clear feedback on all processing steps
- Error Recovery: Graceful handling with actionable next steps
- Customization: Flexible configuration without overwhelming options
- Mobile Friendly: Responsive design for various screen sizes
User Journey Optimization
journey
title Job Application Creation Journey
section Setup
Configure folders: 5: User
Upload resumes: 4: User
Tag references: 3: User
section Application
Paste job description: 5: User
Review auto-generated name: 4: User
Start research phase: 5: User
section AI Processing
Wait for research: 3: User, AI
Review research results: 4: User
Approve resume optimization: 5: User, AI
Review cover letter: 5: User, AI
section Completion
Make final edits: 4: User
Export documents: 5: User
Mark as applied: 5: User
📚 Documentation Strategy
Documentation Hierarchy
- Architecture Guide (This Document) - Technical architecture and design decisions
- User Guide - Step-by-step usage instructions with screenshots
- API Reference - Detailed API documentation for extensions
- Developer Guide - Setup, contribution guidelines, and development practices
- Troubleshooting Guide - Common issues and solutions
Documentation Standards
- All public APIs documented with docstrings
- Code examples for all major features
- Screenshots for UI components
- Video tutorials for complex workflows
- Regular documentation updates with each release
🚀 Deployment & Distribution
Distribution Strategy
- GitHub Repository: Open source with comprehensive documentation
- PyPI Package: Easy installation via pip
- Docker Container: Containerized deployment option
- Executable Bundle: Standalone executable for non-technical users
Deployment Options
# Option 1: Direct Python execution
python -m streamlit run app.py
# Option 2: Docker deployment
docker run -p 8501:8501 job-application-engine
# Option 3: Heroku deployment
git push heroku main
# Option 4: Local installation
pip install job-application-engine
job-app-engine --config myconfig.yaml
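Option 4 passes a configuration file on the command line. As a rough sketch of how the parsed contents might be checked before the engine starts, the code below assumes the YAML has already been loaded into a dict (for example with PyYAML's `yaml.safe_load`). `EngineConfigSketch` is a trimmed, hypothetical version of `EngineConfig`, and its `validate` returns plain strings rather than `ValidationError` objects.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass
class EngineConfigSketch:
    """Trimmed, hypothetical subset of the EngineConfig fields."""
    claude_api_key: str
    base_output_directory: str = "./applications"
    resume_word_limit: int = 600
    cover_letter_word_range: Tuple[int, int] = (350, 450)

    @classmethod
    def from_dict(cls, raw: dict) -> "EngineConfigSketch":
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            # Rejecting unknown keys catches typos in the config file early
            raise ValueError(f"Unknown config keys: {sorted(unknown)}")
        data = dict(raw)
        if "cover_letter_word_range" in data:
            # YAML sequences load as lists; normalize to the tuple the field expects
            data["cover_letter_word_range"] = tuple(data["cover_letter_word_range"])
        return cls(**data)

    def validate(self) -> List[str]:
        problems: List[str] = []
        if not self.claude_api_key:
            problems.append("claude_api_key is empty")
        if self.resume_word_limit <= 0:
            problems.append("resume_word_limit must be positive")
        low, high = self.cover_letter_word_range
        if not 0 < low <= high:
            problems.append("cover_letter_word_range must satisfy 0 < min <= max")
        return problems
```

Returning a list of problems instead of raising on the first one lets the UI show every configuration issue in a single pass.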
🔮 Future Enhancements
Advanced AI Features
- Multi-Model Support: Integration with multiple AI providers
- Specialized Models: Domain-specific fine-tuned models
- Continuous Learning: System learns from successful applications
- Predictive Analytics: Success probability estimation
Integration Ecosystem
- LinkedIn Integration: Auto-import job postings and company data
- ATS Integration: Direct submission to Applicant Tracking Systems
- CRM Integration: Track application pipeline in existing CRM
- Calendar Integration: Application deadline management
Enterprise Features
- Multi-Tenant Architecture: Support multiple users/organizations
- Role-Based Access Control: Team collaboration with permission levels
- Workflow Customization: Industry-specific workflow templates
- Advanced Analytics: Success attribution and optimization recommendations
This architecture guide serves as the authoritative reference for the design and implementation of JobForge (the job-application-engine project). For implementation details, see the source code and technical documentation.
For questions or contributions, please refer to the project repository and contribution guidelines.