Resume Module Design and Implementation
This note records the core design, API responsibilities, async processing pipeline, and practical considerations of the Resume module in the interview-guide project.
Module Capabilities
- Multi-format parsing: supports PDF, DOCX, DOC, TXT, and MD.
- Async processing: uses Redis Stream for asynchronous resume analysis with status tracking.
- Stability: built-in auto-retry on analysis failure (up to 3 times) plus duplicate detection based on file hash.
- Report export: supports one-click export of AI analysis results as a structured PDF report.
Core Status Flow
Key API Design
/api/resumes/upload Upload Resume (Async Analysis)
Rate limit strategy:
- Global limit: @RateLimit(dimension = RateLimit.Dimension.GLOBAL, count = 5)
- IP limit: @RateLimit(dimension = RateLimit.Dimension.IP, count = 5)
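The definition of @RateLimit is not shown in this note, so the following is only a minimal sketch of one way a per-dimension limit could be enforced: a fixed-window counter keyed by dimension. The class name FixedWindowLimiter, the key format, and the window length are all assumptions made for illustration; the note does not state what time window `count` applies to.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical fixed-window limiter; not the project's actual @RateLimit implementation. */
final class FixedWindowLimiter {
    private final int limit;          // max requests per window (plays the role of `count`)
    private final long windowMillis;  // assumed window length; the note does not specify one
    private final Map<String, long[]> windows = new HashMap<>(); // key -> {windowStart, used}

    FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request identified by key is allowed at time nowMillis. */
    synchronized boolean tryAcquire(String key, long nowMillis) {
        long[] w = windows.computeIfAbsent(key, k -> new long[] {nowMillis, 0});
        if (nowMillis - w[0] >= windowMillis) {
            // The previous window has expired: start a fresh one.
            w[0] = nowMillis;
            w[1] = 0;
        }
        if (w[1] >= limit) {
            return false; // budget for this window is used up
        }
        w[1]++;
        return true;
    }
}
```

Under this sketch, the GLOBAL dimension would use one shared key per endpoint (for example "global:upload") and the IP dimension one key per client address (for example "ip:10.0.0.1"). Since both annotations are listed for the endpoint, a request would presumably have to pass both checks.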
Entry call:
uploadService.uploadAndAnalyze(file);
Processing flow:
1. Basic file validation
fileValidationService.validateFile(file, MAX_FILE_SIZE, "Resume");
Includes: null check, file size limit, and logging.
2. File type detection
String contentType = parseService.detectContentType(file);
Supports: PDF, DOCX, DOC, TXT, MD.
3. Duplicate file detection
persistenceService.findExistingResume(file);
Internal flow:
String fileHash = fileHashService.calculateHash(file);
resumeRepository.findByFileHash(fileHash);
4. Resume parsing and text cleaning
parseService.parseResume(file);
- Parse to plain text using Apache Tika
- textCleaningService.cleanText(content) to reduce excessive line breaks and token usage
5. File storage (unstructured data)
storageService.uploadResume(file);
storageService.getFileUrl(fileKey);
Uploads to RustFS/MinIO for unstructured file storage.
6. Metadata persistence
persistenceService.saveResume(file, resumeText, fileKey, fileUrl);
7. Send async analysis task
analyzeStreamProducer.sendAnalyzeTask(savedResume.getId(), resumeText);
Uses Redis Stream as the message queue.
8. Return upload response
The frontend then checks the follow-up APIs for the async processing status.
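The ordering of the steps above can be sketched with in-memory stand-ins. This is an illustration under stated assumptions, not the project's code: UploadFlowSketch and its members are hypothetical, a HashMap replaces resumeRepository, a queue replaces the Redis Stream, the 10 MB limit is made up, and returning the existing record for a duplicate is an assumed behavior. Real parsing goes through Apache Tika, which is skipped here by treating the bytes as UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical in-memory walk-through of the upload steps; all names are illustrative. */
final class UploadFlowSketch {
    static final long MAX_FILE_SIZE = 10L * 1024 * 1024; // assumed limit; not stated in the note

    private final Map<String, Long> idByHash = new HashMap<>();   // stands in for resumeRepository
    private final Map<Long, String> textById = new HashMap<>();   // stands in for persisted metadata
    private final Queue<Long> analyzeQueue = new ArrayDeque<>();  // stands in for the Redis Stream
    private long nextId = 1;

    /** SHA-256 of the raw bytes as lowercase hex. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory on every Java platform
        }
    }

    /** Normalizes line endings, drops trailing spaces, and collapses runs of blank lines. */
    static String cleanText(String raw) {
        return raw.replace("\r\n", "\n")
                  .replaceAll("[ \\t]+\\n", "\n")
                  .replaceAll("\\n{3,}", "\n\n")
                  .trim();
    }

    /** Returns the resume id; an identical file yields the existing id and is not queued again. */
    long upload(byte[] file) {
        // Step 1: basic validation
        if (file == null || file.length == 0) throw new IllegalArgumentException("empty file");
        if (file.length > MAX_FILE_SIZE) throw new IllegalArgumentException("file too large");
        // Step 3: duplicate detection by content hash
        String hash = sha256Hex(file);
        Long existing = idByHash.get(hash);
        if (existing != null) return existing;
        // Step 4: parse and clean (Tika is replaced by a plain UTF-8 decode in this sketch)
        String text = cleanText(new String(file, StandardCharsets.UTF_8));
        // Steps 6 and 7: persist metadata, then enqueue the analysis task
        long id = nextId++;
        idByHash.put(hash, id);
        textById.put(id, text);
        analyzeQueue.add(id);
        return id;
    }

    int pendingTasks() {
        return analyzeQueue.size();
    }
}
```

Checking the hash before parsing means a duplicate upload never reaches the parsing, storage, or queueing steps, which is the point of placing duplicate detection at step 3.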
/api/resumes Get Resume List
Call chain:
historyService.getAllResumes();
resumePersistenceService.findAllResumes();
Current issue:
- User-level isolation is not implemented yet, so it currently returns the full list.
/api/resumes/{id}/detail Get Resume Detail
Call chain:
historyService.getResumeDetail(id);
resumePersistenceService.findById(id);
resumeRepository.findById(id);
/api/resumes/{id}/export Export Analysis Report as PDF
Call chain:
historyService.exportAnalysisPdf(id);
resumePersistenceService.findById(resumeId);
resumePersistenceService.getLatestAnalysisAsDTO(resumeId);
pdfExportService.exportResumeAnalysis(resume, analysisDTO);
/api/resumes/{id} Delete Resume
Call chain:
deleteService.deleteResume(id);
persistenceService.findById(id);
storageService.deleteResume(resume.getStorageKey());
interviewPersistenceService.deleteSessionsByResumeId(id);
persistenceService.deleteResume(id);
/api/resumes/{id}/reanalyze Reanalyze Resume
Rate limit strategy:
- Global limit: @RateLimit(dimension = RateLimit.Dimension.GLOBAL, count = 2)
- IP limit: @RateLimit(dimension = RateLimit.Dimension.IP, count = 2)
Call chain:
uploadService.reanalyze(id);
resumeRepository.findById(resumeId);
analyzeStreamProducer.sendAnalyzeTask(resumeId, resumeText);
The status is then updated and persisted during the processing step.
/api/resumes/health Health Check
return Result.success();
Used for service liveness checks.
Stability Design Points
- Async decoupling: upload and analysis are separated to improve responsiveness.
- Auto-retry: failed analysis retries up to 3 times to reduce transient failures.
- Hash-based dedup: a SHA-256 content hash avoids repeated analysis of identical files.
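The auto-retry point can be illustrated with a small bounded-retry helper. This is a sketch, not the module's actual mechanism, which the note does not show: RetrySketch and callWithRetry are hypothetical names, and "up to 3 times" is read here as one initial attempt plus at most three retries. A real consumer would probably also log each failure and wait between attempts.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper; the project's real retry logic is not shown in this note. */
final class RetrySketch {
    /**
     * Runs the task once and, on failure, retries it up to maxRetries more times.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T callWithRetry(Supplier<T> task, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if attempts remain
            }
        }
        throw last;
    }
}
```

With maxRetries set to 3, a task that fails twice and then succeeds returns normally on the third call, while a task that always fails is invoked four times before the last exception propagates.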
Summary
The Resume module already forms a complete loop: upload, parse, async analyze, export, and delete. The current implementation is stable enough for iterative feature expansion and production hardening.