Description
Extends the document processing service to support additional file types beyond the original PDF, DOCX, and PPTX.
Resolves #3
Changes
- Added support for:
  - Microsoft Office: DOC, XLSX, XLS
  - OpenDocument: ODT, ODS, ODP
  - Text: TXT, RTF
  - E-books: EPUB
- Implemented text extraction handlers for each format
- Added required dependencies
- Updated file type validation
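For reviewers, the validation and per-format handler dispatch listed above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `SUPPORTED_EXTENSIONS`, `HANDLERS`, `register`, `is_supported`, and `extract_text` are hypothetical names, and only three handlers are shown. The third-party imports are deferred into the handlers so the module loads even when a library is absent.

```python
from pathlib import Path
from typing import Callable, Dict

# Extensions named in this PR, plus the three original formats.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".pptx",   # original
    ".doc", ".xlsx", ".xls",    # Microsoft Office
    ".odt", ".ods", ".odp",     # OpenDocument
    ".txt", ".rtf",             # Text
    ".epub",                    # E-books
}

# Hypothetical registry mapping an extension to its extraction function.
HANDLERS: Dict[str, Callable[[Path], str]] = {}


def register(*extensions: str):
    """Decorator that registers one handler for one or more extensions."""
    def decorator(func: Callable[[Path], str]) -> Callable[[Path], str]:
        for ext in extensions:
            HANDLERS[ext] = func
        return func
    return decorator


def is_supported(filename: str) -> bool:
    """File type validation: case-insensitive check on the extension."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


@register(".txt")
def extract_txt(path: Path) -> str:
    # Assumes UTF-8; undecodable bytes are replaced rather than raising.
    return path.read_text(encoding="utf-8", errors="replace")


@register(".rtf")
def extract_rtf(path: Path) -> str:
    from striprtf.striprtf import rtf_to_text  # imported lazily
    return rtf_to_text(path.read_text(encoding="utf-8", errors="replace"))


@register(".xlsx")
def extract_xlsx(path: Path) -> str:
    from openpyxl import load_workbook  # imported lazily
    workbook = load_workbook(str(path), read_only=True, data_only=True)
    lines = []
    for sheet in workbook.worksheets:
        for row in sheet.iter_rows(values_only=True):
            cells = [str(cell) for cell in row if cell is not None]
            if cells:
                lines.append("\t".join(cells))
    workbook.close()
    return "\n".join(lines)


def extract_text(filename: str) -> str:
    """Validate the extension, then dispatch to the registered handler."""
    path = Path(filename)
    ext = path.suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or '(none)'}")
    handler = HANDLERS.get(ext)
    if handler is None:
        raise NotImplementedError(f"No handler registered in this sketch for {ext}")
    return handler(path)
```

With a registry like this, adding a format means registering one more function, and validation stays a single set lookup.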
Dependencies Added
- openpyxl, xlrd (Excel)
- odfpy (OpenDocument)
- striprtf (RTF)
- ebooklib, beautifulsoup4 (EPUB)
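A quick way to confirm the new dependencies are installed in a given environment is sketched below. It is an illustration only: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, not part of this PR. Note that two import names differ from the package names (odfpy is imported as `odf`, beautifulsoup4 as `bs4`).

```python
from importlib.util import find_spec
from typing import Dict, List, Optional

# Import names for the packages listed above, grouped by format.
REQUIRED_MODULES: Dict[str, List[str]] = {
    "Excel": ["openpyxl", "xlrd"],
    "OpenDocument": ["odf"],        # installed as odfpy
    "RTF": ["striprtf"],
    "EPUB": ["ebooklib", "bs4"],    # bs4 is installed as beautifulsoup4
}


def missing_modules(
    required: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, List[str]]:
    """Return, per format group, the modules that cannot be found."""
    if required is None:
        required = REQUIRED_MODULES
    missing: Dict[str, List[str]] = {}
    for group, modules in required.items():
        absent = [name for name in modules if find_spec(name) is None]
        if absent:
            missing[group] = absent
    return missing


if __name__ == "__main__":
    # An empty dict means every listed module is importable.
    print(missing_modules())
```

Keep in mind that xlrd 2.0 and later reads only the legacy .xls format, which is why openpyxl is listed alongside it for .xlsx.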
Demo -
https://drive.google.com/file/d/1lXL6kiJuQFdLX2UrFl48x2zmYYzLza5z/view?usp=sharing
/claim #3