HTML Office Library: bridge between desktop and web
The first Delphi library for reading the most common office formats (including PDF) and converting them to HTML on the fly.
The HTML Office Library is designed to work with the most popular document formats and to convert documents from any source (file, database, etc.) to HTML.
A converted document contains only plain HTML/CSS/SVG and can be displayed with the HTML Component Library or in a browser.
The library provides uniform access to an entire document and its parts, to document resources (fonts, images, etc.), and to document properties (title, table of contents, etc.).
The HTML Office Library does not depend on any external components (DLLs, OLE, ActiveX, etc.) and is cross-platform. It is written entirely in Delphi and comes with full source code.
The following document formats are supported:
- Rich Text Format (RTF)
- MS Word 6-2007 binary format (DOC)
- MS Word XML document (DOCX)
- MS PowerPoint binary format (PPT)
- MS PowerPoint XML format (PPTX)
- MS Excel binary format (XLS)
- MS Excel XML format (XLSX)
- MS Excel XML binary format (XLSB)
- Adobe PDF format (PDF)
- Supercalc format (SXC)
- EPUB (electronic books)
- FB2 (electronic books)
- Markdown
- Outlook Message (MSG)
- MIME message (EML)
- Outlook databases (OST, PST)
- The Bat! database (TBB)
- RAR archives
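An application that accepts files from arbitrary sources may want to check the extension before handing a file to a converter. The lookup below is a minimal sketch in Python built only from the list above. It is not part of the library's API: the names `SUPPORTED_FORMATS` and `describe_format` are hypothetical, and the `.md` extension for Markdown is an assumption, since the list does not name one.

```python
from pathlib import Path

# Extension -> format name, copied from the supported-formats list.
# ".md" for Markdown is an assumption; the list gives no extension for it.
SUPPORTED_FORMATS = {
    ".rtf": "Rich Text Format",
    ".doc": "MS Word 6-2007 binary format",
    ".docx": "MS Word XML document",
    ".ppt": "MS PowerPoint binary format",
    ".pptx": "MS PowerPoint XML format",
    ".xls": "MS Excel binary format",
    ".xlsx": "MS Excel XML format",
    ".xlsb": "MS Excel XML binary format",
    ".pdf": "Adobe PDF format",
    ".sxc": "Supercalc format",
    ".epub": "EPUB",
    ".fb2": "FB2",
    ".md": "Markdown",
    ".msg": "Outlook Message",
    ".eml": "MIME message",
    ".ost": "Outlook database",
    ".pst": "Outlook database",
    ".tbb": "The Bat! database",
    ".rar": "RAR archive",
}


def describe_format(filename: str):
    """Return the format name for a listed extension, or None if it is not listed."""
    return SUPPORTED_FORMATS.get(Path(filename).suffix.lower())


if __name__ == "__main__":
    for name in ("Report.DOCX", "mail.pst", "notes.txt"):
        print(name, "->", describe_format(name))
```

The comparison is case-insensitive because file names on Windows frequently carry upper-case extensions.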
Besides the document conversion classes, the library also contains the following:
- EMF/WMF to SVG conversion
- TTF to WOFF conversion
- TTF normalization
- TTF to SVG conversion
- CFF to TTF conversion
- Adobe PostScript to TTF conversion
Supported Delphi versions: Delphi 7 through Delphi 11.1.
Supported platforms: Windows 32/64 (VCL and FMX), macOS, Linux, Android, iOS.
For Delphi 7 through 2007, Unicode is fully supported using WideStrings.
How fast is it? Some measurements:
Document | HTML with embedded images (time, output size) | HTML with referenced images (time, output size) | Text (time, output size) |
---|---|---|---|
DOC, 838 pages, 17 MB | 437 ms, 20 MB | 290 ms, 3.4 MB | 40 ms, 1.6 MB |
DOCX, 41 pages, 1 MB | 40 ms, 1.6 MB | 40 ms, 306 KB | 10 ms, 76 KB |
PDF, 182 pages, 31 MB | 3500 ms, 75 MB | 312 ms, 2.7 MB | 200 ms, 380 KB |
PPT, 16 slides, 4.8 MB | 218 ms, 6.8 MB | 140 ms, 104 KB | 170 ms, 98 KB |
XLS, 9,000 rows, 7 columns, 2 MB | 94 ms, 3.5 MB | 62 ms, 1.2 MB | |
XLSX, 115,000 rows, 95 columns, 44 MB (320 MB uncompressed) | 9200 ms, 275 MB | 6900 ms, 40 MB | |
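To compare the rows on a common scale, the timings can be turned into input throughput. The Python sketch below uses only the source sizes and the embedded-images timings from the table above. It assumes the sizes are megabytes and that each time covers the whole conversion. The function names are illustrative.

```python
# (format, input size in MB, conversion time in ms), copied from the
# "embedded images" column of the measurements table.
MEASUREMENTS = [
    ("DOC", 17.0, 437),
    ("DOCX", 1.0, 40),
    ("PDF", 31.0, 3500),
    ("PPT", 4.8, 218),
    ("XLS", 2.0, 94),
    ("XLSX", 44.0, 9200),
]


def throughput_mb_per_s(size_mb: float, time_ms: float) -> float:
    """Input megabytes processed per second of conversion time."""
    return size_mb / (time_ms / 1000.0)


def throughput_table(rows):
    """Map each format label to its throughput, rounded to one decimal place."""
    return {label: round(throughput_mb_per_s(size, ms), 1) for label, size, ms in rows}


if __name__ == "__main__":
    for label, rate in throughput_table(MEASUREMENTS).items():
        print(f"{label:5s} {rate:6.1f} MB/s")
```

On these figures the DOC sample comes out at roughly 39 MB/s and the PDF sample at roughly 9 MB/s. The documents differ in content, so the numbers describe these particular files rather than the formats in general.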
Search engine test
Sample database | 50,000 documents |
---|---|
Indexing time | 10 min |
Index size | 120 MB |
Search time | 50-300 ms (depending on the number of documents found) |
Dictionary size | 230,000 words |
Total words in documents | 21 million |
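The search engine figures imply some per-second and per-document rates, which the short Python calculation below derives. It uses only the numbers from the table and assumes that the index size is in megabytes, with 1 MB taken as 1000 KB. The names are illustrative.

```python
# Figures copied from the search engine test table.
DOCUMENTS = 50_000
INDEXING_SECONDS = 10 * 60      # 10 minutes
TOTAL_WORDS = 21_000_000
INDEX_SIZE_MB = 120             # assumed to be megabytes


def indexing_rates():
    """Derive average rates from the published totals."""
    return {
        "docs_per_second": DOCUMENTS / INDEXING_SECONDS,
        "words_per_second": TOTAL_WORDS / INDEXING_SECONDS,
        "words_per_document": TOTAL_WORDS / DOCUMENTS,
        # 1 MB is taken as 1000 KB here.
        "index_kb_per_document": INDEX_SIZE_MB * 1000 / DOCUMENTS,
    }


if __name__ == "__main__":
    for name, value in indexing_rates().items():
        print(f"{name}: {value:.1f}")
```

That works out to about 83 documents and 35,000 words indexed per second, an average of 420 words per document, and about 2.4 KB of index per document.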
There are two compiled demos available:
- Simple document viewer: lets you view any document on the hard drive, using a file tree on the left and an HtPanel on the right. To view the final HTML, press the View in browser button. No installation required.
https://delphihtmlcomponents.com/FileBrowser.zip
- Search Engine demo: creates a full-text search index for documents located in selected folders and finds any document from the application or from the web. No installation required.
https://delphihtmlcomponents.com/SearchEngine.zip
How to use the Search Engine demo:
- Run the application (SearchEngine.exe).
- Click Add folder and select a folder containing office documents or Outlook PST/OST databases.
- Click Start indexing and wait until it completes.
- Search for documents in one of the following ways: a) go to the Search tab and enter a search query (any words), or b) click the Web interface icon and enter a search query.
The source code of both applications is included.