Package ch.autumo.commons.documents
Interface DocumentExtractor
public interface DocumentExtractor
Extract a document.
Field Summary
Fields: DOCUMENTS_SOURCE_FIELDS, DOCUMENTS_ALL_FIELDS
Method Summary
Modifier and Type: org.apache.lucene.document.Document
Method: extract
Description: Extract document to the Lucene fields DOCUMENTS_SOURCE_FIELDS.
Field Details
DOCUMENTS_SOURCE_FIELDS
Source fields of a document in the 'autumo Documents' module.
DOCUMENTS_ALL_FIELDS
All fields of a document in the 'autumo Documents' module: the file name and file extension, followed by the source fields.
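
The relationship between the two constants can be illustrated with a minimal sketch. It is not the library's code: the class name, the use of String arrays, and every field name below are assumptions made for illustration, since this page shows neither the constants' types nor their values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in. The real constants live in DocumentExtractor and
// their actual types and values are not shown in this documentation.
final class FieldLayoutSketch {

    // Placeholder names, NOT the library's real source field names.
    static final String[] SOURCE_FIELDS = { "title", "content" };

    // "All fields": file name and file extension first, then the source fields.
    static final String[] ALL_FIELDS = prepend(SOURCE_FIELDS, "filename", "extension");

    // Returns a new array holding the head entries followed by the tail entries.
    static String[] prepend(String[] tail, String... head) {
        List<String> all = new ArrayList<>(Arrays.asList(head));
        all.addAll(Arrays.asList(tail));
        return all.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", ALL_FIELDS));
    }
}
```

Deriving the longer list from the shorter one keeps the two in sync when a source field is added; whether the library does this or declares both lists literally is not stated here.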
Method Details
extract
Extract document to the Lucene fields DOCUMENTS_SOURCE_FIELDS.
Parameters:
file - file
Returns:
extracted document
Throws:
Exception - if extraction fails
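
The contract described above (take a file, return its extracted fields, throw if extraction fails) can be sketched roughly as follows. To stay runnable without Lucene on the classpath, the sketch uses a simplified stand-in interface that returns a Map rather than an org.apache.lucene.document.Document. The interface and class names, the java.io.File parameter type, and the "content" field name are all assumptions, not part of the documented API.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for DocumentExtractor. The real method returns an
// org.apache.lucene.document.Document; a Map is used here so the sketch
// runs without Lucene. The java.io.File parameter type is an assumption.
interface ExtractorSketch {
    Map<String, String> extract(File file) throws Exception;
}

// Hypothetical plain-text extractor; "content" is a placeholder field name.
final class PlainTextExtractorSketch implements ExtractorSketch {

    @Override
    public Map<String, String> extract(File file) throws Exception {
        if (file == null || !file.isFile()) {
            // Mirrors the documented contract: signal failure with an exception.
            throw new IOException("Not a readable file: " + file);
        }
        // Populate only source fields; here a single placeholder field.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("content",
                new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8));
        return fields;
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("sketch", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new PlainTextExtractorSketch().extract(tmp));
    }
}
```

A real implementation would build a Lucene Document and add one field per entry in DOCUMENTS_SOURCE_FIELDS instead of filling a Map.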