Package org.apache.poi.extractor
Interface POITextExtractor
- All Superinterfaces: AutoCloseable, Closeable
- All Known Subinterfaces: POIOLE2TextExtractor
- All Known Implementing Classes: EventBasedExcelExtractor, ExcelExtractor, HPSFPropertiesExtractor, OldExcelExtractor, SlideShowExtractor
public interface POITextExtractor extends Closeable
Common parent for text extractors of POI documents. You will typically find the implementation of a given format's text extractor under org.apache.poi.[format].extractor.
- See Also: ExcelExtractor, org.apache.poi.hdgf.extractor.VisioTextExtractor, org.apache.poi.hwpf.extractor.WordExtractor
Method Summary
- default void close(): Frees the resources of the extractor as soon as it is no longer needed.
- Object getDocument()
- Closeable getFilesystem()
- POITextExtractor getMetadataTextExtractor(): Returns another text extractor that can output the textual content of the document metadata / properties, such as author and title.
- String getText(): Retrieves all the text from the document.
- boolean isCloseFilesystem()
- void setCloseFilesystem(boolean doCloseFilesystem)
Method Detail
-
getText
String getText()
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadoc of the specific project for details.
- Returns: all the text from the document
-
getMetadataTextExtractor
POITextExtractor getMetadataTextExtractor()
Returns another text extractor that can output the textual content of the document metadata / properties, such as author and title.
- Returns: the metadata text extractor
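The relationship between a document's extractor and its metadata extractor can be illustrated with a small, self-contained sketch. It does not use POI itself: `SketchExtractor`, `BodyExtractor`, and `PropertiesExtractor` are hypothetical stand-ins that mirror only the two methods discussed here, and the `key = value` output format is an assumption of this sketch, not something the interface promises.

```java
import java.io.Closeable;
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataExtractorSketch {

    /** Hypothetical stand-in mirroring getText() and getMetadataTextExtractor(). */
    interface SketchExtractor extends Closeable {
        String getText();
        SketchExtractor getMetadataTextExtractor();

        // Nothing to release in this sketch, so close() is a no-op.
        @Override
        default void close() { }
    }

    /** Hypothetical metadata extractor: renders properties as "key = value" lines. */
    static final class PropertiesExtractor implements SketchExtractor {
        private final Map<String, String> properties;

        PropertiesExtractor(Map<String, String> properties) {
            this.properties = properties;
        }

        @Override
        public String getText() {
            StringBuilder sb = new StringBuilder();
            for (Map.Entry<String, String> e : properties.entrySet()) {
                sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
            }
            return sb.toString();
        }

        @Override
        public SketchExtractor getMetadataTextExtractor() {
            // Assumption of this sketch: asking a metadata extractor for its
            // own metadata extractor is treated as a usage error.
            throw new IllegalStateException("already a metadata extractor");
        }
    }

    /** Hypothetical document extractor: body text plus a properties map. */
    static final class BodyExtractor implements SketchExtractor {
        private final String body;
        private final Map<String, String> properties;

        BodyExtractor(String body, Map<String, String> properties) {
            this.body = body;
            this.properties = properties;
        }

        @Override
        public String getText() {
            return body;
        }

        @Override
        public SketchExtractor getMetadataTextExtractor() {
            return new PropertiesExtractor(properties);
        }
    }

    /** Returns the metadata text followed by the body text. */
    static String describe(String body, String author, String title) {
        Map<String, String> props = new LinkedHashMap<>(); // keeps insertion order
        props.put("Author", author);
        props.put("Title", title);
        try (SketchExtractor extractor = new BodyExtractor(body, props);
             SketchExtractor metadata = extractor.getMetadataTextExtractor()) {
            return metadata.getText() + extractor.getText();
        }
    }

    public static void main(String[] args) {
        System.out.print(describe("Body text", "Ada", "Notes"));
    }
}
```

With a real POI extractor the same pattern would apply: call `getMetadataTextExtractor()` on the document's extractor, then `getText()` on the result.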
-
setCloseFilesystem
void setCloseFilesystem(boolean doCloseFilesystem)
- Parameters: doCloseFilesystem - true (default) if the underlying resources/filesystem should be closed on close()
-
isCloseFilesystem
boolean isCloseFilesystem()
- Returns: true if the resources/filesystem should be closed on close()
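A minimal sketch of how the close-filesystem flag could interact with close() follows. It is self-contained and does not use POI: `SketchExtractor`, `SimpleExtractor`, and `TrackingResource` are hypothetical stand-ins, and the body of the default close() is an assumption based on the descriptions above (close the underlying resource only when the flag is set), not a copy of the library's code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CloseFilesystemSketch {

    /** Hypothetical stand-in for the three methods involved in closing. */
    interface SketchExtractor extends Closeable {
        Closeable getFilesystem();
        boolean isCloseFilesystem();
        void setCloseFilesystem(boolean doCloseFilesystem);

        // Assumed behaviour: close the underlying resource only when the
        // flag is set and a resource is actually present.
        @Override
        default void close() throws IOException {
            Closeable fs = getFilesystem();
            if (isCloseFilesystem() && fs != null) {
                fs.close();
            }
        }
    }

    /** Hypothetical resource that records whether it has been closed. */
    static final class TrackingResource implements Closeable {
        boolean closed;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Hypothetical extractor holding one resource; the flag defaults to true. */
    static final class SimpleExtractor implements SketchExtractor {
        private final Closeable filesystem;
        private boolean closeFilesystem = true;

        SimpleExtractor(Closeable filesystem) {
            this.filesystem = filesystem;
        }

        @Override
        public Closeable getFilesystem() {
            return filesystem;
        }

        @Override
        public boolean isCloseFilesystem() {
            return closeFilesystem;
        }

        @Override
        public void setCloseFilesystem(boolean doCloseFilesystem) {
            this.closeFilesystem = doCloseFilesystem;
        }
    }

    /** Closes an extractor over a fresh resource; reports whether the resource was closed too. */
    static boolean resourceClosedAfterClose(boolean doCloseFilesystem) {
        TrackingResource resource = new TrackingResource();
        SimpleExtractor extractor = new SimpleExtractor(resource);
        extractor.setCloseFilesystem(doCloseFilesystem);
        try {
            extractor.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return resource.closed;
    }

    public static void main(String[] args) {
        System.out.println(resourceClosedAfterClose(true));
        System.out.println(resourceClosedAfterClose(false));
    }
}
```

Setting the flag to false is useful when the caller opened the underlying resource and wants to keep using it after the extractor has been closed.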
-
getFilesystem
Closeable getFilesystem()
- Returns: the underlying resources/filesystem
-
close
default void close() throws IOException
Frees the resources of the extractor as soon as it is no longer needed. This may include closing open file handles and freeing memory. The extractor cannot be used after close() has been called.
- Specified by: close in interface AutoCloseable
- Specified by: close in interface Closeable
- Throws: IOException
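Because the interface extends Closeable, an extractor can be managed with try-with-resources. The sketch below shows that pattern with a hypothetical in-memory stand-in (`InMemoryExtractor`) instead of a real POI extractor. Throwing `IllegalStateException` on use after close() is an assumption of this sketch; the documentation above only says that the extractor cannot be used afterwards.

```java
import java.io.Closeable;

public class TryWithResourcesSketch {

    /** Hypothetical stand-in exposing only getText() and close(). */
    interface SketchExtractor extends Closeable {
        String getText();
    }

    /** Hypothetical extractor over an in-memory string. */
    static final class InMemoryExtractor implements SketchExtractor {
        private String text;
        private boolean closed;

        InMemoryExtractor(String text) {
            this.text = text;
        }

        @Override
        public String getText() {
            // Assumption: this sketch fails fast once closed.
            if (closed) {
                throw new IllegalStateException("extractor is closed");
            }
            return text;
        }

        @Override
        public void close() {
            closed = true;
            text = null; // release the held content
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Reads the text and closes the extractor, even if getText() throws. */
    static String extract(InMemoryExtractor extractor) {
        try (InMemoryExtractor e = extractor) {
            return e.getText();
        }
    }

    public static void main(String[] args) {
        InMemoryExtractor extractor = new InMemoryExtractor("Hello, POI");
        System.out.println(extract(extractor));
        System.out.println(extractor.isClosed());
    }
}
```

Retrieve everything you need from the extractor inside the try block, since it should be treated as unusable once the block exits.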
-
getDocument
Object getDocument()
- Returns: the processed document