Package org.apache.poi.hssf.extractor
Class EventBasedExcelExtractor
- java.lang.Object
-
- org.apache.poi.hssf.extractor.EventBasedExcelExtractor
-
- All Implemented Interfaces:
Closeable, AutoCloseable, POIOLE2TextExtractor, POITextExtractor, ExcelExtractor
public class EventBasedExcelExtractor extends Object implements POIOLE2TextExtractor, ExcelExtractor
A text extractor for Excel files that is based on the HSSF EventUserModel API. It typically uses less memory than ExcelExtractor, but may not provide the same richness of formatting. It returns the textual content of the file, suitable for indexing by something like Lucene, but not really intended for display to the user. To turn an Excel file into CSV or a similar format, see the XLS2CSVmra example.
- See Also:
- XLS2CSVmra
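A minimal usage sketch of the extraction described above, assuming Apache POI is on the classpath. The class name `EventExtractorSketch` and its helper methods are hypothetical; the sample workbook is built in memory with `HSSFWorkbook` only so that the example needs no input file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.extractor.EventBasedExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

class EventExtractorSketch {

    // Hypothetical helper: builds a one-cell .xls workbook in memory.
    static byte[] sampleXls() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            wb.createSheet("Data").createRow(0).createCell(0).setCellValue("hello");
            wb.write(out);
        }
        return out.toByteArray();
    }

    // Extracts the text of an .xls file held in memory.
    // Closing the extractor also closes the filesystem it was given,
    // because isCloseFilesystem() is true by default.
    static String extractText(byte[] xls) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xls));
        try (EventBasedExcelExtractor extractor = new EventBasedExcelExtractor(fs)) {
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(extractText(sampleXls()));
    }
}
```

The returned string is meant for indexing rather than display, so a caller would typically hand it straight to a search indexer.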
-
-
Constructor Summary
Constructors
- EventBasedExcelExtractor(DirectoryNode dir)
- EventBasedExcelExtractor(POIFSFileSystem fs)
-
Method Summary
Modifier and Type, Method, Description:
- void close(): Frees the resources of the extractor as soon as it is no longer needed.
- DocumentSummaryInformation getDocSummaryInformation(): Would return the document information metadata for the document, if this extractor supported it.
- POIDocument getDocument(): Returns the underlying POIDocument.
- Closeable getFilesystem()
- DirectoryEntry getRoot(): Returns the underlying DirectoryEntry of this document.
- SummaryInformation getSummaryInformation(): Would return the summary information metadata for the document, if this extractor supported it.
- String getText(): Retrieves the text contents of the file.
- boolean isCloseFilesystem()
- void setCloseFilesystem(boolean doCloseFilesystem)
- void setFormulasNotResults(boolean formulasNotResults): Whether to return the formula itself rather than the result it produces. Default is false.
- void setIncludeCellComments(boolean includeComments): Would control the inclusion of cell comments from the document, if this extractor supported it.
- void setIncludeHeadersFooters(boolean includeHeadersFooters): Would control the inclusion of headers and footers from the document, if this extractor supported it.
- void setIncludeSheetNames(boolean includeSheetNames): Whether sheet names are included. Default is true.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.poi.extractor.POIOLE2TextExtractor
getMetadataTextExtractor
-
-
-
-
Constructor Detail
-
EventBasedExcelExtractor
public EventBasedExcelExtractor(DirectoryNode dir)
-
EventBasedExcelExtractor
public EventBasedExcelExtractor(POIFSFileSystem fs)
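The two constructors differ only in what they accept. The sketch below uses the `DirectoryNode` form by passing the root of a `POIFSFileSystem`, assuming Apache POI is on the classpath. The class name `DirectoryNodeSketch` and its helpers are hypothetical, and the caller keeps ownership of the filesystem by closing it in its own try-with-resources block.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.extractor.EventBasedExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.DirectoryNode;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

class DirectoryNodeSketch {

    // Hypothetical helper: builds a one-cell .xls workbook in memory.
    static byte[] sampleXls() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            wb.createSheet("Data").createRow(0).createCell(0).setCellValue("hello");
            wb.write(out);
        }
        return out.toByteArray();
    }

    // Extracts text starting from a directory node instead of a whole filesystem.
    static String extractFromRoot(byte[] xls) throws IOException {
        // The caller opens and closes the filesystem itself.
        try (POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xls))) {
            DirectoryNode root = fs.getRoot();
            try (EventBasedExcelExtractor extractor = new EventBasedExcelExtractor(root)) {
                return extractor.getText();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(extractFromRoot(sampleXls()));
    }
}
```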
-
-
Method Detail
-
getDocSummaryInformation
public DocumentSummaryInformation getDocSummaryInformation()
Would return the document information metadata for the document, if this extractor supported it.
- Specified by:
getDocSummaryInformation in interface POIOLE2TextExtractor
- Returns:
- The Document Summary Information or null if it could not be read for this document.
-
getSummaryInformation
public SummaryInformation getSummaryInformation()
Would return the summary information metadata for the document, if this extractor supported it.
- Specified by:
getSummaryInformation in interface POIOLE2TextExtractor
- Returns:
- The Summary information for the document or null if it could not be read for this document.
-
setIncludeCellComments
public void setIncludeCellComments(boolean includeComments)
Would control the inclusion of cell comments from the document, if this extractor supported it.
- Specified by:
setIncludeCellComments in interface ExcelExtractor
- Parameters:
includeComments - true if cell comments should be included
-
setIncludeHeadersFooters
public void setIncludeHeadersFooters(boolean includeHeadersFooters)
Would control the inclusion of headers and footers from the document, if this extractor supported it.
- Specified by:
setIncludeHeadersFooters in interface ExcelExtractor
- Parameters:
includeHeadersFooters - true if headers and footers should be included
-
setIncludeSheetNames
public void setIncludeSheetNames(boolean includeSheetNames)
Whether sheet names are included in the extracted text. Default is true.
- Specified by:
setIncludeSheetNames in interface ExcelExtractor
- Parameters:
includeSheetNames - true if the sheet names should be included
-
setFormulasNotResults
public void setFormulasNotResults(boolean formulasNotResults)
Whether to return the formula itself rather than the result it produces. Default is false.
- Specified by:
setFormulasNotResults in interface ExcelExtractor
- Parameters:
formulasNotResults - true if the formula itself is returned
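The effect of the two supported options, `setIncludeSheetNames` and `setFormulasNotResults`, can be sketched as follows, assuming Apache POI is on the classpath. The class name `FormulaTextSketch` and its helpers are hypothetical. The sample workbook is evaluated before saving so that the formula cell carries a cached result for the extractor to report.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.extractor.EventBasedExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.ss.usermodel.Row;

class FormulaTextSketch {

    // Hypothetical helper: a sheet named "Calc" with a number and a formula.
    static byte[] sampleXls() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            Row row = wb.createSheet("Calc").createRow(0);
            row.createCell(0).setCellValue(2);
            row.createCell(1).setCellFormula("A1*21");
            // Store a cached result in the file for the formula cell.
            wb.getCreationHelper().createFormulaEvaluator().evaluateAll();
            wb.write(out);
        }
        return out.toByteArray();
    }

    // Extracts text without sheet names, as formulas or as cached results.
    static String extract(byte[] xls, boolean formulasNotResults) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xls));
        try (EventBasedExcelExtractor extractor = new EventBasedExcelExtractor(fs)) {
            extractor.setIncludeSheetNames(false);               // default is true
            extractor.setFormulasNotResults(formulasNotResults); // default is false
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xls = sampleXls();
        System.out.print(extract(xls, true));   // formula text
        System.out.print(extract(xls, false));  // cached results
    }
}
```

Both options must be set before `getText()` is called, since the text is produced during that call.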
-
getText
public String getText()
Retrieves the text contents of the file.
- Specified by:
getText in interface ExcelExtractor
- Specified by:
getText in interface POITextExtractor
- Returns:
- All the text from the document
-
setCloseFilesystem
public void setCloseFilesystem(boolean doCloseFilesystem)
- Specified by:
setCloseFilesystem in interface POITextExtractor
- Parameters:
doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()
-
isCloseFilesystem
public boolean isCloseFilesystem()
- Specified by:
isCloseFilesystem in interface POITextExtractor
- Returns:
true, if resources/filesystem should be closed on POITextExtractor.close()
-
getFilesystem
public Closeable getFilesystem()
- Specified by:
getFilesystem in interface POITextExtractor
- Returns:
- The underlying resources/filesystem
-
getDocument
public POIDocument getDocument()
Description copied from interface: POIOLE2TextExtractor
Return the underlying POIDocument
- Specified by:
getDocument in interface POIOLE2TextExtractor
- Specified by:
getDocument in interface POITextExtractor
- Returns:
- the underlying POIDocument
-
getRoot
public DirectoryEntry getRoot()
Description copied from interface: POIOLE2TextExtractor
Return the underlying DirectoryEntry of this document.
- Specified by:
getRoot in interface POIOLE2TextExtractor
- Returns:
- the DirectoryEntry that is associated with the POIDocument of this extractor.
-
close
public void close() throws IOException
Description copied from interface: POITextExtractor
Frees the resources of the extractor as soon as it is no longer needed. This may include closing open file handles and freeing memory. The extractor cannot be used after close has been called.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Specified by:
close in interface POITextExtractor
- Throws:
IOException
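How `close()` interacts with `setCloseFilesystem` can be sketched as follows, assuming Apache POI is on the classpath. The class name `CloseOwnershipSketch` and its helpers are hypothetical. The extractor is told not to close the filesystem, so the caller can keep using the same `POIFSFileSystem` afterwards and closes it itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.extractor.EventBasedExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

class CloseOwnershipSketch {

    // Hypothetical helper: an in-memory filesystem holding a one-cell workbook.
    static POIFSFileSystem sampleFilesystem() throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (HSSFWorkbook wb = new HSSFWorkbook()) {
            wb.createSheet("Data").createRow(0).createCell(0).setCellValue("hello");
            wb.write(out);
        }
        return new POIFSFileSystem(new ByteArrayInputStream(out.toByteArray()));
    }

    // Extracts text but leaves the filesystem open for the caller.
    static String extractButKeepOpen(POIFSFileSystem fs) throws IOException {
        try (EventBasedExcelExtractor extractor = new EventBasedExcelExtractor(fs)) {
            extractor.setCloseFilesystem(false); // default is true
            return extractor.getText();
        } // extractor.close() runs here; fs is not closed by it
    }

    public static void main(String[] args) throws IOException {
        try (POIFSFileSystem fs = sampleFilesystem()) {
            System.out.print(extractButKeepOpen(fs));
            // The caller still owns fs and can keep reading from it.
            System.out.println(fs.getRoot().hasEntry("Workbook"));
        }
    }
}
```

Leaving the default in place is simpler when the extractor is the only user of the filesystem; turning it off suits code that shares one filesystem between several readers.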
-
-