Package org.apache.poi.extractor
Interface ExtractorProvider
- All Known Implementing Classes:
MainExtractorFactory
public interface ExtractorProvider
Method Summary
Modifier and Type, Method: Description
- boolean accepts(FileMagic fm)
- POITextExtractor create(File file, String password): Create Extractor via file
- POITextExtractor create(InputStream inputStream, String password): Create Extractor via InputStream
- POITextExtractor create(DirectoryNode poifsDir, String password): Create Extractor from POIFS node
- default void identifyEmbeddedResources(POIOLE2TextExtractor ext, List<Entry> dirs, List<InputStream> nonPOIFS): Identifies the embedded documents in the file (if there are any) and adds them to the supplied lists.
Method Detail
-
accepts
boolean accepts(FileMagic fm)
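This page gives no description for accepts. One plausible reading, stated here as an assumption rather than documented behavior, is that a factory asks each registered provider whether it accepts the detected file magic and delegates to the first one that does. The sketch below illustrates that selection pattern with self-contained stand-in types: Magic, Provider, provider and firstAccepting are hypothetical names, not POI classes, so the example runs without POI on the classpath.

```java
import java.util.List;
import java.util.Optional;

final class AcceptsDispatchSketch {

    // Hypothetical stand-in for FileMagic; not the real enum.
    enum Magic { OLE2, OOXML, UNKNOWN }

    // Hypothetical stand-in that mirrors only the accepts(...) method of ExtractorProvider.
    interface Provider {
        boolean accepts(Magic fm);
        String name();
    }

    // Returns the first provider that accepts the given magic, or empty if none does.
    static Optional<Provider> firstAccepting(List<Provider> providers, Magic fm) {
        return providers.stream().filter(p -> p.accepts(fm)).findFirst();
    }

    // Builds a demo provider that accepts exactly one magic value.
    static Provider provider(String name, Magic handled) {
        return new Provider() {
            @Override public boolean accepts(Magic fm) { return fm == handled; }
            @Override public String name() { return name; }
        };
    }

    public static void main(String[] args) {
        List<Provider> providers = List.of(
                provider("ole2", Magic.OLE2),
                provider("ooxml", Magic.OOXML));
        System.out.println(firstAccepting(providers, Magic.OOXML).map(Provider::name).orElse("none"));
        System.out.println(firstAccepting(providers, Magic.UNKNOWN).map(Provider::name).orElse("none"));
    }
}
```

Returning an Optional leaves the "no provider accepts this format" case to the caller instead of hiding it behind null.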
-
create
POITextExtractor create(File file, String password) throws IOException
Create Extractor via file
- Parameters:
file - the file
password - the password or null if not encrypted
- Returns:
the extractor
- Throws:
IOException - if file can't be read or parsed
-
create
POITextExtractor create(InputStream inputStream, String password) throws IOException
Create Extractor via InputStream
- Parameters:
inputStream - the stream
password - the password or null if not encrypted
- Returns:
the extractor
- Throws:
IOException - if stream can't be read or parsed
-
create
POITextExtractor create(DirectoryNode poifsDir, String password) throws IOException
Create Extractor from POIFS node
- Parameters:
poifsDir - the node
password - the password or null if not encrypted
- Returns:
the extractor
- Throws:
IOException - if node can't be parsed
IllegalStateException - if processing fails for some other reason, e.g. missing JCE Unlimited Strength Jurisdiction Policy files while handling encrypted files.
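Because this overload documents two distinct failure modes, a caller may want to handle them separately. The following is a minimal sketch of that handling, not POI code: NodeProvider, demoProvider and describe are hypothetical, a String stands in for the DirectoryNode and for the returned extractor, and the rules that trigger each exception in the demo are arbitrary.

```java
import java.io.IOException;

final class CreateErrorHandlingSketch {

    // Hypothetical stand-in mirroring the shape of create(DirectoryNode, String).
    interface NodeProvider {
        String create(String node, String password) throws IOException;
    }

    // Demo provider with arbitrary rules, used only to trigger each outcome.
    static NodeProvider demoProvider() {
        return (node, password) -> {
            if ("bad".equals(node)) {
                throw new IOException("node can't be parsed");
            }
            if ("nojce".equals(node)) {
                throw new IllegalStateException("processing failed");
            }
            return "text of " + node;
        };
    }

    // Calls create(...) and maps the two documented failure modes to distinct messages.
    static String describe(NodeProvider provider, String node, String password) {
        try {
            return "ok: " + provider.create(node, password);
        } catch (IOException e) {
            return "unparseable: " + e.getMessage();
        } catch (IllegalStateException e) {
            return "other failure: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        NodeProvider provider = demoProvider();
        // null is passed as the password, as documented for unencrypted input.
        System.out.println(describe(provider, "root", null));
        System.out.println(describe(provider, "bad", null));
        System.out.println(describe(provider, "nojce", "secret"));
    }
}
```

IllegalStateException is unchecked, so the compiler will not remind callers to catch it; the explicit catch block is a deliberate choice here.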
-
identifyEmbeddedResources
default void identifyEmbeddedResources(POIOLE2TextExtractor ext, List<Entry> dirs, List<InputStream> nonPOIFS) throws IOException
Identifies the embedded documents in the file (if there are any). The method has no return value; results are delivered through the two supplied lists. If there are no embedded documents, nothing is added to them.
- Parameters:
ext - the extractor holding the directory to start parsing
dirs - a list to be filled with directory references holding embedded documents
nonPOIFS - a list to be filled with streams which aren't based on POIFS entries
- Throws:
IOException - when the format-specific extraction fails because of invalid entries
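Since the method is declared void, the caller supplies the two lists and reads them after the call. The sketch below shows only that calling pattern under simplifying assumptions: identify is a hypothetical stand-in, Strings replace Entry and InputStream, and the rule that names ending in "/" count as directories is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

final class EmbeddedResourcesSketch {

    // Hypothetical stand-in: sorts embedded item names into the two caller-supplied lists.
    static void identify(List<String> embedded, List<String> dirs, List<String> nonPOIFS) {
        for (String name : embedded) {
            if (name.endsWith("/")) {
                dirs.add(name);      // treated as a directory reference (arbitrary rule)
            } else {
                nonPOIFS.add(name);  // treated as a stream not based on a POIFS entry
            }
        }
    }

    public static void main(String[] args) {
        // The caller owns both lists and passes them in empty.
        List<String> dirs = new ArrayList<>();
        List<String> nonPOIFS = new ArrayList<>();
        identify(List.of("doc1/", "image.bin", "doc2/"), dirs, nonPOIFS);
        System.out.println("directories: " + dirs);
        System.out.println("other streams: " + nonPOIFS);
    }
}
```

The lists must be mutable; passing an immutable list such as List.of() as a target would fail on the first add.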