IBM Information Integrator for Content V8.2 APIs
java.lang.Object
  +--com.ibm.mm.sdk.common.infomining.DKIKFDocumentFilter
A document filter can be used to get the textual content of a document.
Field Summary
static java.lang.String | ANSI | 7-bit ANSI
static java.lang.String | ANSI8 | 8-bit ANSI
static java.lang.String | ASCII | 7-bit ASCII
static java.lang.String | ASCII8 | 8-bit ASCII
static java.lang.String | CHINESEBIG5 | Plain text file that uses the Chinese Big 5 character set (DBCS).
static java.lang.String | CHINESEGB | Plain text file that uses the Chinese GB character set (DBCS).
static java.lang.String | DEFAULT | Default encoding; uses automatic codepage detection, with the system codepage as fallback.
static java.lang.String | HANGEUL | Plain text file that uses the Korean Hangul character set (DBCS).
static java.lang.String | HTML_CHINESEBIG5 | HTML file encoded in the Chinese Big 5 character set.
static java.lang.String | HTML_CHINESEEUC | HTML file encoded in the Chinese EUC character set.
static java.lang.String | HTML_CHINESEGB | HTML file encoded in the Chinese GB character set.
static java.lang.String | HTML_JAPANESEEUC | HTML file encoded in the Japanese EUC character set.
static java.lang.String | HTML_JAPANESESJIS | HTML file encoded in the Japanese ShiftJIS character set.
static java.lang.String | HTML_KOREANHANGUL | HTML file encoded in the Korean Hangul character set.
static java.lang.String | JAPANESE_EUC | Plain text file that uses the Japanese EUC character set (DBCS).
static java.lang.String | SHIFTJIS | Plain text file that uses the Japanese ShiftJIS character set (DBCS).
static java.lang.String | UNICODE | UCS-2 encoded files.
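
The practical difference between the 7-bit, 8-bit, and UCS-2 constants can be illustrated with the JDK's own charsets. The following is a minimal sketch that uses only `java.nio.charset` and does not call the filter at all. Treating `ASCII` as US-ASCII, the 8-bit constants as ISO-8859-1, and `UNICODE` as UTF-16BE is an assumption made for illustration; this page does not state which codepages the filter actually maps them to. The class name `EncodingDemo` is likewise hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingDemo {
    /** Decodes raw bytes with the given charset; undecodable bytes become U+FFFD. */
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "caf" followed by 0xE9, which lies outside the 7-bit range.
        byte[] raw = {0x63, 0x61, 0x66, (byte) 0xE9};

        // A 7-bit decoder cannot represent 0xE9 and substitutes U+FFFD.
        System.out.println(decode(raw, StandardCharsets.US_ASCII).equals("caf\uFFFD"));
        // An 8-bit decoder (assumed here: ISO-8859-1) maps 0xE9 to U+00E9.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1).equals("caf\u00E9"));
        // UCS-2 style data uses two bytes per character: 0x00 0x41 decodes to "A".
        System.out.println(decode(new byte[] {0x00, 0x41}, StandardCharsets.UTF_16BE));
    }
}
```

Choosing the wrong constant for a document therefore tends to show up as replacement characters or mis-decoded text rather than as an exception, which is one reason to prefer `DEFAULT` when the source encoding is unknown.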
Constructor Summary
DKIKFDocumentFilter(DKIKFService ikfService) | Creates a new filter object.
Method Summary
java.util.Map | getContent(byte[] documentBytes) | Returns a map containing the textual content of the specified document.
java.util.Map | getContent(java.io.InputStream in) | Returns a map containing the textual content of the document that is read from the specified stream.
java.lang.String | getFilterEncoding() | Returns the currently set encoding.
DKIKFService | getService() | Returns the service object used by this filter.
DKIKFTextDocument | getTextDocument(byte[] documentBytes) | Returns a text document containing all textual parts of the specified document.
DKIKFTextDocument | getTextDocument(java.io.InputStream in) | Returns a text document containing all textual parts of the document that is read from the specified stream.
void | setFilterEncoding(java.lang.String encoding) | Sets the character encoding for data retrieval.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
public static java.lang.String SHIFTJIS
public static java.lang.String JAPANESE_EUC
public static java.lang.String CHINESEGB
public static java.lang.String CHINESEBIG5
public static java.lang.String HANGEUL
public static java.lang.String HTML_JAPANESESJIS
public static java.lang.String HTML_JAPANESEEUC
public static java.lang.String HTML_CHINESEBIG5
public static java.lang.String HTML_CHINESEGB
public static java.lang.String HTML_CHINESEEUC
public static java.lang.String HTML_KOREANHANGUL
public static java.lang.String ANSI
public static java.lang.String ANSI8
public static java.lang.String ASCII
public static java.lang.String ASCII8
public static java.lang.String UNICODE
public static java.lang.String DEFAULT
Constructor Detail
public DKIKFDocumentFilter(DKIKFService ikfService)
Creates a new filter object.
Parameters:
ikfService - the service object to be used by the filter.
Method Detail
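public void setFilterEncoding(java.lang.String encoding)
Sets the character encoding for data retrieval.
Parameters:
encoding - codepage to be used by the filter.
See Also:
getFilterEncoding()
public java.lang.String getFilterEncoding()
Returns the currently set encoding.
See Also:
setFilterEncoding(java.lang.String)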
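
Because the encoding is state held on the filter object, code that changes it for a single document may want to restore the earlier value afterwards. The sketch below shows one way to do that with the `getFilterEncoding()` / `setFilterEncoding(String)` pair. It is not the SDK itself: `StubFilter` is a hypothetical stand-in so the example compiles without the IBM libraries, its constant values are invented, and the assumption that a new filter starts out as `DEFAULT` is not confirmed by this page.

```java
final class FilterEncodingSketch {
    /** Stand-in for DKIKFDocumentFilter that models only the two encoding accessors. */
    static final class StubFilter {
        // Hypothetical values: this page does not list the real constant strings.
        static final String DEFAULT = "DEFAULT";
        static final String SHIFTJIS = "SHIFTJIS";

        // Assumption: a freshly created filter reports DEFAULT.
        private String encoding = DEFAULT;

        void setFilterEncoding(String encoding) {
            this.encoding = encoding;
        }

        String getFilterEncoding() {
            return encoding;
        }
    }

    /** Runs the task under the given encoding, then restores the previous one. */
    static void withEncoding(StubFilter filter, String encoding, Runnable task) {
        String previous = filter.getFilterEncoding();
        filter.setFilterEncoding(encoding);
        try {
            task.run();
        } finally {
            // Restore even if the task throws, so later documents are unaffected.
            filter.setFilterEncoding(previous);
        }
    }

    public static void main(String[] args) {
        StubFilter filter = new StubFilter();
        withEncoding(filter, StubFilter.SHIFTJIS,
                () -> System.out.println("during: " + filter.getFilterEncoding()));
        System.out.println("after: " + filter.getFilterEncoding());
    }
}
```

With the real class, the same pattern would apply to a `DKIKFDocumentFilter` instance and its own constants such as `DKIKFDocumentFilter.SHIFTJIS`.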
public java.util.Map getContent(byte[] documentBytes) throws java.io.IOException, DKIKFDocumentFilterException
Returns a map containing the textual content of the specified document.
Parameters:
documentBytes - the complete document as a byte array
Throws:
java.io.IOException - if an IOException is thrown during document processing.
DKIKFDocumentFilterException - if the filter encounters a problem during document processing.
DKIKFAuthorizationException - if the user or group does not have the privilege IKFRunAnalysisFunc
See Also:
getContent(InputStream), getTextDocument(InputStream), getTextDocument(byte[])
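
A caller of `getContent(byte[])` has to deal with two checked exceptions and with a raw `java.util.Map` whose keys and value types this page does not describe. The following sketch shows one possible way to handle both. It is written against hypothetical stand-ins (`ByteContentFilter`, `StubFilterException`) that only mirror the documented signature so the example runs without the SDK; the map keys used in `main` are invented for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class GetContentSketch {
    /** Stand-in for DKIKFDocumentFilterException, modeled as a checked exception. */
    static final class StubFilterException extends Exception {
        StubFilterException(String message) {
            super(message);
        }
    }

    /** Stand-in that mirrors the documented getContent(byte[]) signature. */
    interface ByteContentFilter {
        Map<?, ?> getContent(byte[] documentBytes) throws IOException, StubFilterException;
    }

    /** Joins every value in the returned map with newlines, in the map's iteration order. */
    static String extractText(ByteContentFilter filter, byte[] documentBytes) {
        try {
            StringJoiner text = new StringJoiner("\n");
            for (Object value : filter.getContent(documentBytes).values()) {
                text.add(String.valueOf(value));
            }
            return text.toString();
        } catch (IOException e) {
            // Reading the document failed.
            throw new UncheckedIOException(e);
        } catch (StubFilterException e) {
            // The filter itself could not process the document.
            throw new IllegalStateException("filter failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical filter: the keys and values below are invented for the demo.
        ByteContentFilter filter = bytes -> {
            Map<String, String> parts = new LinkedHashMap<>();
            parts.put("title", "Quarterly report");
            parts.put("body", bytes.length + " bytes of input");
            return parts;
        };
        System.out.println(extractText(filter, new byte[] {1, 2, 3}));
    }
}
```

Treating the values as `Object` and converting with `String.valueOf` avoids assuming a value type the documentation does not promise. Wrapping the checked exceptions is only one design choice; propagating them unchanged is equally valid.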
public java.util.Map getContent(java.io.InputStream in) throws java.io.IOException, DKIKFDocumentFilterException
Returns a map containing the textual content of the document that is read from the specified stream.
Parameters:
in - the input stream from which the document is read
Throws:
java.io.IOException - if an IOException is thrown during document processing.
DKIKFDocumentFilterException - if the filter encounters a problem during document processing.
DKIKFAuthorizationException - if the user or group does not have the privilege IKFRunAnalysisFunc
See Also:
getContent(byte[]), getTextDocument(InputStream), getTextDocument(byte[])
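
This page does not say whether the filter closes the stream it is given, so a cautious caller can own the stream's lifecycle with try-with-resources. The sketch below illustrates that pattern under stated assumptions: `StreamContentFilter` and `StubFilterException` are hypothetical stand-ins that mirror the documented `getContent(java.io.InputStream)` signature, and the demo filter and its `"size"` key are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Collections;
import java.util.Map;

final class StreamContentSketch {
    /** Stand-in for DKIKFDocumentFilterException, modeled as a checked exception. */
    static final class StubFilterException extends Exception {
        StubFilterException(String message) {
            super(message);
        }
    }

    /** Stand-in that mirrors the documented getContent(java.io.InputStream) signature. */
    interface StreamContentFilter {
        Map<?, ?> getContent(InputStream in) throws IOException, StubFilterException;
    }

    /** Passes the stream to the filter and closes it afterwards, even on failure. */
    static Map<?, ?> readAndClose(StreamContentFilter filter, InputStream in)
            throws IOException, StubFilterException {
        try (InputStream owned = in) {
            return filter.getContent(owned);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical filter that reports how many bytes are available; the key is invented.
        StreamContentFilter filter =
                stream -> Collections.singletonMap("size", stream.available());
        System.out.println(readAndClose(filter, new ByteArrayInputStream(new byte[] {1, 2, 3})));
    }
}
```

Closing a stream twice is harmless for the standard JDK stream classes, so this pattern stays safe even if the real filter also closes its input.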
public DKIKFTextDocument getTextDocument(byte[] documentBytes) throws java.io.IOException, DKIKFDocumentFilterException
Returns a text document containing all textual parts of the specified document.
Parameters:
documentBytes - the complete document as a byte array
Throws:
java.io.IOException - if an IOException is thrown during document processing.
DKIKFDocumentFilterException - if the filter encounters a problem during document processing.
DKIKFAuthorizationException - if the user or group does not have the privilege IKFRunAnalysisFunc
See Also:
getContent(byte[]), getContent(InputStream), getTextDocument(InputStream)
public DKIKFTextDocument getTextDocument(java.io.InputStream in) throws java.io.IOException, DKIKFDocumentFilterException
Returns a text document containing all textual parts of the document that is read from the specified stream.
Parameters:
in - the input stream from which the document is read
Throws:
java.io.IOException - if an IOException is thrown during document processing.
DKIKFDocumentFilterException - if the filter encounters a problem during document processing.
DKIKFAuthorizationException - if the user or group does not have the privilege IKFRunAnalysisFunc
See Also:
getContent(byte[]), getContent(InputStream), getTextDocument(byte[])
public DKIKFService getService()
Returns the service object used by this filter.