org.apache.solr.analysis
Class HTMLStripCharFilter

java.lang.Object
  extended by java.io.Reader
      extended by org.apache.lucene.analysis.CharStream
          extended by org.apache.lucene.analysis.CharFilter
              extended by org.apache.lucene.analysis.BaseCharFilter
                  extended by org.apache.solr.analysis.HTMLStripCharFilter
All Implemented Interfaces:
Closeable, Readable
Direct Known Subclasses:
HTMLStripReader

public class HTMLStripCharFilter
extends BaseCharFilter

A CharFilter that wraps another Reader and attempts to strip out HTML constructs.
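
The idea of wrapping a Reader and dropping markup as characters are read can be sketched with the standard library alone. This is a minimal illustration, not the Solr implementation: the class name TagStrippingReaderSketch and its strip helper are hypothetical, and the sketch only removes `<...>` spans. It does not handle entities, comments, or script blocks, and it records no offset corrections.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical illustration only; not the Solr class documented on this page. */
class TagStrippingReaderSketch extends FilterReader {

    TagStrippingReaderSketch(Reader in) {
        super(in);
    }

    /** Returns the next character that lies outside any <...> span, or -1 at end of input. */
    @Override
    public int read() throws IOException {
        int c = in.read();
        while (c == '<') {
            // Skip everything up to and including the closing '>'.
            do {
                c = in.read();
            } while (c != -1 && c != '>');
            if (c == -1) {
                return -1; // unterminated tag: treat as end of input
            }
            c = in.read();
        }
        return c;
    }

    /** Fills the buffer one filtered character at a time. */
    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            cbuf[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    /** Convenience helper for the example: drains the filtered reader into a String. */
    static String strip(String html) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new TagStrippingReaderSketch(new StringReader(html))) {
            int c;
            while ((c = r.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<p>Hello <b>world</b></p>")); // prints "Hello world"
    }
}
```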

Version:
$Id: HTMLStripCharFilter.java 826299 2009-10-17 19:56:01Z yonik $

Field Summary
static int DEFAULT_READ_AHEAD
 
Fields inherited from class org.apache.lucene.analysis.CharFilter
input
 
Fields inherited from class java.io.Reader
lock
 
Constructor Summary
HTMLStripCharFilter(CharStream source)
HTMLStripCharFilter(CharStream source, Set<String> escapedTags)
HTMLStripCharFilter(CharStream source, Set<String> escapedTags, int readAheadLimit)
 
Method Summary
void close()
int getReadAheadLimit()
static void main(String[] args)
int read()
int read(char[] cbuf, int off, int len)
 
Methods inherited from class org.apache.lucene.analysis.BaseCharFilter
addOffCorrectMap, correct, getLastCumulativeDiff
 
Methods inherited from class org.apache.lucene.analysis.CharFilter
correctOffset, mark, markSupported, reset
 
Methods inherited from class java.io.Reader
read, read, ready, skip
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
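
The inherited addOffCorrectMap, correct, and getLastCumulativeDiff methods suggest that the filter records how many characters it has removed, so that offsets in the stripped text can be mapped back to the original input. The sketch below shows one way such bookkeeping could work. It is an assumption-based re-implementation in a hypothetical class, OffsetCorrectionSketch, that borrows the method names for readability; it is not the Lucene code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of offset correction; not the Lucene BaseCharFilter. */
class OffsetCorrectionSketch {

    // Each entry is {offset in filtered text, characters removed so far}.
    private final List<int[]> corrections = new ArrayList<>();

    /** Records that, from filtered offset off onward, cumulativeDiff characters were removed. */
    void addOffCorrectMap(int off, int cumulativeDiff) {
        corrections.add(new int[] {off, cumulativeDiff});
    }

    /** Maps an offset in the filtered text back to an offset in the original text. */
    int correct(int currentOff) {
        // Use the most recent entry whose offset is not past currentOff.
        for (int i = corrections.size() - 1; i >= 0; i--) {
            int[] entry = corrections.get(i);
            if (currentOff >= entry[0]) {
                return currentOff + entry[1];
            }
        }
        return currentOff;
    }

    /** Returns the total number of characters removed so far, or 0 if none. */
    int getLastCumulativeDiff() {
        return corrections.isEmpty() ? 0 : corrections.get(corrections.size() - 1)[1];
    }

    public static void main(String[] args) {
        // Original: "<b>hi</b> there"   Filtered: "hi there"
        OffsetCorrectionSketch map = new OffsetCorrectionSketch();
        map.addOffCorrectMap(0, 3); // "<b>" (3 chars) removed before filtered offset 0
        map.addOffCorrectMap(2, 7); // "</b>" (4 more chars) removed before filtered offset 2
        System.out.println(map.correct(0)); // 3: 'h' is at index 3 of the original
        System.out.println(map.correct(3)); // 10: 't' is at index 10 of the original
    }
}
```

With corrections like these, a token found at offsets 0 to 2 of the stripped text can be reported at offsets 3 to 5 of the original markup, which is what highlighting needs.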
 

Field Detail

DEFAULT_READ_AHEAD

public static final int DEFAULT_READ_AHEAD
See Also:
Constant Field Values

Constructor Detail

HTMLStripCharFilter

public HTMLStripCharFilter(CharStream source)

HTMLStripCharFilter

public HTMLStripCharFilter(CharStream source,
                           Set<String> escapedTags)

HTMLStripCharFilter

public HTMLStripCharFilter(CharStream source,
                           Set<String> escapedTags,
                           int readAheadLimit)
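
This page does not describe the escapedTags parameter. Assuming it names tags that should be left in the output rather than stripped, the behaviour could look like the following standard-library sketch. The class EscapedTagsSketch and its methods are hypothetical, tag names are matched exactly as written, and none of this is taken from the Solr source.

```java
import java.util.Collections;
import java.util.Set;

/** Hypothetical illustration of an "escaped tags" set; not the Solr implementation. */
class EscapedTagsSketch {

    /** Removes <...> spans unless the tag name appears in escapedTags (an assumed meaning). */
    static String strip(String html, Set<String> escapedTags) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            int close = (c == '<') ? html.indexOf('>', i) : -1;
            if (close < 0) {
                // Ordinary character, or a '<' that never closes: copy it through.
                out.append(c);
                i++;
                continue;
            }
            String name = tagName(html, i + 1, close);
            if (escapedTags.contains(name)) {
                out.append(html, i, close + 1); // keep the whole tag, start or end
            }
            i = close + 1; // continue after the tag
        }
        return out.toString();
    }

    /** Extracts the tag name between '<' and '>', skipping a leading '/'. */
    private static String tagName(String html, int start, int close) {
        int s = start;
        if (s < close && html.charAt(s) == '/') {
            s++;
        }
        int e = s;
        while (e < close && Character.isLetterOrDigit(html.charAt(e))) {
            e++;
        }
        return html.substring(s, e);
    }

    public static void main(String[] args) {
        String html = "<p>Hello <b>world</b></p>";
        System.out.println(strip(html, Collections.<String>emptySet())); // prints "Hello world"
        System.out.println(strip(html, Collections.singleton("b")));     // prints "Hello <b>world</b>"
    }
}
```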

Method Detail

main

public static void main(String[] args)
                 throws IOException
Throws:
IOException

getReadAheadLimit

public int getReadAheadLimit()
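
The page gives no description of the read-ahead limit. One plausible reading, offered here as an assumption, is that it bounds how far the filter looks ahead for the end of a construct before treating a `<` as ordinary text. The sketch below shows that general technique with java.io.BufferedReader's mark and reset; the class ReadAheadSketch is hypothetical and far simpler than a real HTML stripper.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical illustration of a bounded look-ahead; not the Solr implementation. */
class ReadAheadSketch {

    /** Drops a '<'...'>' span only if the '>' is found within readAheadLimit characters. */
    static String strip(String text, int readAheadLimit) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            int c;
            while ((c = in.read()) != -1) {
                if (c != '<') {
                    out.append((char) c);
                    continue;
                }
                in.mark(readAheadLimit); // remember the position just after '<'
                boolean closed = false;
                for (int i = 0; i < readAheadLimit; i++) {
                    int next = in.read();
                    if (next == -1) {
                        break;
                    }
                    if (next == '>') {
                        closed = true;
                        break;
                    }
                }
                if (!closed) {
                    in.reset();      // give up: rewind and keep '<' as literal text
                    out.append('<');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "a < b and <i>c</i>";
        System.out.println(strip(text, 8));  // prints "a < b and c"
        System.out.println(strip(text, 64)); // prints "a c"
    }
}
```

In this naive sketch a larger limit lets a stray `<` swallow more of the following text, while a smaller limit risks missing long tags. A real stripper would also check that the span looks like a tag before discarding it.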

read

public int read()
         throws IOException
Overrides:
read in class Reader
Throws:
IOException

read

public int read(char[] cbuf,
                int off,
                int len)
         throws IOException
Overrides:
read in class CharFilter
Throws:
IOException
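
Because read(char[], int, int) overrides the java.io.Reader method, callers can consume the filter like any other Reader: loop until the call returns -1 and append only the characters actually read. The sketch below shows that loop with a StringReader standing in for the filter; the class DrainReaderSketch and its drain method are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical helper showing the standard Reader consumption loop. */
class DrainReaderSketch {

    /** Reads the given reader to the end, closes it, and returns everything it produced. */
    static String drain(Reader reader) {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[4]; // deliberately small so the loop runs several times
        try (Reader r = reader) {
            int n;
            // read returns the number of chars stored, or -1 at end of stream
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A StringReader stands in for the filter in this self-contained example.
        System.out.println(drain(new StringReader("Hello world"))); // prints "Hello world"
    }
}
```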

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Overrides:
close in class CharFilter
Throws:
IOException


Copyright © 2009 Apache Software Foundation. All Rights Reserved.