org.apache.solr.handler.extraction
Class SolrContentHandler
java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.solr.handler.extraction.SolrContentHandler
- All Implemented Interfaces:
- ExtractingParams, ContentHandler, DTDHandler, EntityResolver, ErrorHandler
public class SolrContentHandler
- extends DefaultHandler
- implements ExtractingParams
The class responsible for handling Tika events and translating them into SolrInputDocuments.
This class is not thread-safe.
Users may wish to extend this class to provide their own functionality.
- See Also:
SolrContentHandlerFactory, ExtractingRequestHandler, ExtractingDocumentLoader
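The event-to-document translation described above can be illustrated with a minimal sketch that uses only the JDK's SAX classes. FieldCollectingHandler and its parse helper are hypothetical names made up for this illustration. The sketch is a stand-in for the general pattern (buffer characters, emit a field when an element ends) and does not reproduce SolrContentHandler's actual field-mapping logic.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, not SolrContentHandler: collects element text into a field map.
public class FieldCollectingHandler extends DefaultHandler {
    private final Map<String, String> fields = new LinkedHashMap<>();
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        buffer.setLength(0); // start a fresh buffer for each element
    }

    @Override
    public void characters(char[] chars, int offset, int length) {
        // SAX may deliver one text node in several chunks, so append rather than assign.
        buffer.append(chars, offset, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            fields.put(qName, text); // element name becomes the field name
        }
        buffer.setLength(0);
    }

    public Map<String, String> getFields() {
        return fields;
    }

    // Convenience helper: parse an XML string and return the collected fields.
    public static Map<String, String> parse(String xml) {
        FieldCollectingHandler handler = new FieldCollectingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input", e);
        }
        return handler.getFields();
    }
}
```

Like the class documented here, this handler keeps mutable state between callbacks, so one instance should not be shared across threads.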
Fields inherited from interface org.apache.solr.handler.extraction.ExtractingParams
BOOST_PREFIX, CAPTURE_ATTRIBUTES, CAPTURE_ELEMENTS, DEFAULT_FIELD, EXTRACT_FORMAT, EXTRACT_ONLY, LITERALS_PREFIX, LOWERNAMES, MAP_PREFIX, RESOURCE_NAME, STREAM_TYPE, UNKNOWN_FIELD_PREFIX, XPATH_EXPRESSION
Constructor Summary
SolrContentHandler(org.apache.tika.metadata.Metadata metadata,
org.apache.solr.common.params.SolrParams params,
org.apache.solr.schema.IndexSchema schema)
SolrContentHandler(org.apache.tika.metadata.Metadata metadata,
org.apache.solr.common.params.SolrParams params,
org.apache.solr.schema.IndexSchema schema,
Collection<String> dateFormats)
Methods inherited from class org.xml.sax.helpers.DefaultHandler
endDocument, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startPrefixMapping, unparsedEntityDecl, warning
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SolrContentHandler
public SolrContentHandler(org.apache.tika.metadata.Metadata metadata,
org.apache.solr.common.params.SolrParams params,
org.apache.solr.schema.IndexSchema schema)
SolrContentHandler
public SolrContentHandler(org.apache.tika.metadata.Metadata metadata,
org.apache.solr.common.params.SolrParams params,
org.apache.solr.schema.IndexSchema schema,
Collection<String> dateFormats)
newDocument
public org.apache.solr.common.SolrInputDocument newDocument()
- This is called by a consumer when it is ready to deal with a new SolrInputDocument. Overriding
classes can use this hook to add in or change whatever they deem fit for the document at that time.
The base implementation adds the metadata as fields, allowing for potential remapping.
- Returns:
- The SolrInputDocument.
startDocument
public void startDocument()
throws SAXException
- Specified by:
startDocument
in interface ContentHandler
- Overrides:
startDocument
in class DefaultHandler
- Throws:
SAXException
startElement
public void startElement(String uri,
String localName,
String qName,
Attributes attributes)
throws SAXException
- Specified by:
startElement
in interface ContentHandler
- Overrides:
startElement
in class DefaultHandler
- Throws:
SAXException
endElement
public void endElement(String uri,
String localName,
String qName)
throws SAXException
- Specified by:
endElement
in interface ContentHandler
- Overrides:
endElement
in class DefaultHandler
- Throws:
SAXException
characters
public void characters(char[] chars,
int offset,
int length)
throws SAXException
- Specified by:
characters
in interface ContentHandler
- Overrides:
characters
in class DefaultHandler
- Throws:
SAXException
transformValue
protected String transformValue(String val,
org.apache.solr.schema.SchemaField schFld)
- Can be used to transform input values based on their SchemaField. This implementation only formats dates using the DateUtil.
- Parameters:
val
- The value to transform
schFld
- The SchemaField
- Returns:
- The potentially new value.
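The date handling described for transformValue can be sketched with JDK classes alone. DateValueTransformer is a hypothetical name, and the output pattern and the choice to interpret zone-less input as UTC are assumptions made for this illustration. The sketch omits the SchemaField check and does not call Solr's DateUtil; it only shows the idea of trying a collection of date formats and returning the value unchanged when none apply.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Collection;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration: normalize a date string using a list of candidate patterns.
public class DateValueTransformer {
    // Assumed canonical output form (ISO 8601 in UTC).
    private static final String CANONICAL = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    public static String transform(String val, Collection<String> dateFormats) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        for (String pattern : dateFormats) {
            SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.ROOT);
            parser.setTimeZone(utc);   // assumption: input without a zone is UTC
            parser.setLenient(false);  // reject out-of-range values such as month 13
            try {
                Date parsed = parser.parse(val);
                SimpleDateFormat out = new SimpleDateFormat(CANONICAL, Locale.ROOT);
                out.setTimeZone(utc);
                return out.format(parsed);
            } catch (ParseException e) {
                // This pattern did not match; fall through to the next one.
            }
        }
        return val; // no pattern matched, so the value is returned as-is
    }
}
```

Returning the input unchanged on failure mirrors the "potentially new value" wording of the return description: a caller always gets a usable string back.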
getBoost
protected float getBoost(String name)
- Get the value of any boost factor for the mapped name.
- Parameters:
name
- The name of the field to check for a specified boost
- Returns:
- The boost value
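A boost lookup of this kind might look like the following sketch. BoostLookup is a hypothetical class; the "boost." prefix string, the plain Map standing in for SolrParams, and the default of 1.0 when no boost is given are all assumptions for illustration, not details confirmed by this page.

```java
import java.util.Map;

// Hypothetical illustration of a per-field boost lookup keyed by a parameter prefix.
public class BoostLookup {
    // Assumed value of ExtractingParams.BOOST_PREFIX.
    static final String BOOST_PREFIX = "boost.";

    // Assumed neutral boost used when the request carries no boost for the field.
    static final float DEFAULT_BOOST = 1.0f;

    public static float getBoost(Map<String, String> params, String name) {
        String raw = params.get(BOOST_PREFIX + name);
        return raw == null ? DEFAULT_BOOST : Float.parseFloat(raw);
    }
}
```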
findMappedName
protected String findMappedName(String name)
- Get the mapped name for the given name.
- Parameters:
name
- The name to check for a mapping
- Returns:
- The new name if a mapping exists, otherwise name unchanged.
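The fallback behaviour in the return description can be sketched as follows. NameMapping is a hypothetical class, the "fmap." prefix is an assumed value for MAP_PREFIX, and a plain Map stands in for SolrParams.

```java
import java.util.Map;

// Hypothetical illustration of field-name remapping with fallback to the original name.
public class NameMapping {
    // Assumed value of ExtractingParams.MAP_PREFIX.
    static final String MAP_PREFIX = "fmap.";

    public static String findMappedName(Map<String, String> params, String name) {
        // Return the mapped target if one is configured, else the name itself.
        return params.getOrDefault(MAP_PREFIX + name, name);
    }
}
```

With a mapping such as fmap.content=text in the parameters, a field named content would be renamed to text, while any field without a mapping keeps its name.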
Copyright © 2009 Apache Software Foundation. All Rights Reserved.