Enterprise Information Portal APIs
java.lang.Object
  |
  +--com.ibm.gcs.netutil.http.RobotsProcessor
RobotsProcessor is a processor for the robots.txt file of a URL. This implementation is based on the Internet draft for robots.txt (Robots RFC).
Constructor Summary
RobotsProcessor(java.io.File fileObj)
    Creates a RobotsProcessor object from a file object containing a robots.txt description.
RobotsProcessor(java.lang.String hostName)
    Creates a RobotsProcessor object for the given host name.
RobotsProcessor(java.lang.String hostName, int port)
    Creates a RobotsProcessor object for the given host name and port number.
RobotsProcessor(java.net.URL baseURL)
    Creates a RobotsProcessor object for the URL related to the robots.txt file.
Method Summary
java.util.Enumeration getAllowedPaths(java.lang.String agent)
    Returns an enumeration of the paths that a particular robot agent is allowed to access.
java.util.Enumeration getDisallowedPaths(java.lang.String agent)
    Returns an enumeration of the paths that a particular robot agent is not allowed to access.
java.lang.String getHost()
    Returns the name of the host for which the robots.txt is being processed.
int getPort()
    Returns the port number on which the host whose robots.txt is being processed is listening.
boolean isAllowed(java.lang.String agent, java.lang.String path)
    Given the name of a robot agent and a path, checks whether the agent is allowed to access that path.
boolean isExpired()
    A robots processor is expired if "VALID_FOR_MS" milliseconds have passed since the robots processor was constructed.
com.ibm.gcs.netutil.http.RDFDescription toRDFDescription()
    Produces an RDFDescription of the robot information.
java.lang.String toString()
    Returns a string representation of the RobotsProcessor data.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
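The path checks summarized above (getAllowedPaths, getDisallowedPaths, isAllowed) can be illustrated with a minimal sketch. It is not the RobotsProcessor implementation, whose matching rules are not documented on this page. The class name RobotsRulesSketch, its Rule type, and the methods rulesFor and isAllowed are hypothetical. The sketch assumes the first-match-wins prefix matching described in the robots.txt Internet draft, prefers a record that names the agent over the "*" record, and uses a List where the real API returns a java.util.Enumeration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of robots.txt rule handling; not the RobotsProcessor implementation. */
final class RobotsRulesSketch {

    /** One Allow or Disallow line: a path prefix plus whether it grants access. */
    static final class Rule {
        final boolean allow;
        final String prefix;

        Rule(boolean allow, String prefix) {
            this.allow = allow;
            this.prefix = prefix;
        }
    }

    /**
     * Collects the rules that apply to the given agent. Records whose User-agent
     * value occurs (case-insensitively) in the agent name take precedence over
     * records for "*". Empty Allow/Disallow values are skipped.
     */
    static List<Rule> rulesFor(String robotsTxt, String agent) {
        List<Rule> named = new ArrayList<>();
        List<Rule> star = new ArrayList<>();
        String agentLc = agent.toLowerCase(Locale.ROOT);
        boolean namedSeen = false;
        boolean inAgentLines = false; // still reading consecutive User-agent lines?
        List<Rule> target = null;     // list receiving the current record's rules, if any

        for (String raw : robotsTxt.split("\\r?\\n")) {
            int hash = raw.indexOf('#'); // strip comments
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or unrecognized line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!inAgentLines) { // a new record starts here
                    target = null;
                    inAgentLines = true;
                }
                if (!value.isEmpty() && agentLc.contains(value.toLowerCase(Locale.ROOT))) {
                    target = named;
                    namedSeen = true;
                } else if (value.equals("*") && target == null) {
                    target = star;
                }
            } else {
                inAgentLines = false;
                boolean isAllow = field.equals("allow");
                if (target != null && !value.isEmpty() && (isAllow || field.equals("disallow"))) {
                    target.add(new Rule(isAllow, value));
                }
            }
        }
        return namedSeen ? named : star;
    }

    /** The first rule whose prefix matches the path decides; with no match, access is allowed. */
    static boolean isAllowed(List<Rule> rules, String path) {
        for (Rule rule : rules) {
            if (path.startsWith(rule.prefix)) {
                return rule.allow;
            }
        }
        return true;
    }
}
```

Because the first matching rule decides, the order of Allow and Disallow lines within a record matters in this sketch: an Allow line for a subdirectory only has an effect when it appears before the broader Disallow line.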
Constructor Detail
public RobotsProcessor(java.lang.String hostName) throws java.net.MalformedURLException, java.io.IOException
    Parameters:
        hostName - name of the HTTP host
    Throws:
        java.net.MalformedURLException - when the URL (http://hostName/robots.txt) is malformed
        java.io.IOException - when an I/O exception occurs while reading the robots.txt data
    See Also:
        MalformedURLException
public RobotsProcessor(java.net.URL baseURL) throws java.net.MalformedURLException, IllegalProtocolException, java.io.IOException
    Parameters:
        baseURL - URL of the host containing the robots.txt file
    Throws:
        java.net.MalformedURLException - when the URL (baseURL+"/robots.txt") is malformed
        IllegalProtocolException - if the protocol is not "http"
        java.io.IOException - when an I/O exception occurs while reading the robots.txt data
    See Also:
        IllegalProtocolException, IOException, MalformedURLException
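The protocol check and the "/robots.txt" suffix described for this constructor could look roughly like the following. This is a sketch only: BaseUrlSketch and robotsTxtFor are hypothetical names, the unchecked IllegalArgumentException stands in for the library's own IllegalProtocolException, and trimming a trailing slash before appending is an assumption rather than documented behavior.

```java
import java.net.URI;

/** Hypothetical sketch; not part of the RobotsProcessor API. */
final class BaseUrlSketch {

    /**
     * Rejects non-http base URLs, then appends "/robots.txt" to the base.
     * IllegalArgumentException is used here in place of IllegalProtocolException.
     */
    static URI robotsTxtFor(URI base) {
        if (!"http".equalsIgnoreCase(base.getScheme())) {
            throw new IllegalArgumentException("protocol is not http: " + base);
        }
        String text = base.toString();
        if (text.endsWith("/")) {
            // Assumed normalization: avoid a double slash before robots.txt.
            text = text.substring(0, text.length() - 1);
        }
        return URI.create(text + "/robots.txt");
    }
}
```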
public RobotsProcessor(java.lang.String hostName, int port) throws java.net.MalformedURLException, java.io.IOException
    Parameters:
        hostName - name of the HTTP host
        port - port on which the host is listening
    Throws:
        java.net.MalformedURLException - when the URL (http://hostName:port/robots.txt) is malformed
        java.io.IOException - when an I/O exception occurs while reading the robots.txt data
    See Also:
        MalformedURLException
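As an illustration of the two URL forms named above (http://hostName/robots.txt and http://hostName:port/robots.txt), a minimal sketch follows. RobotsUrlSketch and robotsUrl are hypothetical names, and the sketch uses java.net.URI.create, which reports a malformed result with an unchecked IllegalArgumentException instead of the MalformedURLException that the constructors declare.

```java
import java.net.URI;

/** Hypothetical helper; not part of the RobotsProcessor API. */
final class RobotsUrlSketch {

    /** Builds http://hostName/robots.txt, with no explicit port. */
    static URI robotsUrl(String hostName) {
        return URI.create("http://" + hostName + "/robots.txt");
    }

    /** Builds http://hostName:port/robots.txt. */
    static URI robotsUrl(String hostName, int port) {
        return URI.create("http://" + hostName + ":" + port + "/robots.txt");
    }
}
```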
public RobotsProcessor(java.io.File fileObj) throws java.io.IOException
    Parameters:
        fileObj - a java.io.File object that contains the robots.txt description
    Throws:
        FileNotFoundException - if the file is not found
Method Detail
public java.util.Enumeration getDisallowedPaths(java.lang.String agent)
    Parameters:
        agent - the agent for whom the test is made
public java.util.Enumeration getAllowedPaths(java.lang.String agent)
    Parameters:
        agent - the agent for whom the test is made
public boolean isAllowed(java.lang.String agent, java.lang.String path)
    Parameters:
        agent - the agent for whom the test is made
        path - the path for which the access is to be checked
public java.lang.String getHost()
public int getPort()
public com.ibm.gcs.netutil.http.RDFDescription toRDFDescription()
public java.lang.String toString()
    Overrides:
        toString in class java.lang.Object
public boolean isExpired()
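The expiry rule (a processor is expired once "VALID_FOR_MS" milliseconds have passed since construction) can be sketched as below. ExpirySketch and isExpiredAt are hypothetical, and the 24-hour value given to VALID_FOR_MS is purely an assumption for illustration; this page does not state the real value.

```java
/** Hypothetical sketch of the expiry rule; not the RobotsProcessor implementation. */
final class ExpirySketch {

    /** Assumed lifetime of 24 hours; the actual VALID_FOR_MS value is not documented here. */
    static final long VALID_FOR_MS = 24L * 60 * 60 * 1000;

    private final long createdAtMs;

    /** Records the construction time, as the real class presumably does. */
    ExpirySketch() {
        this(System.currentTimeMillis());
    }

    /** Lets a caller supply the construction time, which makes the rule easy to test. */
    ExpirySketch(long createdAtMs) {
        this.createdAtMs = createdAtMs;
    }

    /** True once VALID_FOR_MS milliseconds have passed since construction. */
    boolean isExpired() {
        return isExpiredAt(System.currentTimeMillis());
    }

    /** Same check against an explicit clock reading. */
    boolean isExpiredAt(long nowMs) {
        return nowMs - createdAtMs >= VALID_FOR_MS;
    }
}
```

A caller that caches processors per host would typically check isExpired() before reuse and construct a fresh processor when it returns true.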
EIP Web Crawler APIs