Enterprise Information Portal APIs

com.ibm.gcs.db.component
Class DB2VisitedPool

java.lang.Object
  |
  +--com.ibm.gcs.db.component.DB2Pool
        |
        +--com.ibm.gcs.db.component.DB2VisitedPool

public class DB2VisitedPool
extends DB2Pool

DB2VisitedPool represents all the URLs in the database which have already been visited.

These URL records have one of the following states:

state            state_id
TOBESUMMARIZED   2
CRAWLFAILED      3
SUMMARYFAILED    4
SUMMARIZED       5

and satisfy the following SQL query:

 SELECT * 
    FROM urlpoolstable
    WHERE urlpoolstable.state_id>1
 

This class calls getSQLPredicate() in the constructor to build its SQL statements. Extending classes may override getSQLPredicate(), insert(), and contains() to modify the definition of the visited pool.
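
The relationship between the predicate and the generated statements can be sketched as follows. This is a minimal illustration only, not the actual DB2Pool source: the class names PoolSketch, VisitedPoolSketch, and SummarizedPoolSketch are hypothetical, and the signature of getSQLPredicate() (protected, returning the text placed after WHERE) is an assumption, since this page does not document it.

```java
/** Hypothetical stand-in for DB2Pool; the real class is not reproduced here. */
abstract class PoolSketch {
    private final String sqlSelect;
    private final String sqlCount;

    PoolSketch() {
        // The predicate is read once, during construction, as described above.
        String where = " WHERE " + getSQLPredicate();
        sqlSelect = "SELECT * FROM urlpoolstable" + where;
        sqlCount = "SELECT COUNT(*) FROM urlpoolstable" + where;
    }

    /** Assumed signature: returns the condition placed after WHERE. */
    protected abstract String getSQLPredicate();

    public String getSQLSelect() { return sqlSelect; }

    public String getSQLCount() { return sqlCount; }
}

/** Mirrors the documented visited-pool definition: state_id>1. */
class VisitedPoolSketch extends PoolSketch {
    @Override
    protected String getSQLPredicate() { return "state_id>1"; }
}

/** Hypothetical subclass that narrows the pool to SUMMARIZED records (state_id 5). */
class SummarizedPoolSketch extends VisitedPoolSketch {
    @Override
    protected String getSQLPredicate() { return "state_id=5"; }
}
```

Because the predicate is called from the superclass constructor, an override of this kind should return a value that does not depend on the subclass's own instance fields, which are not yet initialized at that point.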


Fields inherited from class com.ibm.gcs.db.component.DB2Pool
debug
 
Constructor Summary
DB2VisitedPool()
           
 
Method Summary
 boolean contains(DB2URLContainer urlC, Transaction t)
          Checks whether the specified URL has been visited.
 java.lang.String getSQLCount()
          Return a SQL SELECT COUNT(*) statement.
 java.lang.String getSQLSelect()
          Return the SQL SELECT statement.
 void insert(DB2URLContainer urlC)
          Updates the state of the in-memory URL object to mark it as belonging to the DB2VisitedPool, without saving the change to the database.
static void main(java.lang.String[] args)
          Simple test.
 
Methods inherited from class com.ibm.gcs.db.component.DB2Pool
getURLContainers, num, toString
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DB2VisitedPool

public DB2VisitedPool()
Method Detail

contains

public boolean contains(DB2URLContainer urlC,
                        Transaction t)
                 throws TransactionException
Checks whether the specified URL has been visited.
Overrides:
contains in class DB2Pool
Parameters:
urlC - The DB2URLContainer to check.
t - The transaction object for DB2 access.
Returns:
true if the URL container has been crawled, false otherwise.
Throws:
TransactionException - on failed SQL execution.
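
The membership rule that contains() tests can be illustrated in memory. The sketch below is not the real implementation, which queries DB2 through the Transaction; the enum VisitedStateSketch and its methods are hypothetical names that restate the state table and the state_id>1 condition from the class description. Note that a URL whose crawl failed (CRAWLFAILED) still counts as visited under this definition.

```java
/** Hypothetical enum restating the documented states and their state_id values. */
enum VisitedStateSketch {
    TOBESUMMARIZED(2),
    CRAWLFAILED(3),
    SUMMARYFAILED(4),
    SUMMARIZED(5);

    private final int stateId;

    VisitedStateSketch(int stateId) {
        this.stateId = stateId;
    }

    int stateId() {
        return stateId;
    }

    /** Same condition as the documented SQL predicate: state_id>1. */
    static boolean isVisited(int stateId) {
        return stateId > 1;
    }

    /** Returns the matching documented state, or null if the id is not in the table. */
    static VisitedStateSketch fromId(int stateId) {
        for (VisitedStateSketch s : values()) {
            if (s.stateId == stateId) {
                return s;
            }
        }
        return null;
    }
}
```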

insert

public void insert(DB2URLContainer urlC)
Updates the state of the in-memory URL object to mark it as belonging to the DB2VisitedPool, without saving the change to the database. (The URL object must be saved explicitly.)
Overrides:
insert in class DB2Pool
Parameters:
urlC - The DB2URLContainer whose state is updated.
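
The distinction between updating the object and saving it can be sketched with stand-in types. Everything here is hypothetical: UrlRecordSketch, InMemoryStoreSketch, and VisitedInsertSketch are illustrative names, the map stands in for the database, and the choice of state_id 2 (TOBESUMMARIZED) as the state assigned on insert is an assumption, since this page does not say which visited state insert() sets.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for DB2URLContainer: a URL plus its in-memory state. */
class UrlRecordSketch {
    final String url;
    int stateId;

    UrlRecordSketch(String url, int stateId) {
        this.url = url;
        this.stateId = stateId;
    }
}

/** Hypothetical stand-in for the database table, keyed by URL. */
class InMemoryStoreSketch {
    private final Map<String, Integer> rows = new HashMap<String, Integer>();

    /** Explicit save: copies the record's current state into the store. */
    void save(UrlRecordSketch record) {
        rows.put(record.url, record.stateId);
    }

    /** Returns the stored state_id, or null if the URL was never saved. */
    Integer storedState(String url) {
        return rows.get(url);
    }
}

class VisitedInsertSketch {
    /** Assumption: insert assigns TOBESUMMARIZED (2); the real value is not documented here. */
    static final int ASSUMED_VISITED_STATE = 2;

    /** Changes only the in-memory object; nothing is written to the store. */
    static void insert(UrlRecordSketch record) {
        record.stateId = ASSUMED_VISITED_STATE;
    }
}
```

In this sketch, a record saved with state_id 1 and then passed to insert() still reads as 1 from the store until save() is called again, which is the behavior the parenthetical note above warns about.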

getSQLSelect

public java.lang.String getSQLSelect()
Return the SQL SELECT statement. The statement has the form:
      SELECT * 
      FROM urlpoolstable
      WHERE state_id>1
 
Overrides:
getSQLSelect in class DB2Pool
Returns:
the SQL select statement

getSQLCount

public java.lang.String getSQLCount()
Return a SQL SELECT COUNT(*) statement. The statement has the form:
      SELECT COUNT(*) 
      FROM urlpoolstable
      WHERE state_id>1 
 
Overrides:
getSQLCount in class DB2Pool

main

public static void main(java.lang.String[] args)
Simple test.

EIP Web Crawler APIs

(c) Copyright International Business Machines Corporation 1996, 2002. IBM Corp. All rights reserved.