org.apache.nutch.searcher
Class NutchBean

java.lang.Object
  extended by org.apache.nutch.searcher.NutchBean
All Implemented Interfaces:
Closeable, VersionedProtocol, HitContent, HitDetailer, HitInlinks, HitSummarizer, RPCSearchBean, RPCSegmentBean, SearchBean, Searcher, SegmentBean

public class NutchBean
extends Object
implements SearchBean, RPCSearchBean, SegmentBean, RPCSegmentBean, HitInlinks, Closeable

One-stop shopping for search-related functionality.

Version:
$Id: NutchBean.java 925179 2010-03-19 11:34:33Z ab $

Nested Class Summary
static class NutchBean.NutchBeanConstructor
          Responsible for constructing a NutchBean singleton instance and caching it in the servlet context.
 
Field Summary
static String KEY
           
static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
NutchBean(Configuration conf)
           
NutchBean(Configuration conf, Path dir)
          Construct in a named directory.
 
Method Summary
 void close()
           
static NutchBean get(javax.servlet.ServletContext app, Configuration conf)
          Returns the cached instance in the servlet context.
 String[] getAnchors(HitDetails hit)
          Returns the anchors of a hit document.
 byte[] getContent(HitDetails hit)
          Returns the content of a hit document.
 HitDetails getDetails(Hit hit)
          Returns the details for a hit document.
 HitDetails[] getDetails(Hit[] hits)
          Returns the details for a set of hits.
 String getExplanation(Query query, Hit hit)
          Return an HTML-formatted explanation of how a query scored.
 long getFetchDate(HitDetails hit)
          Returns the fetch date of a hit document.
 Inlinks getInlinks(HitDetails hit)
          Return the inlinks of a hit document.
 ParseData getParseData(HitDetails hit)
          Returns the ParseData of a hit document.
 ParseText getParseText(HitDetails hit)
          Returns the ParseText of a hit document.
 long getProtocolVersion(String className, long clientVersion)
           
 String[] getSegmentNames()
           
 Summary[] getSummary(HitDetails[] hits, Query query)
          Returns summaries for a set of details.
 Summary getSummary(HitDetails hit, Query query)
          Returns a summary for the given hit details.
static void main(String[] args)
          For debugging.
 boolean ping()
           
static List<InetSocketAddress> readAddresses(Path path, Configuration conf)
           
static List<String> readConfig(Path path, Configuration conf)
           
 Hits search(Query query)
          Return the top-scoring hits for a query.
 Hits search(Query query, int numHits)
          Deprecated. Since 1.1; use search(Query) instead.
 Hits search(Query query, int numHits, int maxHitsPerDup)
          Deprecated. Since 1.1; use search(Query) instead.
 Hits search(Query query, int numHits, int maxHitsPerDup, String dedupField)
          Deprecated. Since 1.1; use search(Query) instead.
 Hits search(Query query, int numHits, int maxHitsPerDup, String dedupField, String sortField, boolean reverse)
          Deprecated. Since 1.1; use search(Query) instead.
 Hits search(Query query, int numHits, String dedupField, String sortField, boolean reverse)
          Deprecated. Since 1.1; use search(Query) instead.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

KEY

public static final String KEY
See Also:
Constant Field Values
Constructor Detail

NutchBean

public NutchBean(Configuration conf)
          throws IOException
Parameters:
conf - the configuration to use
Throws:
IOException

NutchBean

public NutchBean(Configuration conf,
                 Path dir)
          throws IOException
Construct in a named directory.

Parameters:
conf - the configuration to use
dir - the named directory to construct the bean in
Throws:
IOException
Method Detail

get

public static NutchBean get(javax.servlet.ServletContext app,
                            Configuration conf)
                     throws IOException
Returns the cached instance in the servlet context.

Throws:
IOException
See Also:
NutchBean.NutchBeanConstructor
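
The relationship between get and NutchBean.NutchBeanConstructor can be pictured as a store-once, look-up-later pattern. The sketch below is an illustration only, not the Nutch implementation: a plain Map stands in for the servlet context's attributes, and the class name ContextCacheSketch, the "example.bean" key, and the init helper are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class ContextCacheSketch {

    // Hypothetical attribute key; the real value of NutchBean.KEY is not shown on this page.
    public static final String KEY = "example.bean";

    /** Startup step: builds the shared instance once and stores it under KEY. */
    public static <T> void init(Map<String, Object> attributes, Supplier<T> factory) {
        // computeIfAbsent only invokes the factory when nothing is cached yet.
        attributes.computeIfAbsent(KEY, k -> factory.get());
    }

    /** Lookup step: returns the cached instance, or null if init has not run. */
    public static Object get(Map<String, Object> attributes) {
        return attributes.get(KEY);
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new ConcurrentHashMap<>();
        init(attributes, StringBuilder::new);
        Object first = get(attributes);
        Object second = get(attributes);
        // Both lookups return the same cached object.
        System.out.println(first == second);
    }
}
```

Keeping construction separate from lookup means every caller shares one instance, which matters when the object is expensive to build.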

readAddresses

public static List<InetSocketAddress> readAddresses(Path path,
                                                    Configuration conf)
                                             throws IOException
Throws:
IOException

readConfig

public static List<String> readConfig(Path path,
                                      Configuration conf)
                               throws IOException
Throws:
IOException

getSegmentNames

public String[] getSegmentNames()
                         throws IOException
Specified by:
getSegmentNames in interface SegmentBean
Throws:
IOException

search

public Hits search(Query query,
                   int numHits)
            throws IOException
Deprecated. Since 1.1; use search(Query) instead.

Throws:
IOException

search

public Hits search(Query query,
                   int numHits,
                   String dedupField,
                   String sortField,
                   boolean reverse)
            throws IOException
Deprecated. Since 1.1; use search(Query) instead.

Description copied from interface: Searcher
Return the top-scoring hits for a query.

Specified by:
search in interface Searcher
Throws:
IOException

search

public Hits search(Query query)
            throws IOException
Description copied from interface: Searcher
Return the top-scoring hits for a query.

Specified by:
search in interface Searcher
Throws:
IOException

search

public Hits search(Query query,
                   int numHits,
                   int maxHitsPerDup)
            throws IOException
Deprecated. Since 1.1; use search(Query) instead.

Search for pages matching a query, eliminating excessive hits from the same site. Hits after the first maxHitsPerDup from the same site are removed from results. The remaining hits have Hit.moreFromDupExcluded() set.

If maxHitsPerDup is zero then all hits are returned.

Parameters:
query - query
numHits - number of requested hits
maxHitsPerDup - the maximum hits returned with matching values, or zero
Returns:
the matching hits
Throws:
IOException
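
The per-site limit described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Nutch code: SimpleHit, dedupValue, and limitPerDup are hypothetical names, the input is assumed to be already ranked, and the sketch flags every kept hit for a value once a later hit with that value is dropped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {

    /** Hypothetical stand-in for a search hit; this is not the Nutch Hit class. */
    public static final class SimpleHit {
        public final String url;
        public final String dedupValue;      // e.g. the site the page belongs to
        public boolean moreFromDupExcluded;  // true once later hits with this value were dropped

        public SimpleHit(String url, String dedupValue) {
            this.url = url;
            this.dedupValue = dedupValue;
        }
    }

    /**
     * Keeps at most maxHitsPerDup hits per dedup value, preserving rank order.
     * Zero disables the limit; negative values are treated the same way (an assumption).
     */
    public static List<SimpleHit> limitPerDup(List<SimpleHit> ranked, int maxHitsPerDup) {
        if (maxHitsPerDup <= 0) {
            return new ArrayList<>(ranked);
        }
        Map<String, List<SimpleHit>> keptByValue = new HashMap<>();
        List<SimpleHit> result = new ArrayList<>();
        for (SimpleHit hit : ranked) {
            List<SimpleHit> kept =
                keptByValue.computeIfAbsent(hit.dedupValue, v -> new ArrayList<>());
            if (kept.size() < maxHitsPerDup) {
                kept.add(hit);
                result.add(hit);
            } else {
                // The hit is dropped; flag the hits already kept for this value.
                for (SimpleHit earlier : kept) {
                    earlier.moreFromDupExcluded = true;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<SimpleHit> ranked = new ArrayList<>();
        ranked.add(new SimpleHit("http://a.example/1", "a.example"));
        ranked.add(new SimpleHit("http://a.example/2", "a.example"));
        ranked.add(new SimpleHit("http://b.example/1", "b.example"));
        ranked.add(new SimpleHit("http://a.example/3", "a.example"));
        for (SimpleHit hit : limitPerDup(ranked, 2)) {
            System.out.println(hit.url + " moreFromDupExcluded=" + hit.moreFromDupExcluded);
        }
    }
}
```

With a limit of 2, the third a.example hit is dropped and the two kept a.example hits are flagged, while the single b.example hit is left unflagged.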

search

public Hits search(Query query,
                   int numHits,
                   int maxHitsPerDup,
                   String dedupField)
            throws IOException
Deprecated. Since 1.1; use search(Query) instead.

Search for pages matching a query, eliminating excessive hits with matching values for a named field. Hits after the first maxHitsPerDup with the same value of dedupField are removed from results. The remaining hits have Hit.moreFromDupExcluded() set.

If maxHitsPerDup is zero then all hits are returned.

Parameters:
query - query
numHits - number of requested hits
maxHitsPerDup - the maximum hits returned with matching values, or zero
dedupField - field name to check for duplicates
Returns:
the matching hits
Throws:
IOException

search

public Hits search(Query query,
                   int numHits,
                   int maxHitsPerDup,
                   String dedupField,
                   String sortField,
                   boolean reverse)
            throws IOException
Deprecated. Since 1.1; use search(Query) instead.

Search for pages matching a query, eliminating excessive hits with matching values for a named field. Hits after the first maxHitsPerDup with the same value of dedupField are removed from results. The remaining hits have Hit.moreFromDupExcluded() set.

If maxHitsPerDup is zero then all hits are returned.

Parameters:
query - query
numHits - number of requested hits
maxHitsPerDup - the maximum hits returned with matching values, or zero
dedupField - field name to check for duplicates
sortField - field to sort on, or null for no sorting
reverse - true to sort by sortField in reverse order
Returns:
the matching hits
Throws:
IOException
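
The sortField and reverse parameters can be illustrated with a small stand-alone sketch. It is not how Nutch sorts internally: rows are modelled as plain maps, values are compared as strings, and the names SortSketch and sortBy are hypothetical. A null sortField is assumed to leave the incoming order untouched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SortSketch {

    /**
     * Returns a sorted copy of the rows. A null sortField keeps the incoming
     * order; reverse flips the ordering of the chosen field.
     */
    public static List<Map<String, String>> sortBy(
            List<Map<String, String>> rows, String sortField, boolean reverse) {
        List<Map<String, String>> sorted = new ArrayList<>(rows);
        if (sortField == null) {
            return sorted;
        }
        // Rows missing the field sort as the empty string (an assumption of this sketch).
        Comparator<Map<String, String>> byField =
            Comparator.comparing(row -> row.getOrDefault(sortField, ""));
        sorted.sort(reverse ? byField.reversed() : byField);
        return sorted;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(Collections.singletonMap("date", "2010-03-19"));
        rows.add(Collections.singletonMap("date", "2009-01-05"));
        rows.add(Collections.singletonMap("date", "2010-01-01"));
        System.out.println(sortBy(rows, "date", false));
        System.out.println(sortBy(rows, "date", true));
    }
}
```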

getExplanation

public String getExplanation(Query query,
                             Hit hit)
                      throws IOException
Description copied from interface: Searcher
Return an HTML-formatted explanation of how a query scored.

Specified by:
getExplanation in interface Searcher
Throws:
IOException

getDetails

public HitDetails getDetails(Hit hit)
                      throws IOException
Description copied from interface: HitDetailer
Returns the details for a hit document.

Specified by:
getDetails in interface HitDetailer
Throws:
IOException

getDetails

public HitDetails[] getDetails(Hit[] hits)
                        throws IOException
Description copied from interface: HitDetailer
Returns the details for a set of hits. Hook for parallel IPC calls.

Specified by:
getDetails in interface HitDetailer
Throws:
IOException

getSummary

public Summary getSummary(HitDetails hit,
                          Query query)
                   throws IOException
Description copied from interface: HitSummarizer
Returns a summary for the given hit details.

Specified by:
getSummary in interface HitSummarizer
Parameters:
hit - the details of the hit to be summarized
query - indicates what should be highlighted in the summary text
Throws:
IOException

getSummary

public Summary[] getSummary(HitDetails[] hits,
                            Query query)
                     throws IOException
Description copied from interface: HitSummarizer
Returns summaries for a set of details. Hook for parallel IPC calls.

Specified by:
getSummary in interface HitSummarizer
Parameters:
hits - the details of hits to be summarized
query - indicates what should be highlighted in the summary text
Throws:
IOException

getContent

public byte[] getContent(HitDetails hit)
                  throws IOException
Description copied from interface: HitContent
Returns the content of a hit document.

Specified by:
getContent in interface HitContent
Throws:
IOException

getParseData

public ParseData getParseData(HitDetails hit)
                       throws IOException
Description copied from interface: HitContent
Returns the ParseData of a hit document.

Specified by:
getParseData in interface HitContent
Throws:
IOException

getParseText

public ParseText getParseText(HitDetails hit)
                       throws IOException
Description copied from interface: HitContent
Returns the ParseText of a hit document.

Specified by:
getParseText in interface HitContent
Throws:
IOException

getAnchors

public String[] getAnchors(HitDetails hit)
                    throws IOException
Description copied from interface: HitInlinks
Returns the anchors of a hit document.

Specified by:
getAnchors in interface HitInlinks
Throws:
IOException

getInlinks

public Inlinks getInlinks(HitDetails hit)
                   throws IOException
Description copied from interface: HitInlinks
Return the inlinks of a hit document.

Specified by:
getInlinks in interface HitInlinks
Throws:
IOException

getFetchDate

public long getFetchDate(HitDetails hit)
                  throws IOException
Description copied from interface: HitContent
Returns the fetch date of a hit document.

Specified by:
getFetchDate in interface HitContent
Throws:
IOException

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Throws:
IOException

ping

public boolean ping()
Specified by:
ping in interface SearchBean

main

public static void main(String[] args)
                 throws Exception
For debugging.

Throws:
Exception

getProtocolVersion

public long getProtocolVersion(String className,
                               long clientVersion)
                        throws IOException
Specified by:
getProtocolVersion in interface VersionedProtocol
Throws:
IOException


Copyright © 2006 The Apache Software Foundation