Uses of Class
org.apache.nutch.indexer.NutchDocument

Packages that use NutchDocument
org.apache.nutch.analysis.lang Text document language identifier. 
org.apache.nutch.indexer Maintain Lucene full-text indexes. 
org.apache.nutch.indexer.anchor An indexing plugin for inbound anchor text. 
org.apache.nutch.indexer.basic A basic indexing plugin. 
org.apache.nutch.indexer.elastic   
org.apache.nutch.indexer.feed   
org.apache.nutch.indexer.more A more indexing plugin. 
org.apache.nutch.indexer.solr   
org.apache.nutch.indexer.subcollection   
org.apache.nutch.indexer.tld Top Level Domain Indexing plugin. 
org.apache.nutch.microformats.reltag A microformats Rel-Tag Parser/Indexer/Querier plugin. 
org.apache.nutch.scoring   
org.apache.nutch.scoring.link   
org.apache.nutch.scoring.opic   
org.apache.nutch.scoring.tld Top Level Domain Scoring plugin. 
org.creativecommons.nutch Sample plugins that parse and index Creative Commons metadata. 
 

Uses of NutchDocument in org.apache.nutch.analysis.lang
 

Methods in org.apache.nutch.analysis.lang that return NutchDocument
 NutchDocument LanguageIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Methods in org.apache.nutch.analysis.lang with parameters of type NutchDocument
 NutchDocument LanguageIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Uses of NutchDocument in org.apache.nutch.indexer
 

Methods in org.apache.nutch.indexer that return NutchDocument
 NutchDocument IndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          Adds fields or otherwise modifies the document that will be indexed for a parse.
 NutchDocument IndexingFilters.filter(NutchDocument doc, String url, WebPage page)
          Run all defined filters.
 NutchDocument IndexUtil.index(String key, WebPage page)
          Index a webpage.
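
The filter methods listed above share one shape: each takes a document and returns a document, and IndexingFilters.filter is described as running all defined filters. A minimal sketch of that chaining follows. It is not the Nutch implementation: Doc and Filter are hypothetical stand-ins for NutchDocument and IndexingFilter, the WebPage argument is omitted, and treating a null result as "skip this document" is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FilterChainSketch {

    /** Hypothetical stand-in for NutchDocument: field name mapped to its values. */
    static final class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Hypothetical stand-in for IndexingFilter; the page argument is left out. */
    interface Filter {
        Doc filter(Doc doc, String url);
    }

    /** Runs the filters in order. Assumption: a null result means "drop the document". */
    static Doc runAll(List<Filter> filters, Doc doc, String url) {
        for (Filter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null; // stop early, nothing left to pass on
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Filter> filters = new ArrayList<>();
        // First filter adds a field, second one rejects URLs ending in ".tmp".
        filters.add((doc, url) -> { doc.add("url", url); return doc; });
        filters.add((doc, url) -> url.endsWith(".tmp") ? null : doc);

        Doc kept = runAll(filters, new Doc(), "http://example.com/");
        Doc dropped = runAll(filters, new Doc(), "http://example.com/a.tmp");
        System.out.println(kept.fields); // prints {url=[http://example.com/]}
        System.out.println(dropped);     // prints null
    }
}
```

Returning the document from every filter lets a filter either modify it in place or replace it, which matches the "adds fields or otherwise modifies" wording above.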
 

Methods in org.apache.nutch.indexer that return types with arguments of type NutchDocument
 RecordWriter<String,NutchDocument> IndexerOutputFormat.getRecordWriter(TaskAttemptContext job)
           
 

Methods in org.apache.nutch.indexer with parameters of type NutchDocument
 NutchDocument IndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          Adds fields or otherwise modifies the document that will be indexed for a parse.
 NutchDocument IndexingFilters.filter(NutchDocument doc, String url, WebPage page)
          Run all defined filters.
 void NutchIndexWriter.write(NutchDocument doc)
           
 

Uses of NutchDocument in org.apache.nutch.indexer.anchor
 

Methods in org.apache.nutch.indexer.anchor that return NutchDocument
 NutchDocument AnchorIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          The AnchorIndexingFilter filter object, which supports a boolean configuration setting for the deduplication of anchors.
 

Methods in org.apache.nutch.indexer.anchor with parameters of type NutchDocument
 NutchDocument AnchorIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          The AnchorIndexingFilter filter object, which supports a boolean configuration setting for the deduplication of anchors.
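
To illustrate what a boolean deduplication switch for anchors could look like, here is a minimal sketch. The class and method names are hypothetical, and comparing anchors case-insensitively is an assumption; the description above does not say how duplicates are detected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnchorDedupSketch {

    /**
     * Returns the anchors to index. When deduplicate is true, anchors that
     * differ only in letter case are kept once (assumed comparison rule).
     */
    static List<String> anchors(List<String> inbound, boolean deduplicate) {
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String anchor : inbound) {
            // Set.add returns false when the lowercased anchor was already seen.
            if (deduplicate && !seen.add(anchor.toLowerCase(Locale.ROOT))) {
                continue;
            }
            out.add(anchor);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> inbound = Arrays.asList("Home", "home", "Docs");
        System.out.println(anchors(inbound, true));  // prints [Home, Docs]
        System.out.println(anchors(inbound, false)); // prints [Home, home, Docs]
    }
}
```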
 

Uses of NutchDocument in org.apache.nutch.indexer.basic
 

Methods in org.apache.nutch.indexer.basic that return NutchDocument
 NutchDocument BasicIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          The BasicIndexingFilter filter object, which supports a configurable limit on the number of characters permitted in the title; see indexer.max.title.length in nutch-default.xml.
 

Methods in org.apache.nutch.indexer.basic with parameters of type NutchDocument
 NutchDocument BasicIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          The BasicIndexingFilter filter object, which supports a configurable limit on the number of characters permitted in the title; see indexer.max.title.length in nutch-default.xml.
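
The title length limit mentioned above can be pictured as a simple truncation. This is a sketch only: limitTitle is a hypothetical helper, the limit of 12 is an arbitrary example value rather than the configured default, and the real filter may handle edge cases differently.

```java
public class TitleLimitSketch {

    /** Truncates the title to at most maxLength characters (hypothetical helper). */
    static String limitTitle(String title, int maxLength) {
        if (title.length() > maxLength) {
            return title.substring(0, maxLength);
        }
        return title;
    }

    public static void main(String[] args) {
        // In a real setup the limit would be read from indexer.max.title.length.
        System.out.println(limitTitle("Apache Nutch crawler", 12)); // prints Apache Nutch
        System.out.println(limitTitle("Short", 12));                // prints Short
    }
}
```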
 

Uses of NutchDocument in org.apache.nutch.indexer.elastic
 

Methods in org.apache.nutch.indexer.elastic with parameters of type NutchDocument
 void ElasticWriter.write(NutchDocument doc)
           
 

Uses of NutchDocument in org.apache.nutch.indexer.feed
 

Methods in org.apache.nutch.indexer.feed that return NutchDocument
 NutchDocument FeedIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)
          Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch index.
 

Methods in org.apache.nutch.indexer.feed with parameters of type NutchDocument
 NutchDocument FeedIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)
          Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch index.
 

Uses of NutchDocument in org.apache.nutch.indexer.more
 

Methods in org.apache.nutch.indexer.more that return NutchDocument
 NutchDocument MoreIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Methods in org.apache.nutch.indexer.more with parameters of type NutchDocument
 NutchDocument MoreIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Uses of NutchDocument in org.apache.nutch.indexer.solr
 

Methods in org.apache.nutch.indexer.solr with parameters of type NutchDocument
 void SolrWriter.write(NutchDocument doc)
           
 

Uses of NutchDocument in org.apache.nutch.indexer.subcollection
 

Methods in org.apache.nutch.indexer.subcollection that return NutchDocument
 NutchDocument SubcollectionIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Methods in org.apache.nutch.indexer.subcollection with parameters of type NutchDocument
 NutchDocument SubcollectionIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Uses of NutchDocument in org.apache.nutch.indexer.tld
 

Methods in org.apache.nutch.indexer.tld that return NutchDocument
 NutchDocument TLDIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Methods in org.apache.nutch.indexer.tld with parameters of type NutchDocument
 NutchDocument TLDIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Uses of NutchDocument in org.apache.nutch.microformats.reltag
 

Methods in org.apache.nutch.microformats.reltag that return NutchDocument
 NutchDocument RelTagIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          The RelTagIndexingFilter filter object.
 

Methods in org.apache.nutch.microformats.reltag with parameters of type NutchDocument
 NutchDocument RelTagIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
          The RelTagIndexingFilter filter object.
 

Uses of NutchDocument in org.apache.nutch.scoring
 

Methods in org.apache.nutch.scoring with parameters of type NutchDocument
 float ScoringFilter.indexerScore(String url, NutchDocument doc, WebPage page, float initScore)
          This method calculates a Lucene document boost.
 float ScoringFilters.indexerScore(String url, NutchDocument doc, WebPage row, float initScore)
           
 

Uses of NutchDocument in org.apache.nutch.scoring.link
 

Methods in org.apache.nutch.scoring.link with parameters of type NutchDocument
 float LinkAnalysisScoringFilter.indexerScore(String url, NutchDocument doc, WebPage page, float initScore)
           
 

Uses of NutchDocument in org.apache.nutch.scoring.opic
 

Methods in org.apache.nutch.scoring.opic with parameters of type NutchDocument
 float OPICScoringFilter.indexerScore(String url, NutchDocument doc, WebPage row, float initScore)
          Dampen the boost value by scorePower.
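
One way to read "dampen the boost value by scorePower" is to raise the page score to the power scorePower before multiplying by initScore, so that an exponent below 1 compresses large scores. The sketch below assumes that reading; it uses plain floats instead of WebPage and NutchDocument, and the method body is an illustration, not a copy of OPICScoringFilter.

```java
public class OpicBoostSketch {

    /**
     * Assumed dampening rule: boost = pageScore^scorePower * initScore.
     * pageScore stands in for the score stored on the page.
     */
    static float indexerScore(float pageScore, float scorePower, float initScore) {
        return (float) Math.pow(pageScore, scorePower) * initScore;
    }

    public static void main(String[] args) {
        // With scorePower = 0.5 a page score of 16 contributes a boost of 4.
        System.out.println(indexerScore(16.0f, 0.5f, 1.0f)); // prints 4.0
        System.out.println(indexerScore(16.0f, 0.5f, 2.0f)); // prints 8.0
    }
}
```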
 

Uses of NutchDocument in org.apache.nutch.scoring.tld
 

Methods in org.apache.nutch.scoring.tld with parameters of type NutchDocument
 float TLDScoringFilter.indexerScore(String url, NutchDocument doc, WebPage page, float initScore)
           
 

Uses of NutchDocument in org.creativecommons.nutch
 

Methods in org.creativecommons.nutch that return NutchDocument
 NutchDocument CCIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 

Methods in org.creativecommons.nutch with parameters of type NutchDocument
 void CCIndexingFilter.addUrlFeatures(NutchDocument doc, String urlString)
          Add the features represented by a license URL.
 NutchDocument CCIndexingFilter.filter(NutchDocument doc, String url, WebPage page)
           
 



Copyright © 2012 The Apache Software Foundation