Uses of Class org.apache.nutch.protocol.Content (apache-nutch 1.8 API)

Packages that use Content
Package	Description
org.apache.nutch.analysis.lang	Text document language identifier.
org.apache.nutch.crawl	Crawl control code.
org.apache.nutch.microformats.reltag	A microformats Rel-Tag Parser/Indexer/Querier plugin.
org.apache.nutch.parse
org.apache.nutch.parse.ext
org.apache.nutch.parse.feed
org.apache.nutch.parse.headings
org.apache.nutch.parse.html	An HTML document parsing plugin.
org.apache.nutch.parse.js
org.apache.nutch.parse.swf
org.apache.nutch.parse.tika
org.apache.nutch.parse.zip
org.apache.nutch.protocol
org.apache.nutch.protocol.file	Protocol plugin which supports retrieving local file resources.
org.apache.nutch.protocol.ftp	Protocol plugin which supports retrieving documents via the ftp protocol.
org.apache.nutch.scoring
org.apache.nutch.scoring.link
org.apache.nutch.scoring.opic
org.apache.nutch.scoring.tld	Top Level Domain Scoring plugin.
org.apache.nutch.scoring.urlmeta	URL Meta Tag Scoring Plugin
org.apache.nutch.segment
org.apache.nutch.util
org.creativecommons.nutch	Sample plugins that parse and index Creative Commons metadata.

Uses of Content in org.apache.nutch.analysis.lang

Methods in org.apache.nutch.analysis.lang with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	HTMLLanguageParser.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)` Scan the HTML document looking at possible indications of content language.

Uses of Content in org.apache.nutch.crawl

Methods in org.apache.nutch.crawl with parameters of type Content
Modifier and Type	Method and Description
`abstract byte[]`	Signature.`calculate(Content content, Parse parse)`
`byte[]`	MD5Signature.`calculate(Content content, Parse parse)`
`byte[]`	TextProfileSignature.`calculate(Content content, Parse parse)`

Uses of Content in org.apache.nutch.microformats.reltag

Methods in org.apache.nutch.microformats.reltag with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	RelTagParser.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)` Scan the HTML document looking for possible rel-tags.

Uses of Content in org.apache.nutch.parse

Methods in org.apache.nutch.parse with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	MetaTagsParser.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)`
`ParseResult`	HtmlParseFilters.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)` Run all defined filters.
`ParseResult`	HtmlParseFilter.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)` Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page.
`ParseResult`	Parser.`getParse(Content c)` This method parses the given content and returns a map of <key, parse> pairs.
`static boolean`	ParseSegment.`isTruncated(Content content)` Checks if the page's content is truncated.
`void`	ParseSegment.`map(org.apache.hadoop.io.WritableComparable<?> key, Content content, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,ParseImpl> output, org.apache.hadoop.mapred.Reporter reporter)`
`ParseResult`	ParseUtil.`parse(Content content)` Performs a parse by iterating through a List of preferred `Parser`s until a successful parse is performed and a `Parse` object is returned.
`ParseResult`	ParseUtil.`parseByExtensionId(String extId, Content content)` Method parses a `Content` object using the `Parser` specified by the parameter `extId`, i.e., the Parser's extension ID.
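
The fallback behaviour described for ParseUtil.`parse(Content content)` can be pictured with a minimal sketch. Everything named below (`ParserFallbackSketch`, `SimpleParser`, `firstSuccessful`) is a hypothetical stand-in and not part of Nutch: plain strings take the place of `Content` and `ParseResult`, and a `null` return is assumed to signal a failed parse.

```java
import java.util.Arrays;
import java.util.List;

public class ParserFallbackSketch {

    /** Hypothetical stand-in for a parser: returns null when it cannot parse. */
    public interface SimpleParser {
        String parse(String content);
    }

    /** Tries each parser in preference order; returns the first non-null result, or null. */
    public static String firstSuccessful(List<SimpleParser> parsers, String content) {
        for (SimpleParser parser : parsers) {
            String result = parser.parse(content);
            if (result != null) {
                return result; // stop at the first successful parse
            }
        }
        return null; // no parser in the list succeeded
    }

    public static void main(String[] args) {
        // Assumed example parsers: one that only accepts HTML, one that accepts anything.
        SimpleParser htmlOnly = c -> c.startsWith("<html") ? "html:" + c : null;
        SimpleParser catchAll = c -> "text:" + c;
        List<SimpleParser> preferred = Arrays.asList(htmlOnly, catchAll);

        System.out.println(firstSuccessful(preferred, "<html></html>"));
        System.out.println(firstSuccessful(preferred, "plain words"));
    }
}
```

The order of the list carries the preference, so a more specific parser placed first gets the chance to handle the content before a general one.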

Uses of Content in org.apache.nutch.parse.ext

Methods in org.apache.nutch.parse.ext with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	ExtParser.`getParse(Content content)`

Uses of Content in org.apache.nutch.parse.feed

Methods in org.apache.nutch.parse.feed with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	FeedParser.`getParse(Content content)` Parses the given feed, then extracts and parses all linked items within the feed, using the underlying ROME feed parsing library.

Uses of Content in org.apache.nutch.parse.headings

Methods in org.apache.nutch.parse.headings with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	HeadingsParseFilter.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)`

Uses of Content in org.apache.nutch.parse.html

Methods in org.apache.nutch.parse.html with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	HtmlParser.`getParse(Content content)`

Uses of Content in org.apache.nutch.parse.js

Methods in org.apache.nutch.parse.js with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	JSParseFilter.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)`
`ParseResult`	JSParseFilter.`getParse(Content c)`

Uses of Content in org.apache.nutch.parse.swf

Methods in org.apache.nutch.parse.swf with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	SWFParser.`getParse(Content content)`

Uses of Content in org.apache.nutch.parse.tika

Methods in org.apache.nutch.parse.tika with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	TikaParser.`getParse(Content content)`

Uses of Content in org.apache.nutch.parse.zip

Methods in org.apache.nutch.parse.zip with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	ZipParser.`getParse(Content content)`

Uses of Content in org.apache.nutch.protocol

Methods in org.apache.nutch.protocol that return Content
Modifier and Type	Method and Description
`Content`	ProtocolOutput.`getContent()`
`static Content`	Content.`read(DataInput in)`

Methods in org.apache.nutch.protocol with parameters of type Content
Modifier and Type	Method and Description
`void`	ProtocolOutput.`setContent(Content content)`

Constructors in org.apache.nutch.protocol with parameters of type Content
Constructor and Description
`ProtocolOutput(Content content)`
`ProtocolOutput(Content content, ProtocolStatus status)`

Uses of Content in org.apache.nutch.protocol.file

Methods in org.apache.nutch.protocol.file that return Content
Modifier and Type	Method and Description
`Content`	FileResponse.`toContent()`

Uses of Content in org.apache.nutch.protocol.ftp

Methods in org.apache.nutch.protocol.ftp that return Content
Modifier and Type	Method and Description
`Content`	FtpResponse.`toContent()`

Uses of Content in org.apache.nutch.scoring

Methods in org.apache.nutch.scoring with parameters of type Content
Modifier and Type	Method and Description
`void`	ScoringFilter.`passScoreAfterParsing(org.apache.hadoop.io.Text url, Content content, Parse parse)` Currently a part of score distribution is performed using only data coming from the parsing process.
`void`	AbstractScoringFilter.`passScoreAfterParsing(org.apache.hadoop.io.Text url, Content content, Parse parse)`
`void`	ScoringFilters.`passScoreAfterParsing(org.apache.hadoop.io.Text url, Content content, Parse parse)`
`void`	ScoringFilter.`passScoreBeforeParsing(org.apache.hadoop.io.Text url, CrawlDatum datum, Content content)` This method takes all relevant score information from the current datum (coming from a generated fetchlist) and stores it into `Content` metadata.
`void`	AbstractScoringFilter.`passScoreBeforeParsing(org.apache.hadoop.io.Text url, CrawlDatum datum, Content content)`
`void`	ScoringFilters.`passScoreBeforeParsing(org.apache.hadoop.io.Text url, CrawlDatum datum, Content content)`

Uses of Content in org.apache.nutch.scoring.link

Methods in org.apache.nutch.scoring.link with parameters of type Content
Modifier and Type	Method and Description
`void`	LinkAnalysisScoringFilter.`passScoreAfterParsing(org.apache.hadoop.io.Text url, Content content, Parse parse)`
`void`	LinkAnalysisScoringFilter.`passScoreBeforeParsing(org.apache.hadoop.io.Text url, CrawlDatum datum, Content content)`

Uses of Content in org.apache.nutch.scoring.opic

Methods in org.apache.nutch.scoring.opic with parameters of type Content
Modifier and Type	Method and Description
`void`	OPICScoringFilter.`passScoreAfterParsing(org.apache.hadoop.io.Text url, Content content, Parse parse)` Copy the value from Content metadata under Fetcher.SCORE_KEY to parseData.
`void`	OPICScoringFilter.`passScoreBeforeParsing(org.apache.hadoop.io.Text url, CrawlDatum datum, Content content)` Store a float value of CrawlDatum.getScore() under Fetcher.SCORE_KEY.
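
The two OPICScoringFilter descriptions above amount to a handoff through metadata: the score is written into the content metadata before parsing and copied into the parse data afterwards. A rough sketch of that flow follows, under stated assumptions: `ScoreHandoffSketch` is a hypothetical class, plain `Map<String, String>` objects stand in for the Nutch metadata containers, and the key string is invented for illustration (the real key is the constant `Fetcher.SCORE_KEY`).

```java
import java.util.HashMap;
import java.util.Map;

public class ScoreHandoffSketch {

    /** Hypothetical key for illustration only; Nutch uses Fetcher.SCORE_KEY. */
    public static final String SCORE_KEY = "example.score";

    /** Before parsing: store the datum's score in the content metadata as text. */
    public static void passScoreBeforeParsing(float datumScore, Map<String, String> contentMeta) {
        contentMeta.put(SCORE_KEY, Float.toString(datumScore));
    }

    /** After parsing: copy the stored score from content metadata into parse metadata. */
    public static void passScoreAfterParsing(Map<String, String> contentMeta,
                                             Map<String, String> parseMeta) {
        String score = contentMeta.get(SCORE_KEY);
        if (score != null) {
            parseMeta.put(SCORE_KEY, score);
        }
    }

    public static void main(String[] args) {
        Map<String, String> contentMeta = new HashMap<>();
        Map<String, String> parseMeta = new HashMap<>();

        passScoreBeforeParsing(1.5f, contentMeta);     // assumed example score
        passScoreAfterParsing(contentMeta, parseMeta);

        System.out.println(Float.parseFloat(parseMeta.get(SCORE_KEY)));
    }
}
```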

Uses of Content in org.apache.nutch.scoring.tld

Methods in org.apache.nutch.scoring.tld with parameters of type Content
Modifier and Type	Method and Description
`void`	TLDScoringFilter.`passScoreAfterParsing(org.apache.hadoop.io.Text url, Content content, Parse parse)`
`void`	TLDScoringFilter.`passScoreBeforeParsing(org.apache.hadoop.io.Text url, CrawlDatum datum, Content content)`

Uses of Content in org.apache.nutch.scoring.urlmeta

Methods in org.apache.nutch.scoring.urlmeta with parameters of type Content
Modifier and Type	Method and Description
`void`	URLMetaScoringFilter.`passScoreAfterParsing(org.apache.hadoop.io.Text url, Content content, Parse parse)` Takes the metadata, which was lumped inside the content, and replicates it within your parse data.
`void`	URLMetaScoringFilter.`passScoreBeforeParsing(org.apache.hadoop.io.Text url, CrawlDatum datum, Content content)` Takes the metadata, specified in your "urlmeta.tags" property, from the datum object and injects it into the content.

Uses of Content in org.apache.nutch.segment

Methods in org.apache.nutch.segment with parameters of type Content
Modifier and Type	Method and Description
`boolean`	SegmentMergeFilter.`filter(org.apache.hadoop.io.Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)` The filtering method which gets all information being merged for a given key (URL).
`boolean`	SegmentMergeFilters.`filter(org.apache.hadoop.io.Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)` Iterates over all `SegmentMergeFilter` extensions and if any of them returns false, it will return false as well.
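
The chaining rule described for SegmentMergeFilters.`filter(...)` (any single `false` makes the overall result `false`) can be sketched as below. This is an illustration, not the Nutch implementation: `MergeFilterChainSketch` and `filterAll` are hypothetical names, and each filter is reduced to a `Predicate<String>` over the key, whereas the real method also receives the crawl, content, and parse data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class MergeFilterChainSketch {

    /** Returns false as soon as any filter rejects the key; true if every filter accepts it. */
    public static boolean filterAll(List<Predicate<String>> filters, String key) {
        for (Predicate<String> filter : filters) {
            if (!filter.test(key)) {
                return false; // one rejection is enough to drop the entry
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed example filters over the URL key.
        Predicate<String> httpOnly = k -> k.startsWith("http");
        Predicate<String> noQuery = k -> !k.contains("?");
        List<Predicate<String>> filters = Arrays.asList(httpOnly, noQuery);

        System.out.println(filterAll(filters, "http://example.org/page"));
        System.out.println(filterAll(filters, "http://example.org/page?id=1"));
    }
}
```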

Uses of Content in org.apache.nutch.util

Methods in org.apache.nutch.util with parameters of type Content
Modifier and Type	Method and Description
`void`	EncodingDetector.`autoDetectClues(Content content, boolean filter)`
`String`	EncodingDetector.`guessEncoding(Content content, String defaultValue)` Guess the encoding with the previously specified list of clues.

Uses of Content in org.creativecommons.nutch

Methods in org.creativecommons.nutch with parameters of type Content
Modifier and Type	Method and Description
`ParseResult`	CCParseFilter.`filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)` Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page.
