Log of /lucene/nutch/trunk/build.xml
Revision 521933
Modified Fri Mar 23 22:59:01 2007 UTC (2 years, 8 months ago) by ab
File length: 23181 byte(s)
Previous revision: 517015
Upgrade to Hadoop 0.12.2 release.
Fix whitespace issues in platform name in bin/hadoop under Cygwin.
Replace deprecated method call.
Revision 495392
Modified Thu Jan 11 21:51:20 2007 UTC (2 years, 10 months ago) by ab
File length: 22180 byte(s)
Previous revision: 468672
Upgrade to Hadoop 0.10.1. HTTPClient is now a dependency - move it
to lib/ and remove it as a plugin.
Also add native Linux libraries for Hadoop compression, plus corresponding
logic in bin/nutch.
Hadoop uses larger buffers now - explicitly set large heap size for
JUnit tests. All tests should pass now.
Revision 405165
Modified Mon May 8 21:04:01 2006 UTC (3 years, 6 months ago) by jerome
File length: 21585 byte(s)
Previous revision: 397312
NUTCH-134 : Added a summarizer extension point and two extensions:
* summary-basic is the current Nutch implementation moved into a plugin
* summary-lucene is a raw version of a summarizer plugin based on the Lucene highlighter
Revision 394231
Modified Sat Apr 15 00:13:21 2006 UTC (3 years, 7 months ago) by jerome
File length: 21433 byte(s)
Previous revision: 394228
NUTCH-245 : Some minor fixes
- Added Apache License in DTD (?)
- Delete the org/apache/nutch/plugin/doc-files once javadoc task completed.
Revision 394228
Modified Fri Apr 14 23:57:24 2006 UTC (3 years, 7 months ago) by jerome
File length: 21323 byte(s)
Previous revision: 392377
NUTCH-245 : Added a DTD for Nutch Plugin Manifest
- Add a commented DTD in src
- Add the DTD in javadoc
- Change the implementation element structure : uses name-value parameters instead of proprietary attributes
- Fix unit tests regarding changes in DTD
- Fix the plugin.xml file in nutch plugins regarding changes in DTD
Revision 387655
Modified Tue Mar 21 22:35:20 2006 UTC (3 years, 8 months ago) by jerome
File length: 18490 byte(s)
Previous revision: 382948
Add lib-regex-filter and urlfilter-automaton to the list of javadoc packages.
Add lib-regex-filter and urlfilter-automaton to the list of deployed, tested and cleaned plugins.
Add the regular expression rule file property for urlfilter-automaton.
Revision 376485
Modified Thu Feb 9 23:20:28 2006 UTC (3 years, 9 months ago) by cutting
File length: 17502 byte(s)
Previous revision: 376089
Fix for NUTCH-209. Nutch now supplies all code to remote MapReduce daemons through a job jar file. So Hadoop daemons no longer need to be restarted when Nutch code changes.
Revision 376089
Modified Wed Feb 8 21:48:52 2006 UTC (3 years, 9 months ago) by jerome
File length: 16653 byte(s)
Previous revision: 376012
NUTCH-139
* Add standard metadata names
* Syntax tolerant metadata names container
* Review usage of metadata among plugins
Revision 189627
Modified Wed Jun 8 20:07:54 2005 UTC (4 years, 5 months ago) by ab
File length: 15284 byte(s)
Previous revision: 180146
Add local resources for XSLT tasks. Now the build of web pages can be
completed when offline, and much faster at that.
Patch submitted by Piotr Kosiorowski.
Revision 180146
Modified Sun Jun 5 20:30:05 2005 UTC (4 years, 5 months ago) by ab
File length: 14955 byte(s)
Previous revision: 179640
This changes the build process to minimize dependency on Unix/Cygwin
utilities, and on availability of symbolic links.
Patch submitted by Dawid Weiss.
Revision 179436
Modified Wed Jun 1 22:20:01 2005 UTC (4 years, 5 months ago) by ab
Original Path: incubator/nutch/trunk/build.xml
File length: 14980 byte(s)
Previous revision: 161630
This patchset contains improvements to Fetcher, described in NUTCH-54,
specifically the following:
* protocol- and content-based redirection handling in Fetcher.
* parse-js: heuristic link extractor for JavaScript
* protocol-httpclient: HTTP and HTTPS protocol handler, based on
Jakarta Commons HttpClient library.
* alternative HTML parser based on TagSoup.
* improved status reporting for protocol and parse plugins. Status
information is persisted in segment data, so that other plugins can
use it.
* and other assorted fixes...
This work has been sponsored by EvaluMetrix LLC (http://www.evalumetrix.com).
Thank you!