File name: 4pm

  • Category: Search engine
  • Resource type: [Java] [Source code]
  • Upload date: 2012-11-26
  • File size: 2.85 MB
  • Downloads: 0
  • Provided by: 曹**

Description -- all downloadable content comes from the Internet; please study and use it at your own discretion

This article uses Lucene and Heritrix to build a Web search application.

Lucene is a Java-based full-text information retrieval library. It is an open-source Apache project that was formerly part of the Apache Jakarta family.

Lucene is powerful, but however powerful a search engine toolkit is, it needs something behind it to supply content: a Web crawler. A crawler is also called a spider, a Web robot, or a bot; the name matters little. What matters is that crawlers are what give a search engine a rich body of resources to index.

Heritrix is an open-source Web crawler written entirely in Java that users can use to fetch the resources they want from the Web. It comes from www.archive.org. Heritrix's greatest strength is its extensibility: developers can extend its individual components to implement their own crawl logic.
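
To make the idea of full-text retrieval concrete, here is a minimal sketch of an inverted index, the kind of data structure that libraries such as Lucene are built around. It uses only the Java standard library. It is not Lucene's API and is not taken from the 4pm archive; the class name TinyInvertedIndex and its methods are hypothetical, chosen purely for illustration, and the sample documents stand in for pages a crawler might have fetched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, for illustration): term -> sorted ids of documents containing it. */
class TinyInvertedIndex {
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    /** Adds a document and returns its id (assigned sequentially from 0). */
    public int add(String text) {
        int id = documents.size();
        documents.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    /** Returns the ids of documents that contain every term of the query (AND semantics). */
    public List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : tokenize(query)) {
            TreeSet<Integer> ids = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(ids);   // copy so the index itself is never modified
            } else {
                result.retainAll(ids);         // intersect with the next term's postings
            }
        }
        return result == null ? new ArrayList<>() : new ArrayList<>(result);
    }

    /** Lowercases the text and splits it on runs of characters that are neither letters nor digits. */
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("Lucene is a full-text retrieval library");
        index.add("Heritrix is a Web crawler written in Java");
        index.add("A crawler feeds pages to a full-text index");
        System.out.println(index.search("full-text"));    // prints [0, 2]
        System.out.println(index.search("crawler java")); // prints [1]
    }
}
```

A real search library adds much more on top of this (analysis chains, relevance scoring, on-disk index formats), so the sketch should be read only as a picture of the core lookup, not as a substitute for Lucene.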


(Generated automatically; you can review the contents before downloading)

Download file list

4pm\.classpath

...\.cvsignore

...\.project

...\.springBeans

...\.tomcatplugin

...\commons\403.jsp

...\.......\404.jsp

...\.......\error.jsp

...\.......\footer.jsp

...\.......\inprogress.jsp

...\.......\messages.jsp

...\.......\meta.jsp

...\.......\taglibs.jsp

...\commons

...\index.jsp

...\META-INF\MANIFEST.MF

...\META-INF

...\query.jsp

...\scripts\builder.js

...\.......\calendar.js

...\.......\controls.js

...\.......\dragdrop.js

...\.......\effects.js

...\.......\global.js

...\.......\login.js

...\.......\prototype.js

...\.......\scriptaculous.js

...\.......\selectbox.js

...\.......\slider.js

...\.......\validator.jsp

...\scripts

...\.tyles\msn\css\blue\FormDefault.css

...\......\...\...\....\ViewDefault.css

...\......\...\...\blue

...\......\...\...\cyan\ViewDefault.css

...\......\...\...\cyan

...\......\...\...\dialog.css

...\......\...\...\edit.css

...\......\...\...\FormDefault.css

...\......\...\...\formFlow.css

...\......\...\...\formForm.css

...\......\...\...\formMain.css

...\......\...\...\formMain1.css

...\......\...\...\formTop.css

...\......\...\...\FormViewDefault.css

...\......\...\...\green\ViewDefault.css

...\......\...\...\green

...\......\...\...\HtmlEdit.css

...\......\...\...\module.left.css

...\......\...\...\newwalterzorn.css

...\......\...\...\systree.css

...\......\...\...\ViewDefault.css

...\......\...\...\xtree.css

...\......\...\css

...\......\...\images\addoption.gif

...\......\...\......\addoption_a.gif

...\......\...\......\archivedo.gif

...\......\...\......\arrow_icon.gif

...\......\...\......\arrow_left.gif

...\......\...\......\arrow_right.gif

...\......\...\......\arrow_yellow.gif

...\......\...\......\assistover.gif

...\......\...\......\attach.gif

...\......\...\......\attachment_btn.gif

...\......\...\......\background\beij_01.jpg

...\......\...\......\..........\bk_01.jpg

...\......\...\......\..........\bk_02.jpg

...\......\...\......\..........\bk_03.jpg

...\......\...\......\..........\bk_04.jpg

...\......\...\......\..........\bk_05.jpg

...\......\...\......\..........\bk_06.jpg

...\......\...\......\..........\bk_07.jpg

...\......\...\......\..........\bk_08.jpg

...\......\...\......\..........\bk_09.jpg

...\......\...\......\..........\btn_bg.gif

...\......\...\......\..........\b_l.gif

...\......\...\......\..........\b_y.gif

...\......\...\......\..........\dongh.gif

...\......\...\......\..........\d_l.gif

...\......\...\......\..........\d_y.gif

...\......\...\......\..........\d_z.gif

...\......\...\......\..........\iframetitle.gif

...\......\...\......\..........\java.jpg

...\......\...\......\..........\shadow_bottom.gif

...\......\...\......\..........\shadow_left.gif

...\......\...\......\..........\shadow_left_bottom.gif

...\......\...\......\..........\shadow_right.gif

...\......\...\......\..........\shadow_right_bottom.gif

...\......\...\......\..........\shadow_top.gif

...\......\...\......\..........\shadow_top_left.gif

...\......\...\......\..........\shadow_top_right.gif

...\......\...\......\..........\tab-active.gif

...\......\...\......\..........\tab-beginer.gif

...\......\...\......\..........\tab-breaker.gif

...\......\...\......\..........\tab-ender.gif

...\......\...\......\..........\tab-expand.gif

...\......\...\......\..........\tab-normal.gif

...\......\...\......\..........\Thumbs.db

...\......\...\......\..........\titlebar.gif

...\......\...\......\..........\t_l.gif



源码中国 www.ymcn.org