Search resource list
chentian.fenci
- Dictionary-based Chinese word segmentation for Nutch; this package contains the DLL files.
hadoop-0.7.1.tar
- Hadoop: the cluster platform for Nutch. Its distributed programming model lets Nutch run automatically, in parallel, across a cluster of ordinary machines.
BrownRecluse158
- Source code for a search engine, distinct from larbin and Nutch.
nutchkk
- Improved tools for Nutch search, plus related tools for optimizing the crawler.
nutch_recrawl_mergecrawl
- Nutch is an open source search engine. recrawl is a script that updates the index; mergecrawl is a bash script that merges multiple websites for querying.
nutchtutorial
- Nutch tutorial: development documentation for the Nutch search engine.
nutchxp
- Improvements to the crawler data, with a number of bug fixes.
nutch0.8
- Nutch 0.8 source code, an open source search engine. Hopefully there is a lot to learn from it.
lucenenutch
- Companion code for a book on Lucene and Nutch; this part covers chapter 2.
vicaya-0.1.6.0
- A search engine built on Nutch, used to search for CD-ROMs online.
nutch-4.7.x-1.x-dev.tar
- search engines sql server
OReilly.Hadoop.The.Definitive.Guide.June.2009.RETA
- Hadoop got its start in Nutch. A few of us were attempting to build an open source web search engine and having trouble managing computations running on even a handful of computers.
lukemin.tar
- lukemin: a tool for viewing all kinds of information about the web pages fetched by the Nutch crawler, presented clearly and comprehensively.
search
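- A Lucene sample application with complete code, from building the index through to web search. It uses the dedecms database, which you can download yourself. config.xml is the configuration file; set the index directory and the username and password for the database connection there. The code can serve directly as a reference for building full-text search with Lucene.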
lucenePnutchPmapreducePsearch-engine
- Three master's theses on open source search engines: (1) Implementation of a Web Search Engine Based on Lucene; (2) Research on a Distributed Intelligent Search Engine Framework Based on MapReduce; (3) Analysis and Implementation of a Vertical Search Engine Based on Nutch.
NutchAnalysis
- Fixes the problem of Nutch being unable to parse Korean. The file is a .jj file that must be processed with JavaCC. As anyone who has used Nutch will know: generate the 5 files and swap them in, recrawl, run ant, package the new nutch-1.0.jar, and replace the copy under Tomcat.
08214942iobg
- Lucene + Nutch search engine (Lucene development documentation and examples of various features).
Nutch-Teach
- A tutorial on the Nutch search engine architecture. Anyone who needs to build a crawler can learn from its design ideas.
apache-nutch-1.13-src
- Network programming: a very good open source web crawler codebase to learn from.