File name: PACHONG
- Category: C# programming
- Resource attributes: [Windows] [Visual.Net] [Source code]
- Upload date: 2012-11-26
- File size: 780 KB
- Downloads: 0
- Provider: 谭*
- Related links: none
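Description (all downloads come from the internet; study and use them at your own discretion)
Web crawler source code
This is a web crawler written in C#.
Main features:
Configurable: number of threads, thread wait time, connection timeout, crawlable file types and their priorities, download directory, and so on.
Status bar statistics: number of queued URLs, number of downloaded files, total bytes downloaded, CPU usage, available memory, and so on.
Preference-based crawling: different priorities can be assigned to the resource types being crawled.
Robustness: more than ten URL normalization rules to eliminate redundant downloads, strategies for avoiding crawler traps, and several strategies for resolving relative paths.
Reasonable performance: regular-expression-based page parsing, moderate locking, persistent HTTP connections, and so on.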
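The features above (regular-expression link extraction, relative path resolution, URL normalization, and per-type priorities) can be illustrated with a minimal sketch. It is written in Python for brevity rather than the project's C#, and it is not taken from the NWebCrawler source: the names `HREF_RE`, `TYPE_PRIORITY`, `normalize`, `extract_links`, and `Frontier` are hypothetical, the priority values are invented, and only three normalization rules are shown instead of the full set the description mentions.

```python
import heapq
import itertools
import posixpath
import re
from urllib.parse import urljoin, urlsplit, urlunsplit

# Hypothetical pattern: only href="..." / href='...' attributes; fragments are cut off.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\'#]+)', re.IGNORECASE)
DEFAULT_PORTS = {"http": 80, "https": 443}
# Hypothetical priorities (lower is crawled sooner); real values would come from config.
TYPE_PRIORITY = {"": 0, ".html": 0, ".htm": 0, ".js": 1, ".gif": 2}
LOWEST_PRIORITY = 3


def normalize(url):
    """Lowercase scheme and host, drop default ports and fragments, default the path to '/'."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    return urlunsplit((scheme, host, parts.path or "/", parts.query, ""))


def extract_links(base_url, html):
    """Resolve every href against base_url; return unique HTTP(S) URLs in first-seen order."""
    seen, links = set(), []
    for href in HREF_RE.findall(html):
        absolute = normalize(urljoin(base_url, href.strip()))
        if absolute.startswith(("http://", "https://")) and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


class Frontier:
    """URL queue that skips already-seen URLs and pops by resource-type priority."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._order = itertools.count()  # tie-breaker: FIFO within one priority

    def push(self, url):
        url = normalize(url)
        if url in self._seen:
            return False  # redundant download avoided
        self._seen.add(url)
        ext = posixpath.splitext(urlsplit(url).path)[1].lower()
        priority = TYPE_PRIORITY.get(ext, LOWEST_PRIORITY)
        heapq.heappush(self._heap, (priority, next(self._order), url))
        return True

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

In this sketch a crawler thread would call `extract_links` on each downloaded page, `push` every result, and `pop` the next URL to fetch; duplicates that differ only in letter case, default port, or fragment collapse to one entry.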
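Features that may be added when time permits (new feature: description):
Berkeley DB storage for crawled files: improves performance, because common operating systems handle large numbers of small files poorly.
Priority queue based on URL ranking: a topical (focused) crawler in which a machine learning algorithm scores each link's relevance to the topic, and links are crawled in the resulting priority order.
Crawler etiquette: obey the robots exclusion protocol and avoid overusing server resources.
Performance optimization:
- Replace the high-level HttpWebRequest/Response wrappers with UDP
- DNS caching
- Asynchronous DNS resolution
- A disk cache or in-memory database to avoid frequent disk seeks
- Distributed crawling to scale beyond a single machine (CPU, memory, and disk access)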
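Two of the planned items, crawler etiquette and DNS caching, are small enough to sketch. The following is an illustration in Python, not the project's C# and not part of its source: `USER_AGENT`, `load_robots`, and `DnsCache` are hypothetical names, the cache never expires entries, and a real crawler would fetch `robots.txt` from each host instead of receiving its text as an argument.

```python
import socket
from urllib.robotparser import RobotFileParser

USER_AGENT = "ExampleCrawler"  # hypothetical crawler name


def load_robots(robots_txt):
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class DnsCache:
    """Remember host-to-address lookups so each host is resolved only once."""

    def __init__(self, resolver=socket.gethostbyname):
        self._resolver = resolver  # injectable, so the cache can be tested offline
        self._cache = {}

    def resolve(self, host):
        host = host.lower()  # host names are case-insensitive
        if host not in self._cache:
            self._cache[host] = self._resolver(host)
        return self._cache[host]
```

Before queuing a URL, a crawler built this way would call `can_fetch(USER_AGENT, url)` on the parser for that host and skip the URL when it returns `False`; `crawl_delay(USER_AGENT)` gives the delay requested by the site, if any, which can bound how often the same server is contacted.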
(Generated automatically by the system; review the contents before downloading.)
File list of the download
NWebCrawler\config.ini
...........\MainForm.cs
...........\MainForm.Designer.cs
...........\MainForm.resx
...........\NWebCrawler.csproj
...........\obj\Debug\NWebCrawler.csproj.FileListAbsolute.txt
...........\...\.....\NWebCrawler.csproj.GenerateResource.Cache
...........\...\.....\NWebCrawler.exe
...........\...\.....\NWebCrawler.MainForm.resources
...........\...\.....\NWebCrawler.pdb
...........\...\.....\NWebCrawler.Properties.Resources.resources
...........\...\.....\NWebCrawler.SettingsForm.resources
...........\...\.....\ResolveAssemblyReference.cache
...........\Program.cs
...........\...perties\AssemblyInfo.cs
...........\..........\Resources.Designer.cs
...........\..........\Resources.resx
...........\..........\Settings.Designer.cs
...........\..........\Settings.settings
...........\SettingsForm.cs
...........\SettingsForm.Designer.cs
...........\SettingsForm.resx
...........Lib\Common\Logger.cs
..............\......\PriorityQueue.cs
..............\CrawleHistroyEntry.cs
..............\CrawlerThread.cs
..............\Downloader.cs
..............\NWebCrawlerLib.csproj
..............\obj\Debug\NWebCrawlerLib.csproj.FileListAbsolute.txt
..............\...\.....\NWebCrawlerLib.exe
..............\...\.....\NWebCrawlerLib.pdb
..............\Parser.cs
..............\Program.cs
..............\...perties\AssemblyInfo.cs
..............\Settings.cs
..............\UrlFrontierQueueManager.cs
..............\Utility.cs
NWebCrawler.sln
NWebCrawler.suo
51aspx源码必读.txt
bin\config.ini
...\download\0003be8238c8302e17c799d9f5d65876.gif
...\........\0718ad68487fa12de0cc75b20f7be03c.html; charset=utf-8
...\........\082e9d970f371da4f6e74dbe2c97f6e2.html; charset=utf-8
...\........\132949602460dfebc35da092329cba0c.gif
...\........\1695505243ceaa9c68e5a00061d1763f.javascript
...\........\1df7133090a0d07c5cec8fccbf6fd8dd.html; charset=utf-8
...\........\203557adfb69f0b4da4e237df2c0899a.html; charset=gb2312
...\........\23e5f50b0b42662c6694e574e74835cd.html; charset=utf-8
...\........\24eebf7019dc355f064372d6a889c60a.html; charset=gb2312
...\........\27439efce81b9ca84182d54aa411418e.html; charset=gb2312
...\........\2a2f02ca86459cde185fc8e8e9045bed.html; charset=utf-8
...\........\349427e49e96cbca35651e55ef94353d.gif
...\........\3891570720e771c847e5ac23e28aa6cc.html
...\........\3ff2932f670fc24203b1290df195dabf.gif
...\........\417d9e708c95da24b75705338598087f.html
...\........\44b19dec343bee7540d2e563399518f6.html; charset=gb2312
...\........\46e1c646c9965ce2581be0e2baa182cf.html; charset=utf-8
...\........\48bfe5c4818bc6d7d0a86b7c5d5a963a.javascript
...\........\4cef95f512517e118d0427cdf40d8d91.javascript
...\........\54cd270476c08dc49137cc587d5420e7.html; charset=utf-8
...\........\5ae7c8b442091b3c740b5f89f2202977.gif
...\........\5f194c03340af2c82af0806b4cd95f44.html; charset=gb2312
...\........\6a78a05748d064e4491b674a391174c7.javascript
...\........\6ba086f85f3602a364dae60f740138c5.html; charset=gb2312
...\........\73e9259e079ac68519bd2cf67af06c13.html; charset=utf-8
...\........\753a67d9417f20f83e1dce17d6146f85.gif
...\........\767223508f1bd57304d84720065f9ee8.x-javascript
...\........\7780c2d0134fad8b7a05a95d0f7b3378.html; charset=gb2312
...\........\7a6721fd05029de13a9df0e2a0948f25.html; charset=UTF-8
...\........\7eedab1d5fa988b034a32f14e08a97c0.gif
...\........\84675a6817fc8715e33bc1c631154b5d.html
...\........\857c3c382495ba1593a316498236e4f8.html; charset=gb2312
...\........\8769fd41800599144d3fffb49173cf71.x-icon
...\........\89253cefeda362f9b403341ccec22420.gif
...\........\8d52d7ccdc272a6bcaf36ae22d856dfc.html; charset=utf-8
...\........\9339d79eed585c1e0b126588c50477a8.javascript
...\........\93c0e58661019bd4a98aa3790a400cdf.x-javascript
...\........\94f1e7adbd48cf364b19771319db6b3f.gif
...\........\956119ce46fe84d5c1e240ef7d417bdb.html; charset=gb2312
...\........\9d71e4ab781e1b9bf3eccf2a47568d6e.html; charset=utf-8
...\........\a2418875c3955a694b18cf795764164a.html; charset=gb2312
...\........\a490c2a29b5986e5cd4e114a0b50d394.html; charset=gb2312
...\........\a6275663cfbb6142241df064c6f249f9.html; charset=gb2312