Scraping web pages with Ganon: an example
Published: 2019-06-09


Project page: 

Documentation: 

Ganon is a very capable library: it locates DOM elements with tag selectors much like the ones used in JavaScript.

The Ganon library gives access to HTML/XML documents in a very simple object oriented way. It eases modifying the DOM and makes finding elements easy with CSS3-like queries.

Ganon usage examples:

// Parse the google code website into a DOM
$html = file_get_dom('http://code.google.com/');

Access

Accessing elements is made easy through the CSS3-like selectors and the object model.

// Find all the paragraph tags with a class attribute and print the
// value of the class attribute
foreach ($html('p[class]') as $element) {
    echo $element->class, "<br>\n";
}

// Find the first div with ID "gc-header" and print the plain text of
// the parent element (plain text means no HTML tags, just the text)
echo $html('div#gc-header', 0)->parent->getPlainText();

// Find out how many tags there are which are "ns:tag" or "div", but not
// "a" and do not have a class attribute
echo count($html('(ns|tag, div + !a)[!class]'));

Modification

Elements can be easily modified after you've found them.

// Find all paragraph tags which are nested inside a div tag, change
// their ID attribute and print the new HTML code
foreach ($html('div p') as $index => $element) {
    $element->id = "id$index";
}
echo $html;

// Center all the links inside a document which start with "http://"
// and print out the new HTML
foreach ($html('a[href ^= "http://"]') as $element) {
    $element->wrap('center');
}
echo $html;

// Find all odd indexed "td" elements and change the HTML to make them links
foreach ($html('table td:odd') as $element) {
    $element->setInnerText('<a href="#">'.$element->getPlainText().'</a>');
}
echo $html;

 

Beautify

Ganon can also help you beautify your code and format it properly.

// Beautify the old HTML code and print out the new, formatted code
dom_format($html, array('attributes_case' => CASE_LOWER));
echo $html;
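
The snippets above all depend on fetching a live page. As a rough way to try the same selectors offline, the sketch below parses an inline HTML string instead. It rests on two assumptions not stated in the original post: that `ganon.php` sits next to the script, and that Ganon provides `str_get_dom()` as the string counterpart of `file_get_dom()`. The sample markup is made up for illustration.

```php
<?php
// Assumption: ganon.php is in the same directory; adjust the path to your setup.
require_once 'ganon.php';

// Assumption: str_get_dom() parses an HTML string the way file_get_dom()
// parses a file or URL. The markup below is a made-up sample.
$html = str_get_dom('<div id="main"><p class="intro">Hello</p><p>World</p></div>');

// Print the class attribute of every paragraph that has one.
foreach ($html('p[class]') as $element) {
    echo $element->class, "\n";
}

// Give each paragraph nested inside a div a sequential ID.
foreach ($html('div p') as $index => $element) {
    $element->id = "id$index";
}

// Print the modified HTML.
echo $html;
```

Working from a fixed string makes it easier to check what each selector matches before pointing the script at a real site.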

 

Reposted from: https://www.cnblogs.com/swocn/p/6731313.html
