
goquery v1.0.0, an HTML Parsing Library for Go, Officially Released


goquery is an HTML parsing library written in Go that lets you work with DOM documents in a jQuery-like way.

Here is an example:

package main

import (
	"fmt"
	"log"

	"github.com/PuerkitoBio/goquery"
)

func ExampleScrape() {
	doc, err := goquery.NewDocument("http://metalsucks.net")
	if err != nil {
		log.Fatal(err)
	}

	// Find the review items
	doc.Find(".sidebar-reviews article .content-block").Each(func(i int, s *goquery.Selection) {
		// For each item found, get the band and title
		band := s.Find("a").Text()
		title := s.Find("i").Text()
		fmt.Printf("Review %d: %s - %s\n", i, band, title)
	})
}
func main() {
	ExampleScrape()
}

Changelog:

  • 2016-07-27 (v1.0.0) : Tag version 1.0.0.
  • 2016-06-15 : Invalid selector strings internally compile to a Matcher implementation that never matches any node (instead of a panic). So for example, doc.Find("~") returns an empty *Selection object.
  • 2016-02-02 : Add NodeName utility function similar to the DOM’s nodeName property. It returns the tag name of the first element in a selection, and other relevant values of non-element nodes (see godoc for details). Add OuterHtml utility function similar to the DOM’s outerHTML property (named OuterHtml in small caps for consistency with the existing Html method on the Selection).
  • 2015-04-20 : Add AttrOr helper method to return the attribute’s value or a default value if absent. Thanks to piotrkowalczuk.
  • 2015-02-04 : Add more manipulation functions – Prepend* – thanks again to Andrew Stone.
  • 2014-11-28 : Add more manipulation functions – ReplaceWith, Wrap and Unwrap – thanks again to Andrew Stone.
  • 2014-11-07 : Add manipulation functions (thanks to Andrew Stone) and *Matcher functions, that receive compiled cascadia selectors instead of selector strings, thus avoiding potential panics thrown by goquery via cascadia.MustCompile calls. This results in better performance (selectors can be compiled once and reused) and more idiomatic error handling (you can handle cascadia’s compilation errors, instead of recovering from panics, which had been bugging me for a long time). Note that the actual type expected is a Matcher interface, that cascadia.Selector implements. Other matcher implementations could be used.
  • 2014-11-06 : Change import paths of net/html to golang.org/x/net/html (see https://groups.google.com/forum/#!topic/golang-nuts/eD8dh3T9yyA). Make sure to update your code to use the new import path too when you call goquery with html.Nodes.
  • v0.3.2 : Add NewDocumentFromReader() (thanks jweir) which allows creating a goquery document from an io.Reader.
  • v0.3.1 : Add NewDocumentFromResponse() (thanks assassingj) which allows creating a goquery document from an http response.
  • v0.3.0 : Add EachWithBreak(), which makes it possible to break out of an Each() loop by returning false. This function was added instead of changing the existing Each() to avoid breaking compatibility.
  • v0.2.1 : Make go-getable, now that go.net/html is Go1.0-compatible (thanks to @matrixik for pointing this out).
  • v0.2.0 : Add support for negative indices in Slice(). BREAKING CHANGE Document.Root is removed, Document is now a Selection itself (a selection of one, the root element, just like Document.Root was before). Add jQuery’s Closest() method.
  • v0.1.1 : Add benchmarks to use as baseline for refactorings, refactor Next…() and Prev…() methods to use the new html package’s linked list features (Next/PrevSibling, FirstChild). Good performance boost (40+% in some cases).
  • v0.1.0 : Initial release.

When reposting, please credit: 快乐编程 » goquery v1.0.0, an HTML Parsing Library for Go, Officially Released
