Yahoo! search interface: Difference between revisions

End</lang>
'''[http://www.cogier.com/gambas/Yahoo!%20search%20interface.png Click here to see output (I have typed 'rosettacode' in the search box)]'''
 
=={{header|Go}}==
Yahoo! has evidently changed its search output format over the years and, if the current format is documented anywhere, I could not find it.
 
The regular expression used below was worked out by studying the raw HTML and worked as of 18 November 2019. It is likely to need revising if Yahoo! changes its markup again.
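As a rough illustration of what the pattern captures, the sketch below applies the same regular expression to a hand-written HTML fragment instead of a live response. The fragment, its <code>ac-algo</code> and <code>lh-16</code> class values, and the helper <code>extract</code> are assumptions made up for this example; they are shaped to fit the pattern and are not captured Yahoo! output.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as in the program for this entry.
// Capture groups: 1 = URL, 2 = title, 3 = summary text.
var rx = regexp.MustCompile(
	`<h3 class="title"><a class=.*?href="(.*?)".*?>(.*?)</a></h3>` +
		`.*?<div class="compText aAbs" ><p class=.*?>(.*?)</p></div>`)

// sample is a hypothetical, hand-written fragment that fits the pattern.
// It is kept on one line because '.' does not match a newline by default.
const sample = `<h3 class="title"><a class="ac-algo" href="https://example.org/" target="_blank">Example <b>Site</b></a></h3>` +
	`<div class="compText aAbs" ><p class="lh-16">A short <b>summary</b>.</p></div>`

// extract returns the three captured groups (URL, title, summary)
// for every match found in page.
func extract(page string) [][]string {
	var out [][]string
	for _, m := range rx.FindAllStringSubmatch(page, -1) {
		out = append(out, m[1:]) // drop m[0], the full match
	}
	return out
}

func main() {
	for _, r := range extract(sample) {
		fmt.Printf("url=%q\ntitle=%q\ncontent=%q\n", r[0], r[1], r[2])
	}
}
```

The captured title and summary still contain the <code>&lt;b&gt;</code> tags that wrap matched terms, which is why the program strips them before printing.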
<lang go>package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"net/url"
	"regexp"
	"strings"

	"golang.org/x/net/html"
)

var (
	// Capture groups: 1 = URL, 2 = title, 3 = summary text.
	expr = `<h3 class="title"><a class=.*?href="(.*?)".*?>(.*?)</a></h3>` +
		`.*?<div class="compText aAbs" ><p class=.*?>(.*?)</p></div>`
	rx = regexp.MustCompile(expr)
)

type YahooResult struct {
	title, url, content string
}

func (yr YahooResult) String() string {
	return fmt.Sprintf("Title : %s\nUrl : %s\nContent: %s\n", yr.title, yr.url, yr.content)
}

type YahooSearch struct {
	query string
	page  int
}

// clean removes the <b> tags wrapped around matched terms
// and decodes HTML entities.
func clean(s string) string {
	s = strings.ReplaceAll(s, "<b>", "")
	s = strings.ReplaceAll(s, "</b>", "")
	return html.UnescapeString(s)
}

// results fetches one page of search results and extracts them with rx.
func (ys YahooSearch) results() ([]YahooResult, error) {
	// Escape the query so that spaces and symbols survive in the URL.
	search := fmt.Sprintf("http://search.yahoo.com/search?p=%s&b=%d",
		url.QueryEscape(ys.query), ys.page*10+1)
	resp, err := http.Get(search)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	var results []YahooResult
	for _, f := range rx.FindAllStringSubmatch(string(body), -1) {
		results = append(results, YahooResult{
			title:   clean(f[2]),
			url:     f[1],
			content: clean(f[3]),
		})
	}
	return results, nil
}

func (ys YahooSearch) nextPage() YahooSearch {
	return YahooSearch{ys.query, ys.page + 1}
}

// printPage prints at most limit results from the page held in ys.
func printPage(ys YahooSearch, limit int) {
	fmt.Printf("PAGE %d =>\n\n", ys.page+1)
	results, err := ys.results()
	if err != nil {
		log.Fatal(err)
	}
	// Guard against pages with fewer than limit matches.
	if len(results) > limit {
		results = results[:limit]
	}
	for _, res := range results {
		fmt.Println(res)
	}
}

func main() {
	ys := YahooSearch{"rosettacode", 0}
	// Limit output to first 5 entries, say, from pages 1 and 2.
	printPage(ys, 5)
	printPage(ys.nextPage(), 5)
}</lang>
 
{{out}}
Note that some results are repeated across the two pages.
<pre>
PAGE 1 =>
 
Title : Rosetta Code
Url : https://rosettacode.org/wiki/Rosetta_Code
Content: Rosetta Code Rosetta Code is a programming chrestomathy site. Rosetta Code currently has 976 tasks, 231 draft tasks, and is aware of 756 languages, though we do not (and cannot) have solutions to every task in every language. 1 Places to start
 
Title : Rosetta Code - Wikipedia
Url : https://en.wikipedia.org/wiki/Rosetta_Code
Content: Rosetta Code is a wiki -based programming chrestomathy website with implementations of common algorithms and solutions to various programming problems in many different programming languages. 1 Website 1.1 Data and structure 1.2 Languages
 
Title : Rosetta Code (@rosettacode) | Twitter
Url : https://twitter.com/rosettacode
Content: The latest Tweets from Rosetta Code (@rosettacode). Twitter account for http://t.co/DuRZFWDfRn. The general idea here is for short announcements and the like. The ...
 
Title : Best of Rosettacode
Url : https://examples.p6c.dev/categories/best-of-rosettacode.html
Content: 99 Problems Rosettacode Cookbook Euler Games Interpreters Modules Other Grammars Perlmonks Rosalind Shootout ...
 
Title : Rosetta Code Blog
Url : https://blog.rosettacode.org/
Content: (If you point 'rosettacode.com' to RosettaCode.org's IP address, you should still be able to see it) Second, I don't care if you want to use the name 'rosettacode' or 'rosetta code' in similar pursuits. I love that people have been calling task pages that have cropped up on various forums around the web as "rosetta code problems." That speaks ...
 
PAGE 2 =>
 
Title : Rosetta Code | R-bloggers
Url : https://www.r-bloggers.com/rosetta-code/
Content: Rosetta Code is a programming chrestomathy site. The idea is to present solutions to the same task in as many different languages as possible, to demonstrate how languages are similar and different, and to aid a person with a grounding in one approach to a problem in learning another.
 
Title : Best of Rosettacode
Url : https://examples.p6c.dev/categories/best-of-rosettacode.html
Content: 99 Problems Rosettacode Cookbook Euler Games Interpreters Modules Other Grammars Perlmonks Rosalind Shootout ...
 
Title : Rosetta Code Blog
Url : https://blog.rosettacode.org/
Content: (If you point 'rosettacode.com' to RosettaCode.org's IP address, you should still be able to see it) Second, I don't care if you want to use the name 'rosettacode' or 'rosetta code' in similar pursuits. I love that people have been calling task pages that have cropped up on various forums around the web as "rosetta code problems." That speaks ...
 
Title : What exactly is the purpose of Rosetta Code? - Quora
Url : https://www.quora.com/What-exactly-is-the-purpose-of-Rosetta-Code
Content: The name is a play on the Rosetta Stone. The Rosetta Stone featured a decree by King Ptolomy written in three scripts - Egyption hieroglyphs, Demotic, and Ancient Greek.
 
Title : One R Tip A Day: Rosetta Code
Url : https://onertipaday.blogspot.com/2009/07/rosetta-code.html
Content: Today I'd like to suggest the interesting Rosetta Code site: Rosetta Code is a programming chrestomathy site. The idea is to present solutions to the same task in as many different languages as possible, to demonstrate how languages are similar and different, and to aid a person with a grounding in one approach to a problem in learning another.
</pre>
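
The repetition between pages seen above could be filtered out by remembering which URLs have already been printed. The following is only a sketch of that idea: the <code>result</code> type and the <code>dedupe</code> function are hypothetical helpers that are not part of the program above, and the sample entries are copied from the output shown.

```go
package main

import "fmt"

// result is a hypothetical stand-in for the title/url pair of a search result.
type result struct {
	title, url string
}

// dedupe merges the given pages in order, keeping only the first
// occurrence of each URL.
func dedupe(pages ...[]result) []result {
	seen := make(map[string]bool)
	var out []result
	for _, page := range pages {
		for _, r := range page {
			if !seen[r.url] {
				seen[r.url] = true
				out = append(out, r)
			}
		}
	}
	return out
}

func main() {
	page1 := []result{
		{"Best of Rosettacode", "https://examples.p6c.dev/categories/best-of-rosettacode.html"},
		{"Rosetta Code Blog", "https://blog.rosettacode.org/"},
	}
	page2 := []result{
		{"Rosetta Code | R-bloggers", "https://www.r-bloggers.com/rosetta-code/"},
		{"Best of Rosettacode", "https://examples.p6c.dev/categories/best-of-rosettacode.html"},
	}
	for _, r := range dedupe(page1, page2) {
		fmt.Println(r.title, "-", r.url)
	}
}
```

Keying on the URL rather than the title is a design choice for this sketch; two distinct pages may share a title, whereas a repeated URL is the same result.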
 
=={{header|GUISS}}==