Why Isn't For Loop Web Scraping for Href? I Appreciate Anyone Who Can Figure This Out

On the page at the "sitemap" URL below, I am trying to scrape the "Visit Website" hyperlink for each office.

However, I believe I am making an error with my "listings_div" variable as it does not seem to capture all the offices when I do the for loop. Thank you for your help!!

import requests
from bs4 import BeautifulSoup

sitemap = 'https://www.bhhs.com/office-results-list?office_country=US'
sitemap_content = requests.get(sitemap).content
soup = BeautifulSoup(sitemap_content, 'html.parser')

listings_div = soup.find('section', attrs={'class':'cmp-office-search-results'})

for state in listings_div.find_all('div', attrs={'class':'cmp-office-results-list-view__content'}):
    print(state.find('section', attrs={'class':'cmp-cta'}).get('href'))

Devinestebanmarjorie Asked on July 16, 2020 in Python.

1 Answer(s)

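Your job is actually easier than it looks. The website uses JavaScript to load this information, so the office listings are not in the HTML that requests.get() returns; the page fetches them from a separate endpoint that returns JSON, and you can call that endpoint directly.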

The code below scrapes all 141 pages.

import json
import requests

results = []

# Pages are numbered 1 through 141
for i in range(1, 142):
    res = requests.get("https://www.bhhs.com/bin/bhhs/officeSearchServlet?PageSize=10&Sort=1&Page={}&office_country=US".format(i))
    results.append(res.json())

with open("result.json", "w") as f:
    json.dump(results, f)

Sending all the requests in a single run can cause some of them to fail. I therefore recommend crawling the pages in batches and saving the data after each batch: for example, fetch pages 1-10 and save them, then pages 11-20 and save them, and so on. Afterwards you can consolidate all the scraped results into one file.
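
The batching idea could be sketched roughly as follows. This is only an illustration, not the answerer's code: the function names crawl_in_batches and consolidate, the "batches" directory, and the file naming scheme are all hypothetical, and fetch_page stands for whatever function downloads one page and returns its parsed JSON.

```python
import json
from pathlib import Path


def crawl_in_batches(fetch_page, first_page, last_page, batch_size=10, out_dir="batches"):
    """Fetch pages first_page..last_page, writing one JSON file per batch.

    fetch_page is any callable that takes a page number and returns
    JSON-serializable data (hypothetical helper supplied by the caller).
    A batch whose file already exists is skipped, so a run that failed
    partway can be restarted without refetching the finished batches.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for start in range(first_page, last_page + 1, batch_size):
        end = min(start + batch_size - 1, last_page)
        # Zero-padded page numbers keep the files in page order when sorted.
        batch_file = out / f"pages_{start:04d}_{end:04d}.json"
        if batch_file.exists():
            continue  # this batch was saved by an earlier run
        batch = [fetch_page(page) for page in range(start, end + 1)]
        batch_file.write_text(json.dumps(batch))


def consolidate(out_dir="batches", result_file="result.json"):
    """Merge every saved batch file, in page order, into one JSON list."""
    results = []
    for batch_file in sorted(Path(out_dir).glob("pages_*.json")):
        results.extend(json.loads(batch_file.read_text()))
    Path(result_file).write_text(json.dumps(results))
    return results
```

With the endpoint from the answer, fetch_page could be something like `lambda page: requests.get(url.format(page)).json()`, where url is the officeSearchServlet address with `Page={}`; you would then call `crawl_in_batches(fetch_page, 1, 141)` followed by `consolidate()`. If a request raises an exception, the current batch is not written, so rerunning the same call picks up from that batch.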

Fredclarissaliz Answered on July 16, 2020.
