Unable to mimic POST request with headers – VBA
I have spent about 15 days building a scraper in VBA. I've been making decent progress day after day, but for the last two days I have been stuck on the very last step needed to get the data.
This is a continuation of my previous post, which gave me a good starting point.
Here's the process I want to simulate using MSXML (not Internet Explorer):
- Open https://beacon.schneidercorp.com/
- Select "Iowa State"
- Select "Boone County, IA"
- Click on the popup link "Property Search"
- In the top red ribbon, click on the "Comp Search" label
- At the bottom of the resulting page, in the "Agricultural Comparables Search" section check the "Sale Date" checkbox
- Select 5 months in the "Sale Date" combobox
- Click on the "Search" button at the bottom of the "Agricultural Comparables Search" section
- In the resulting page, look for "Parcel ID" identified as "088327354400002" and click on the link on the "Recording" column (value "2020-0418")
I could achieve the first eight steps, but I haven't been able to get the URL of the results that should come from clicking that last link held in "2020-0418".
As I did to get from step 8 to step 9, I looked at the Developer Tools' "Network" tab and noticed that the website sends a POST request, shown below.
**General**

```
Request URL: https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=579&Q=1926372975
Request Method: POST
Status Code: 302
Remote Address: 52.168.93.150:443
Referrer Policy: no-referrer-when-downgrade
```

**Response Headers**

```
alt-svc: quic=":443"; ma=2592000; v="44,43,39"
cache-control: private
content-encoding: gzip
content-length: 187
content-type: text/html; charset=utf-8
date: Sat, 27 Jun 2020 00:46:42 GMT
location: /Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=551&Q=1603287013
status: 302
vary: Accept-Encoding
```

**Request Headers**

```
:authority: beacon.schneidercorp.com
:method: POST
:path: /Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=579&Q=1926372975
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: es-ES,es;q=0.9
cache-control: max-age=0
content-length: 395
content-type: application/x-www-form-urlencoded
cookie: _ga=GA1.2.1299682399.1590279064; MODULES508=; MODULESVISIBILE508=18469; MODULES1024=; MODULESVISIBILE1024=29489%7C29501; MODULES501=; MODULESVISIBILE501=10310; _gid=GA1.2.449363625.1593013300; ASP.NET_SessionId=4xwgdh2cqto0kugirkani4vp; _gat=1
origin: https://beacon.schneidercorp.com
referer: https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=579&Q=1926372975
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: same-origin
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36
```

**Query String Parameters (source view)**

```
AppID=84&LayerID=795&PageTypeID=3&PageID=579&Q=1926372975
```

**Form Data (source view)**

```
__EVENTTARGET=ctlBodyPane%24ctl02%24ctl01%24gvwAgCompResults%24ctl05%24lnkRecording&__EVENTARGUMENT=&__VIEWSTATE=cbg8zdrx99ofbjcpw9%2FCE8J0v2SY5W86N%2Fbx%2FU0CsnNPy9D3bcg%2F5YstkCGTwd03lObnZbF9%2B5QuO1lP658HYgyXsOmpImGVjhn47teNdO788MngiEN9qzZbzrOv8jZAd93B8QXltxoPV5dLVu0%2BELpETwwTteNsmbKNEr1IpBz2aSxsN1spJUTKy42SUE37HkdUqVpsQlCPHPyIomJH4b6CoepL2uG9y45pMbUYFZxPG5ob&__VIEWSTATEGENERATOR=569DB96F
```
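For completeness, this is how I understand that captured request maps onto ServerXMLHTTP60 calls. Everything in the snippet (the `Q` parameter, the `__EVENTTARGET`, the cookie) is simply copied from the capture above, so treat those values as stale placeholders; in a live run they would have to come from the current session and the current page:

```vba
'Rough replay of the captured POST. Everything here is copied from the capture above,
'so the Q parameter, __EVENTTARGET and cookie are stale placeholders; a live run needs
'the current session's cookie and the __VIEWSTATE scraped from the current page.
Dim req As New MSXML2.ServerXMLHTTP60
Dim strVState As String, strBody As String

strVState = "..."   'URL-encoded __VIEWSTATE taken from the page that shows the results grid

strBody = "__EVENTTARGET=ctlBodyPane%24ctl02%24ctl01%24gvwAgCompResults%24ctl05%24lnkRecording" & _
          "&__EVENTARGUMENT=" & _
          "&__VIEWSTATE=" & strVState & _
          "&__VIEWSTATEGENERATOR=569DB96F"

req.Open "POST", "https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=579&Q=1926372975", False
req.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
req.setRequestHeader "Referer", "https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=579&Q=1926372975"
req.setRequestHeader "Cookie", "ASP.NET_SessionId=..."  'the browser sends this cookie; my code never does
req.send strBody

'The capture shows a 302 whose Location header points at the results page (PageID=551);
'as far as I can tell, ServerXMLHTTP follows that redirect on its own.
Debug.Print req.Status
Debug.Print Left$(req.responseText, 200)
```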
Next, here is a sample of my code:
```vba
Sub ScrapingTest()
    Dim XMLpagina As New MSXML2.ServerXMLHTTP60
    Dim htmlDoc As New MSHTML.htmlDocument
    Dim strURL As String, strBodyRequest As String
    Dim strETarget As String, strVState As String
    Dim strT1 As String, strT2 As String, strT3 As String
    Dim strPageID As String, strPageTypeID As String

    '====================
    'FOR VIEWING PURPOSES I ONLY SHOW A SHORT VERSION OF MY ORIGINAL CODE,
    'TRYING TO CUT AS MUCH AS POSSIBLE...
    '====================

    'OPENING Comp Search Website - STEP 6
    strURL = "https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=2&PageID=578"

    'SEND GET REQUEST
    XMLpagina.Open "GET", strURL, False
    XMLpagina.send
    htmlDoc.body.innerHTML = XMLpagina.responseText
    Call generarCopiaHtml(XMLpagina)

    'GETTING THE VALUES TO BE SENT IN THE REQUEST BODY OF THE NEXT REQUEST
    'GET THE EVENTTARGET
    strETarget = "ctlBodyPane$ctl02$ctl01$btnSearch" 'I DON'T SCRAPE FOR THIS BECAUSE IT'S ALWAYS THE SAME

    'GET THE VIEWSTATE VALUE
    strT1 = "<input type='hidden' name='__VIEWSTATE' id='__VIEWSTATE' value='"
    strT1 = Replace(strT1, "'", """")
    strT2 = "' />"
    strT2 = Replace(strT2, "'", """")
    strVState = extraeVerg(XMLpagina.responseText, strT1, strT2) 'THIS CUSTOM FUNCTION EXTRACTS THE TEXT LYING BETWEEN strT1 AND strT2

    'SET THE REQUEST BODY
    strBodyRequest = "__EVENTTARGET=ctlBodyPane%24ctl02%24ctl01%24btnSearch"
    strBodyRequest = strBodyRequest & "&__EVENTARGUMENT="
    strBodyRequest = strBodyRequest & "&__VIEWSTATE=" & strVState
    strBodyRequest = strBodyRequest & "&__VIEWSTATEGENERATOR=569DB96F"
    strBodyRequest = strBodyRequest & "&ctlBodyPane%24ctl02%24ctl01%24chkUseSaleDate=on"
    strBodyRequest = strBodyRequest & "&ctlBodyPane%24ctl02%24ctl01%24cboSaleDate=5" 'DEFINES HOW MANY MONTHS THE SEARCH WILL GO THROUGH
    strBodyRequest = strBodyRequest & "&ctlBodyPane%24ctl02%24ctl01%24txtSaleDateHigh_VCS3Ag=" & Month(Now) & "%2F" & Day(Now) & "%2F" & Year(Now)
    strBodyRequest = strBodyRequest & "&ctlBodyPane%24ctl02%24ctl01%24txtCSRPointsHigh="

    'OPENING Comp Search Website (SHOWING RESULTS) - STEP 9
    'SEND THE REQUEST
    XMLpagina.Open "POST", strURL, False
    XMLpagina.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    XMLpagina.setRequestHeader "Content-Length", Len(strBodyRequest)
    XMLpagina.send strBodyRequest

    'GENERATE A LOCAL COPY OF THE RESPONSE
    Call generarCopiaHtml(XMLpagina)

    'BUILDING THE URL FOR THE NEXT REQUEST
    strT1 = "{'Name':'Comp Results','PageId':"
    strT1 = Replace(strT1, "'", """")
    strT2 = ",'PageTypeId':"
    strT2 = Replace(strT2, "'", """")
    strT3 = ",'Icon"
    strT3 = Replace(strT3, "'", """")
    strPageID = extraeVerg(XMLpagina.responseText, strT1, strT2)
    strPageTypeID = extraeVerg(XMLpagina.responseText, strT1 & strPageID & strT2, strT3)

    'THE strURL MUST BE EXACTLY LIKE "https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=579"
    strURL = "https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=" & strPageTypeID & "&PageID=" & strPageID

    'GETTING THE VALUES TO BE SENT IN THE REQUEST BODY
    strT1 = "<input type='hidden' name='__VIEWSTATE' id='__VIEWSTATE' value='"
    strT1 = Replace(strT1, "'", """")
    strT2 = "' />"
    strT2 = Replace(strT2, "'", """")
    strVState = extraeVerg(XMLpagina.responseText, strT1, strT2)

    'SET THE REQUEST BODY
    strETarget = "ctlBodyPane$ctl02$ctl01$gvwAgCompResults$ctl45$lnkRecording" 'THIS VALUE MIMICS THE CLICK ON THE RECORD "2020-0418" RELATED TO THE PARCEL ID "088327354400002"
    strBodyRequest = "__EVENTTARGET=" & Application.WorksheetFunction.EncodeURL(strETarget)
    strBodyRequest = strBodyRequest & "&__EVENTARGUMENT="
    strBodyRequest = strBodyRequest & "&__VIEWSTATE=" & strVState
    strBodyRequest = strBodyRequest & "&__VIEWSTATEGENERATOR=569DB96F"

    'SEND THE REQUEST
    XMLpagina.Open "POST", strURL, False
    XMLpagina.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    XMLpagina.setRequestHeader "Content-Length", Len(strBodyRequest)
    XMLpagina.send strBodyRequest

    'GENERATE A LOCAL COPY OF THE RESPONSE
    Call generarCopiaHtml(XMLpagina)

    'AT THIS POINT I SHOULD BE LANDING ON THE "Results" PAGE, WITH A URL LIKE
    ' "https://beacon.schneidercorp.com/Application.aspx?AppID=84&LayerID=795&PageTypeID=3&PageID=551"
    'WHICH GIVES A LIST OF THE PARCELS INVOLVED IN THE SALE, BUT IT STILL SHOWS THE PREVIOUS PAGE'S RESULTS...
    'I CAN'T SEE WHAT I'M DOING WRONG...
End Sub
```
My real goal is to repeat this process to get data on specific sales from every county in Iowa, but once the first one works, the rest shouldn't be a problem.
Can someone show me what I'm doing wrong?
PS1: I apologize for an earlier question about the same problem, posted about ten days ago, which was wrong from top to bottom; I was so tired at the time that I wrote some crazy stuff.
PS2: There seems to be a lot of information out there about this, but either I'm not experienced enough yet to work out the solution, or my case isn't very common.