Curl works, JSOUP returns HTTP Error 500
I'm trying to web scrape in Java, and I plan to move this code to Android; at the moment I'm trying Jsoup. Using Chrome's DevTools, I pulled the request headers and a curl command that returns the data from the webpage. I can run the following command in curl and it works:
curl 'mysite/campaign/list' \
  -H 'cookie: __requestverificationtoken_l0n5yxjhv2viug9ydgfs0=iecny-sonb09iy9mqmm3xl1bsbase8eha9j1fwupurhtmlldojgqpaljhziuhffh6zrnoygjsrkyuhj2krwissnxif76grnh_39lgvymj0i1; asp.net_sessionid=gojtobwzycl0lvs0ip4glf3n; mycompany.web.portal.auth=40c13baf08884380f805b99e217754f3d35920ce1861debb580dc143da4249c4682c33a36dd29272a3a844880110e4d0ec1f24298e4d1b2a4a94e3fa2cac08b934989acf155616d6cb5665338ff3cff82ead87bf93eb46fa3ba6aae6b00401f9' \
  -H 'origin: mysite' \
  -H 'accept-encoding: gzip, deflate' \
  -H 'accept-language: en-us,en;q=0.8' \
  -H 'user-agent: mozilla/5.0 (windows nt 6.1; wow64) applewebkit/537.36 (khtml, gecko) chrome/48.0.2564.97 safari/537.36' \
  -H 'content-type: application/json;charset=utf-8' \
  -H 'accept: */*' \
  -H 'referer: mysite/campaign' \
  -H 'x-requested-with: xmlhttprequest' \
  -H 'connection: keep-alive' \
  -H '__requestverificationtoken: g2rd7fthmg12j00znultiszswquxaovh1hunxobxmcfizclrqueao4d3czoni1mz7hxell56yi5hci5vpc78m4sh8pivhwrckimccibi9xk1' \
  --data-binary '{"pagenumber":2,"sortcolumn":"scheduledrundate","sortascending":false,"pagesize":20,"collectionsize":308,"selectedaccountid":"1","searchterm":"","showinactive":true}' \
  --compressed
I also pulled these request headers from Chrome DevTools:
POST mysite/campaign/list HTTP/1.1
host: mysite
connection: keep-alive
content-length: 165
origin: mysite
user-agent: mozilla/5.0 (windows nt 6.1; wow64) applewebkit/537.36 (khtml, gecko) chrome/48.0.2564.97 safari/537.36
content-type: application/json;charset=utf-8
accept: */*
x-requested-with: xmlhttprequest
__requestverificationtoken: g2rd7fthmg12j00znultiszswquxaovh1hunxobxmcfizclrqueao4d3czoni1mz7hxell56yi5hci5vpc78m4sh8pivhwrckimccibi9xk1
referer: mysite/campaign
accept-encoding: gzip, deflate
accept-language: en-us,en;q=0.8
cookie: __requestverificationtoken_l0n5yxjhv2viug9ydgfs0=iecny-sonb09iy9mqmm3xl1bsbase8eha9j1fwupurhtmlldojgqpaljhziuhffh6zrnoygjsrkyuhj2krwissnxif76grnh_39lgvymj0i1; asp.net_sessionid=gojtobwzycl0lvs0ip4glf3n; mycompany.web.portal.auth=40c13baf08884380f805b99e217754f3d35920ce1861debb580dc143da4249c4682c33a36dd29272a3a844880110e4d0ec1f24298e4d1b2a4a94e3fa2cac08b934989acf155616d6cb5665338ff3cff82ead87bf93eb46fa3ba6aae6b00401f9
I tried converting this to Jsoup, with no luck. I tried using just the headers, and using the headers along with pagenumber, scheduledrundate, etc. passed as data. Both attempts return org.jsoup.HttpStatusException: HTTP error fetching URL. Status=500. Here is the code I'm attempting:
Document pageDoc = Jsoup.connect("mysite/campaign/list")
        .cookies(loginCookies)
        //.header("cookie", cookieList)
        .userAgent("mozilla/5.0")
        .referrer("mysite/campaign")
        //.data("username", username)
        //.data("password", password)
        //.followRedirects(true)
        .header("accept", "*/*")
        .header("accept-encoding", "gzip, deflate")
        .header("accept-language", "en-us,en;q=0.8")
        .header("connection", "keep-alive")
        .header("content-type", "application/json;charset=utf-8")
        .header("host", "mysite")
        .header("origin", "mysite")
        .header("referer", "mysite/campaign")
        .header("user-agent", "mozilla/5.0 (windows nt 6.1: wow64) applewebkit/537.36 (khtml, gecko) chrome/48.0.2564.97 safari/537.36")
        .header("x-requested-with", "xmlhttprequest")
        .header("__requestverificationtoken", pageToken)
        .header("content-length", "165") // not sure if this is needed; if it is, I have no idea how to set it
        .data("pagenumber", "2")
        .data("sortcolumn", "scheduledrundate")
        .data("sortascending", "false")
        .data("pagesize", "20")
        .data("collectionsize", "308")
        .data("selectedaccountid", "1")
        .data("searchterm", "")
        .data("showinactive", "true")
        .ignoreContentType(true)
        .post();
I can confirm the tokens are correct. When I comment out .header("x-requested-with", "xmlhttprequest") I receive a general error page (this is expected), so I know I'm connecting, but when I leave it in I get the 500. I can confirm the "mysite" links are correct; I had to remove the real ones per company policy. I'm not sure if, or how, I need to add pagenumber, sortcolumn, sortascending, etc. in Jsoup, so I blindly added them as data parameters as shown above.
Try removing .header("content-length", "165") and .header("content-type", "application/json;charset=utf-8"). Jsoup can add them for you.

Also try using FormElement; see the FormElement example.
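
One more thing that may be worth checking (this is an assumption on my part, nothing in the thread confirms it is the cause of the 500): curl's --data-binary sends the JSON string as the raw request body, while .data(...) pairs are normally sent as key=value form fields. The content-length of 165 in the captured request happens to equal the byte length of that JSON string, which suggests the server is being sent raw JSON. The sketch below uses only the JDK to compare the two bodies; the class name BodyComparison and its helper methods are hypothetical, and formEncode only approximates what a form post would carry.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of Jsoup or of the original question.
final class BodyComparison {

    // JSON payload copied from the curl --data-binary argument in the question.
    static final String JSON_BODY =
            "{\"pagenumber\":2,\"sortcolumn\":\"scheduledrundate\",\"sortascending\":false,"
            + "\"pagesize\":20,\"collectionsize\":308,\"selectedaccountid\":\"1\","
            + "\"searchterm\":\"\",\"showinactive\":true}";

    // The same fields as the .data(...) calls in the question, in the same order.
    static Map<String, String> questionFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("pagenumber", "2");
        fields.put("sortcolumn", "scheduledrundate");
        fields.put("sortascending", "false");
        fields.put("pagesize", "20");
        fields.put("collectionsize", "308");
        fields.put("selectedaccountid", "1");
        fields.put("searchterm", "");
        fields.put("showinactive", "true");
        return fields;
    }

    // Builds key=value&key=value, roughly the shape of a form-encoded POST body.
    static String formEncode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    // Byte length of the JSON body, i.e. what content-length would be for it.
    static int jsonLength() {
        return JSON_BODY.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String form = formEncode(questionFields());
        System.out.println("JSON body bytes: " + jsonLength());
        System.out.println("Form body bytes: " + form.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("Form body:       " + form);
    }
}
```

If the two bodies differ like this, a server expecting application/json could plausibly fail on the form-encoded one. Newer Jsoup versions have Connection.requestBody(String) (added in 1.9.1, as far as I know), so passing the JSON string there instead of the .data(...) calls might be worth a try; in that case the content-type header would presumably need to stay application/json.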