Re: [hobbit] [Hobbit] URLPlus interest - looking for feedback
- To: hobbit (at) hswn.dk
- Subject: Re: [hobbit] [Hobbit] URLPlus interest - looking for feedback
- From: Ralph Mitchell <ralphmitchell (at) gmail.com>
- Date: Fri, 31 Jul 2009 12:57:47 -0500
- References: <29f517690907290934q2120eaaewf16287ab9cb6ffa8 (at) mail.gmail.com> <5A14875A0A461B48B967A66281C1DFCC0372A1C7 (at) dhreinsvxb03.messaging.danaherad.com> <997a524e0907310157h6b085626g69e7d50a37f721e4 (at) mail.gmail.com> <29f517690907310934y10d088e4n879a8b7635d253c2 (at) mail.gmail.com>
On Fri, Jul 31, 2009 at 11:34 AM, Gary Baluha <gumby3203 (at) gmail.com> wrote:
> On Fri, Jul 31, 2009 at 4:57 AM, Ralph Mitchell <ralphmitchell (at) gmail.com> wrote:
>
>> I could really have used something like your feature request about 6 years
>> ago. Instead I spent a lot of time handcrafting bash scripts to login to
>> web pages.
>>
>
> Yep, that's kind of how URLPlus got started in the first place ;-)
>
>
>> Don't get me started on the sites that hit you with 5 different types of
>> redirects before reaching the front page, or the sites where each input
>> field is held in its own personal form, and the submit button executes
>> javascript to copy the values into a form full of hidden fields for the
>> actual submittal.
>>
>
> The redirect issue actually isn't too difficult to work around. I have
> been working on a perl program that is capable of more in-depth session
> management than URLPlus is currently capable of, and the solution I'm using
> now seems to work pretty well. My goal is to eventually convert URLPlus
> from using a command-line curl solution, to my current one. This new method
> deals with multi-page redirects better.
>
It's not so much the multi-page redirects using the standard "302: page is
now elsewhere" format, as the other weird ways redirects are sometimes done.
The one that irritated me the most did all of these, in no particular
order:
1) meta-refresh with zero time delay and a new url
2) self-submitting form - i.e. a preloaded form with "form.submit();" at
the end of the html, between script tags
3) self-submitting form - another preloaded form, but with
"onLoad=form.submit();" in the html BODY tag
4) in script tags, change the page location via: top.location="newurl"
5) as above, but use "top.href", or "page.href" or something similar.
I'm not knocking your efforts - you've already done more than I ever did
towards a generic webpage check. I just think that the above are going to
be tricky to handle in an automated way without replicating a large fraction
of a web browser. But, now at least they're documented in the mailing list
for anyone interested in doing their own web checks... :)
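
For anyone who wants a starting point, here is a rough sketch of spotting
redirect types 1 to 5 above in a downloaded page. It is only an
illustration: the function name, the regular expressions and the labels
are my own assumptions, not anything from URLPlus or Hobbit, and real
pages will defeat simple pattern matching far more often than this
suggests.

```python
import re

# Hypothetical patterns for illustration; real-world markup varies widely.

# 1) <meta http-equiv="refresh" content="0; url=...">
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*'
    r'content\s*=\s*["\']?\s*\d+\s*;\s*url\s*=\s*([^"\'>\s]+)',
    re.IGNORECASE)

# 4) and 5) script assignments such as top.location="newurl"
#    or window.location.href = 'newurl'
JS_LOCATION = re.compile(
    r'(?:top|window|document|self|parent)\.location(?:\.href)?'
    r'\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE)

# 2) and 3) any call to .submit(), whether in script tags or in onLoad
FORM_SUBMIT = re.compile(r'\.submit\s*\(\s*\)', re.IGNORECASE)
FORM_ACTION = re.compile(
    r'<form[^>]+action\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def find_client_side_redirect(html):
    """Return (kind, target) for the first redirect trick found, else None."""
    m = META_REFRESH.search(html)
    if m:
        return ("meta-refresh", m.group(1))
    m = JS_LOCATION.search(html)
    if m:
        return ("js-location", m.group(1))
    # A .submit() call anywhere is only a hint: it may belong to an
    # ordinary button handler rather than a self-submitting form.
    if FORM_SUBMIT.search(html):
        m = FORM_ACTION.search(html)
        if m:
            return ("auto-submit-form", m.group(1))
    return None
```

A check script could call this on each response body and, when it returns
something, fetch or POST to the target and repeat. For the self-submitting
forms you would still have to collect the preloaded field values yourself,
and anything that builds its URL in script at run time is out of reach of
this approach.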
> As for the javascript part, that is a bit more difficult.
>
Especially when the page you just downloaded creates the form POST url
on-the-fly from some of the form elements filled in by the user. Yep, saw
that happen too... Another weird page ran a javascript function to generate
a random character string to include in the url - luckily the function
wasn't too hard to extract and shove through the SpiderMonkey javascript
interpreter... :)
Ralph Mitchell