Avoid hex-encoding in a text file for a searching program in Python
I have written a Python program that analyzes a server log (a text file) and finds the strings that do not match the user input. However, hex-encoded strings are not considered by the program. For example, in the following case the program says there are no non-matching values for the user input, although 'www.peoplesmonton.com' is present. Can you please help me avoid this?
for line in lines:
    match = re.search('\\b' + userinput + '\\b', line)
Sample text file:
https://www.mysite.com/myworks/accaply/inquiry.asp
http://www.peoplesmonton.com/amb/cgi-bin/bank/bank/ambt%20bank%20of%20frnak%20plc_asp.htm
http://www.peoplesmonton.com/comblk/cgi-bin/bank/bank/ambt%20bank%20of%20ambt%20plc_asp.htm
The information is URL-encoded (percent-encoded, e.g. %20 for a space). Use urllib2.unquote to decode it before searching.
>>> input = '''\
... https://www.mysite.com/myworks/accaply/inquiry.asp
... http://www.peoplesmonton.com/amb/cgi-bin/bank/bank/ambt%20bank%20of%20frnak%20plc_asp.htm
... http://www.peoplesmonton.com/comblk/cgi-bin/bank/bank/ambt%20bank%20of%20ambt%20plc_asp.htm
... '''
>>> import urllib2
>>> print urllib2.unquote(input)
https://www.mysite.com/myworks/accaply/inquiry.asp
http://www.peoplesmonton.com/amb/cgi-bin/bank/bank/ambt bank of frnak plc_asp.htm
http://www.peoplesmonton.com/comblk/cgi-bin/bank/bank/ambt bank of ambt plc_asp.htm
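To tie the decoding back to the search loop from the question, here is a minimal sketch of how the two could be combined. It assumes Python 3, where the same function is available as urllib.parse.unquote. The function name find_matches and the search term "bank of frnak" are made up for illustration. re.escape is an addition not in the original code; it keeps characters such as '.' in the user input from being treated as regex metacharacters.

```python
import re
from urllib.parse import unquote  # Python 3 home of unquote (assumption: Python 3)


def find_matches(lines, userinput):
    """Return the decoded lines that contain userinput as a whole word.

    find_matches is a hypothetical helper name, not part of the original program.
    """
    # Escape the input so '.' and similar characters match literally.
    pattern = re.compile(r'\b' + re.escape(userinput) + r'\b')
    matches = []
    for line in lines:
        decoded = unquote(line)  # turn %20 etc. back into plain characters
        if pattern.search(decoded):
            matches.append(decoded)
    return matches


sample = [
    "https://www.mysite.com/myworks/accaply/inquiry.asp",
    "http://www.peoplesmonton.com/amb/cgi-bin/bank/bank/ambt%20bank%20of%20frnak%20plc_asp.htm",
    "http://www.peoplesmonton.com/comblk/cgi-bin/bank/bank/ambt%20bank%20of%20ambt%20plc_asp.htm",
]

# Illustrative search term; it only matches once the %20 sequences are decoded.
hits = find_matches(sample, "bank of frnak")
for hit in hits:
    print(hit)
```

Decoding line by line keeps the loop structure from the question; for a large log this also avoids holding a second decoded copy of the whole file in memory.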