jenkins - Parse HTML file using Python without external module


I am trying to parse an HTML file using Python without any external module. The reason is that I am triggering a Jenkins job and running into import issues with lxml and BeautifulSoup (I tried resolving them, but I think I was over-engineering things to get it done).

input:

    <tr class="test">
      <td class="test">
        <a href="a.html">ba</a>
      </td>
      <td class="duration">
        0.000s
      </td>
      <td class="zero number">0</td>
      <td class="zero number">0</td>
      <td class="zero number">0</td>
      <td class="passrate">
        n/a
      </td>
    </tr>

    <tr class="test">
      <td class="test">
        <a href="o.html">aa</a>
      </td>
      <td class="duration">
        0.000s
      </td>
      <td class="zero number">0</td>
      <td class="zero number">0</td>
      <td class="zero number">0</td>
      <td class="passrate">
        n/a
      </td>
    </tr>

    <tr class="test">
      <td class="test">
        <a href="g.html">videoads</a>
      </td>
      <td class="duration">
        0.390s
      </td>
      <td class="zero number">0</td>
      <td class="zero number">0</td>
      <td class="zero number">0</td>
      <td class="passrate">
        n/a
      </td>
    </tr>

    <tr class="suite">
      <td colspan="2" class="totallabel">total</td>
      <td class="zero number">271</td>
      <td class="zero number">0</td>
      <td class="zero number">3</td>
      <td class="passrate suite">
        98%
      </td>
    </tr>

output:

I want to take the specific tr block with class "suite" (see the end of the input) and pull the values of its three "zero number" cells and its "passrate suite" cell. Finally, I want to print those values.

~~~~~~~~~~~~~~~~~~~~~~~~~~

e.g. zero number = 271 ...

pass rate = 98%

~~~~~~~~~~~~~~~~~~~~~~~~~~

Here is what I tried with lxml:

    tree = parse(html_file)
    tds = tree.xpath("//tr[@class='suite']//td/text()")
    val = map(str.strip, tds)

This works locally, but I want to do it without external dependencies. Should I use strip(), or open the file after checking it with os.path.isfile()? I may not be on the right track, so please advise or walk me through a solution.

For a single element you can try the re module or plain string functions.

    data = '''<tr class="test">
    <td class="test"> <a href="no.html">track</a></td>
    <td class="duration">0.390s</td>
    <td class="zero number">0</td>
    <td class="zero number">0</td>
    <td class="zero number">0</td>
    <td class="passrate">n/a</td></tr>

    <tr class="suite">
    <td colspan="2" class="totallabel">total</td>
    <td class="passed number">271</td>
    <td class="zero number">0</td>
    <td class="failed number">3</td>
    <td class="passrate suite">98%</td>
    </tr>'''

    # re module

    import re

    # raw string so that \d is passed to the regex engine unchanged
    print(re.search(r'suite">(\d+)%', data).group(1))

    # string functions

    before = 'passrate suite">'
    after = '%'
    start = data.find(before) + len(before)  # index just past the marker
    stop = data.find(after, start)           # first '%' after the marker

    print(data[start:stop])
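The string-function approach above can be wrapped in a small helper that reports a missing marker instead of silently slicing from the wrong offset (`str.find` returns -1 when nothing matches). This is only a sketch: `text_between` and `sample` are hypothetical names introduced here, and the sample is a shortened stand-in for the report. It cuts out the `suite` row first and then reads the pass rate from inside it.

```python
def text_between(text, before, after, start=0):
    """Return the text between the first `before` and the next `after`.

    Returns None when either marker is missing.
    """
    begin = text.find(before, start)
    if begin == -1:
        return None
    begin += len(before)
    end = text.find(after, begin)
    if end == -1:
        return None
    return text[begin:end]


# Shortened sample in the same shape as the report (assumed data).
sample = '''<tr class="test"> <td class="passrate">n/a</td> </tr>
<tr class="suite"> <td class="zero number">271</td>
<td class="passrate suite"> 98% </td> </tr>'''

# Isolate the suite row first, then read the pass rate inside it.
suite_row = text_between(sample, '<tr class="suite">', '</tr>')
pass_rate = text_between(suite_row, 'passrate suite">', '</td>').strip()
print(pass_rate)
```

Restricting the search to the suite row keeps the lookup from matching a cell in one of the earlier `test` rows.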

Edit: the other values, using re

    import re

    print('passed:', re.search(r'passed number">(\d+)', data).group(1))
    print('zero:', re.search(r'zero number">(\d+)', data).group(1))
    print('failed:', re.search(r'failed number">(\d+)', data).group(1))
    print('rate:', re.search(r'suite">(\d+)', data).group(1))

Output:

    passed: 271
    zero: 0
    failed: 3
    rate: 98
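The regex lookups above depend on each cell having a distinct class name. In the question's input all three number cells in the suite row share the class `zero number`, so searching by class would hit a cell in the first `test` row instead. One way around this with the standard library only is `html.parser.HTMLParser`, which ships with Python. The sketch below is one possible approach, not the answerer's method: `SuiteRowParser`, `suite_values` and `sample` are hypothetical names, and the sample is a shortened copy of the question's input.

```python
from html.parser import HTMLParser


class SuiteRowParser(HTMLParser):
    """Collect the text of every <td> inside <tr class="suite">."""

    def __init__(self):
        super().__init__()
        self.in_suite_row = False
        self.in_cell = False
        self.cells = []        # list of (class attribute, stripped text)
        self._cell_class = ""
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            # Only the row whose class is exactly "suite" is of interest.
            self.in_suite_row = attrs.get("class") == "suite"
        elif tag == "td" and self.in_suite_row:
            self.in_cell = True
            self._cell_class = attrs.get("class") or ""
            self._chunks = []

    def handle_data(self, data):
        if self.in_cell:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self.in_cell:
            text = "".join(self._chunks).strip()
            self.cells.append((self._cell_class, text))
            self.in_cell = False
        elif tag == "tr":
            self.in_suite_row = False


def suite_values(html_text):
    """Return [(class, text), ...] for the cells of the suite row."""
    parser = SuiteRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.cells


# Shortened copy of the question's input (assumed sample data).
sample = '''
<tr class="test"> <td class="zero number">0</td> <td class="passrate"> n/a </td> </tr>
<tr class="suite">
  <td colspan="2" class="totallabel">total</td>
  <td class="zero number">271</td>
  <td class="zero number">0</td>
  <td class="zero number">3</td>
  <td class="passrate suite">
      98%
  </td>
</tr>
'''

for cell_class, text in suite_values(sample):
    print(cell_class, "=", text)
```

To run it against the real report, the file contents could be read with something like `open(html_file, encoding="utf-8").read()` and passed to `suite_values`; `html_file` is the name used in the question's lxml snippet, and the encoding is an assumption.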
