troy_pdx · 2016-07-10, 04:56

Hi!

I'm attempting to use lxml to scrape some table data and present it in a dialog as in the simple example below.

Code:
import xbmcaddon
import xbmcgui
from lxml import html
import requests

addon = xbmcaddon.Addon()
addonname = addon.getAddonInfo('name')

page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)

buyers = tree.xpath('//div[@title="buyer-name"]/text()')
prices = tree.xpath('//span[@class="item-price"]/text()')

line1 = buyers[0]
line2 = prices[0]

xbmcgui.Dialog().ok(addonname, line1, line2)

The first run works, but every subsequent run raises a TypeError exception.

Quote:
19:44:14 T:139666170165248 ERROR: GetDirectory - Error getting plugin://plugin.program.flypdx/
19:44:14 T:139666170165248 ERROR: CGUIMediaWindow::GetDirectory(plugin://plugin.program.flypdx/) failed
19:44:16 T:139665037612800 ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
- NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
Error Type: <type 'exceptions.TypeError'>
Error Contents: 'NoneType' object is not callable
Traceback (most recent call last):
File "/home/troys/.kodi/addons/plugin.program.flypdx/addon.py", line 13, in <module>
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
File "lxml.etree.pyx", line 1509, in lxml.etree._Element.xpath (src/lxml/lxml.etree.c:50702)
File "xpath.pxi", line 318, in lxml.etree.XPathElementEvaluator.__call__ (src/lxml/lxml.etree.c:145954)
File "xpath.pxi", line 241, in lxml.etree._XPathEvaluatorBase._handle_result (src/lxml/lxml.etree.c:144987)
File "extensions.pxi", line 621, in lxml.etree._unwrapXPathObject (src/lxml/lxml.etree.c:139973)
File "extensions.pxi", line 655, in lxml.etree._createNodeSetResult (src/lxml/lxml.etree.c:140328)
File "extensions.pxi", line 676, in lxml.etree._unpackNodeSetEntry (src/lxml/lxml.etree.c:140524)
File "extensions.pxi", line 800, in lxml.etree._buildElementStringResult (src/lxml/lxml.etree.c:141835)
File "extensions.pxi", line 749, in lxml.etree._elementStringResultFactory (src/lxml/lxml.etree.c:141331)
TypeError: 'NoneType' object is not callable
-->End of Python script error report<--
19:44:16 T:139666170165248 ERROR: GetDirectory - Error getting plugin://plugin.program.flypdx/
19:44:16 T:139666170165248 ERROR: CGUIMediaWindow::GetDirectory(plugin://plugin.program.flypdx/) failed
19:44:19 T:139665037612800 ERROR: EXCEPTION: Non-Existent Control 1
19:44:22 T:139665037612800 WARNING: Attempt to use invalid handle -1

Am I on the right track or should I be looking at the traditional XML-based scheme used for media scraping?

troy_pdx · 2016-07-12, 20:49

Here's my new approach, which uses urllib2 and regular expressions instead of lxml to avoid the problem.

Code:
import re
import urllib2

import xbmcaddon
import xbmcgui

addon = xbmcaddon.Addon()
addonname = addon.getAddonInfo('name')

url = 'http://econpy.pythonanywhere.com/ex/001.html'

# Fetch the raw HTML of the page.
req = urllib2.Request(url)
response = urllib2.urlopen(req)
link = response.read()
response.close()

# Extract the values with regular expressions instead of lxml.
buyers = re.compile('<div title="buyer-name">(.+?)</div>').findall(link)
prices = re.compile('<span class="item-price">(.+?)</span>').findall(link)

line1 = buyers[0]
line2 = prices[0]

xbmcgui.Dialog().ok(addonname, line1, line2)
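
The regex extraction step can be tried outside Kodi, which makes it easier to see what happens when the page layout changes. Below is a minimal sketch, not the add-on's actual code: the sample markup and the `first_match` helper are hypothetical and made up for illustration, and only the two patterns come from the post. It returns a default value when nothing matches, whereas indexing with `buyers[0]` would raise an IndexError on an empty result.

```python
import re

# Hypothetical sample markup, made up for illustration; it only mirrors the
# tag shapes that the patterns in the post expect.
SAMPLE_HTML = (
    '<div title="buyer-name">Example Buyer</div>'
    '<span class="item-price">$0.00</span>'
)

# The same patterns as in the post, compiled once.
BUYER_RE = re.compile(r'<div title="buyer-name">(.+?)</div>')
PRICE_RE = re.compile(r'<span class="item-price">(.+?)</span>')


def first_match(pattern, text, default=''):
    """Return the first captured group, or `default` if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else default


line1 = first_match(BUYER_RE, SAMPLE_HTML)
line2 = first_match(PRICE_RE, SAMPLE_HTML)

# A page without the expected tags falls back to the default value.
missing = first_match(BUYER_RE, '<p>no table here</p>', default='n/a')
```

In the add-on, `line1` and `line2` would then be passed to `xbmcgui.Dialog().ok(...)` as before, with the downloaded page in place of `SAMPLE_HTML`.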