Hi all.
I've just started using XBMC, and I immediately found this great scraper that lets me scrape Rotten Tomatoes scores. I've seen in previous posts, though, that the RT scraping part is not perfect, due to some problems on RT's side that the scraper can't correct directly.
I read the wiki documentation on creating scrapers, and I see that some things cannot be done (like showing a second selection window only for RT). I got some ideas while reading, though, and I was wondering whether one of these solutions might actually help in retrieving the correct result (perhaps letting users decide in the scraper settings which RT scraping method to use).
Method 1 - Use RT directly:
- get the usual stuff from IMDB
- query RT the standard way (i.e.
http://www.rottentomatoes.com/search/?search={title + optionally year})
- on the resulting page, check all the listed results with a regexp, looking for the one that matches the title AND the year (or whose year is closest; I'm not sure this can be done)
- get the actual movie url and scrape data as usual
I'm not sure this is a viable option: it's a variation (simplification?) of what has already been proposed and declared not possible. I was wondering, though, whether the whole searching-in-results thing might be possible using nested regexps and custom functions...
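To make the matching step of Method 1 concrete, here is a rough sketch in Python (not scraper XML, just the logic). It assumes the search results have already been pulled out of the page as (title, year, url) entries; the function names and the sample URLs are made up for illustration, and I haven't checked whether the scraper framework can express this.

```python
import re

def normalize(title):
    """Lowercase and collapse punctuation so small formatting differences don't block a match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def pick_best_match(candidates, title, year):
    """Return the URL of the candidate with the same title and the closest year, or None.

    candidates: list of (title, year, url) tuples, assumed to be already
    extracted from the RT search results page (hypothetical input format).
    """
    wanted = normalize(title)
    same_title = [c for c in candidates if normalize(c[0]) == wanted]
    if not same_title:
        return None
    # An exact year gives a distance of 0; otherwise the closest year wins.
    best = min(same_title, key=lambda c: abs(c[1] - year))
    return best[2]

# Made-up sample data, not real RT URLs.
results = [
    ("The Thing", 1982, "/m/example_thing_1982/"),
    ("The Thing", 2011, "/m/example_thing_2011/"),
    ("The Thing from Another World", 1951, "/m/example_thing_1951/"),
]
print(pick_best_match(results, "The Thing", 1983))
```

The "closest year" fallback is there because IMDB and RT may disagree by a year on the release date; requiring an exact year would be the stricter variant.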
Method 2 - Use good old Google:
- get the usual stuff from IMDB
- search Google for title, year and site (with site:rottentomatoes.com, something like:
http://www.google.com/#hl=en&q={title}+{year}+site:rottentomatoes.com)
- get the very first result and hope for the best
I'm a bit more confident about the viability of this second option, even though I'm not sure the "double jump" (Google first, RT later) is doable with the current framework. I wasn't able to find enough documentation on the wiki regarding this possibility...
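For what it's worth, here is a rough Python sketch of the two hops in Method 2, again only to show the logic. One caveat: everything after the # in a URL stays in the browser and is never sent to the server, so the sketch uses the /search?q= form instead of the #hl=en&q= form. The function names are made up, and the link pattern assumes RT movie pages live under /m/ and that the result page contains the RT URL in plain form; both are assumptions on my part.

```python
import re
from urllib.parse import urlencode

def google_rt_query_url(title, year):
    """First hop: build a Google search URL restricted to rottentomatoes.com."""
    query = f"{title} {year} site:rottentomatoes.com"
    return "http://www.google.com/search?" + urlencode({"hl": "en", "q": query})

# Assumed shape of an RT movie link inside the result page.
RT_MOVIE_LINK = re.compile(r'https?://www\.rottentomatoes\.com/m/[^"&?#\s<>]+')

def first_rt_movie_url(html):
    """Second hop: take the very first RT movie link found in the page, or None."""
    match = RT_MOVIE_LINK.search(html)
    return match.group(0) if match else None

print(google_rt_query_url("The Thing", 1982))

# Made-up result snippet, only to exercise the pattern.
sample_html = '<a href="http://www.rottentomatoes.com/m/example_movie/">Example</a>'
print(first_rt_movie_url(sample_html))
```

Whether Google would serve a usable result page to a scraper at all is something I can't confirm; this only shows what the "get the very first result and hope for the best" step would look like.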