Kodi Community Forum
Release TMDb TV Show scraper (Python - Default Matrix Scraper) - Printable Version


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - Balo - 2021-06-26

(2021-06-26, 12:18)Karellen Wrote:
(2021-06-26, 12:08)Balo Wrote: it won't show up some posters
This scraper does not scrape every available image at TMDB. It needs to limit how many artwork images are scraped.

There is logic that limits artwork to 300 images total. This includes the images for posters, fanart, banner, clearlogo, clearart, discart, landscape, and then all the Season posters, banners and thumbs. It does not include the episode images. These are separate.

TheMovieDB Python for movies limits it to 10 fanart images.

Scraping every single artwork link available was causing problems for many users, especially those with large libraries and using MySQL. Also the fact that links can go stale. The online sites purge a lot of their artwork so you end up with a database with hundreds or thousands of broken links.
I don't think that's the case; there's hardly any art for this series. It just doesn't want to load that art. It is listed as "Japanese posters". I can only think that's why it skips them, but it shouldn't, because then it would skip most art for anime series.
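
A total artwork cap like the 300-image limit Karellen describes could be sketched roughly as below. This is only an illustration: the thread does not say how the scraper chooses which images to keep, so the round-robin strategy, the `cap_artwork` name, and the dictionary layout are all assumptions.

```python
from itertools import zip_longest

MAX_IMAGES = 300  # total cap quoted in the thread; everything else is hypothetical


def cap_artwork(art_by_type, max_images=MAX_IMAGES):
    """Keep at most max_images URLs in total, taking one image per type in turn."""
    kept = {art_type: [] for art_type in art_by_type}
    total = 0
    # Round-robin across art types so one type cannot use up the whole budget.
    for row in zip_longest(*art_by_type.values()):
        for art_type, url in zip(art_by_type, row):
            if url is None:  # this art type has run out of images
                continue
            if total >= max_images:
                return kept
            kept[art_type].append(url)
            total += 1
    return kept


# Hypothetical input: many posters, few fanart images.
art = {
    "poster": [f"poster-{i}" for i in range(400)],
    "fanart": [f"fanart-{i}" for i in range(10)],
}
capped = cap_artwork(art)
```

With this input all 10 fanart images survive and the posters are trimmed so that the combined total stays at 300.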


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - Karellen - 2021-06-26

(2021-06-26, 12:30)Balo Wrote: I don't think that's the case; there's hardly any art for this series.
I am tired of guessing. Give me the link to the show. Tell me what language you are trying to scrape. Provide a Debug Log that captures the scrape.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - Balo - 2021-06-27

(2021-06-26, 13:10)Karellen Wrote:
(2021-06-26, 12:30)Balo Wrote: I don't think that's the case; there's hardly any art for this series.
I am tired of guessing. Give me the link to the show. Tell me what language you are trying to scrape. Provide a Debug Log that captures the scrape.
It is related to the language of the art on TMDb. I uploaded some art categorized as "English posters" and it shows up in Kodi now. It seems that if you set the add-on language to something other than English (I have it set to Spanish), posters uploaded under other languages are not loaded. I.e., it only takes Spanish, with English as a fallback, and ignores the others.
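
If the behaviour is as Balo describes (not confirmed anywhere in this thread), it would correspond to a filter along the lines of the sketch below. The `pick_images` function is hypothetical and is not the scraper's code; the `iso_639_1` and `file_path` keys are modelled on the fields TMDb image entries carry, and the sample data is made up.

```python
def pick_images(images, preferred_lang, fallback_lang="en"):
    """Return images in the preferred language first, then the fallback language."""
    preferred = [img for img in images if img.get("iso_639_1") == preferred_lang]
    fallback = [
        img
        for img in images
        if img.get("iso_639_1") == fallback_lang and fallback_lang != preferred_lang
    ]
    # Images tagged with any other language are dropped entirely.
    return preferred + fallback


# Hypothetical poster list for a show with Spanish, English and Japanese posters.
posters = [
    {"file_path": "/es.jpg", "iso_639_1": "es"},
    {"file_path": "/en.jpg", "iso_639_1": "en"},
    {"file_path": "/ja.jpg", "iso_639_1": "ja"},
]
selected = pick_images(posters, "es")
selected_paths = [img["file_path"] for img in selected]
```

Under this assumption a poster tagged as Japanese would never be selected when the add-on language is Spanish, which would match what Balo reports.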


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-06-27

(2021-06-27, 13:11)Balo Wrote:
It is related to the language of the art on TMDb. I uploaded some art categorized as "English posters" and it shows up in Kodi now. It seems that if you set the add-on language to something other than English (I have it set to Spanish), posters uploaded under other languages are not loaded. I.e., it only takes Spanish, with English as a fallback, and ignores the others.

If you don't provide an example show and a debug log, we aren't going to be able to look into this.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-07-18

Just an FYI that there is a release coming soon to fix the IMDB ratings (IMDB updated their web site, which broke the HTML scrape to get the ratings).  Special thanks to gumida for reporting this and submitting a patch to fix it.  Please note this fix will be available for Matrix only. No further updates are being done for Leia.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - littlejeem - 2021-08-02

Hello Folks

I'm looking for some help nailing down an issue I'm having getting TMDB (TV Shows) to scrape a poster for a TV show.

The show is 71365-battlestar-galactica. I know this show is a bit of a PITA, and I've checked out the link here, which really helped with the naming. I am specifically looking at the 2003 series (essentially a mini-series with two files) as scraped by TMDB.

Kodi / the scraper pulls in the data for the TV Show and displays it fine in all parts...
https://i.imgur.com/YKijdwV.jpg

...except the poster, and I can't figure out what's going wrong.
https://i.imgur.com/QvYbTYM.jpg

I've attached the debug log here. This covers the period after I removed the series from the library and rebooted, i.e. a library update.

Any help much appreciated

Sys Info
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
Kernel is: 4.15.0-151-generic
Kodi (19.1 (19.1.0) Git:20210509-85e05228b4)

littlejeem


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-08-03

(2021-08-02, 16:39)littlejeem Wrote: Hello Folks

I'm looking for some help nailing down an issue I'm having getting TMDB (TV Shows) to scrape a poster for a TV show.

The show is 71365-battlestar-galactica. I know this show is a bit of a PITA, and I've checked out the link here, which really helped with the naming. I am specifically looking at the 2003 series (essentially a mini-series with two files) as scraped by TMDB.

I've attached the debug log here. This covers the period after I removed the series from the library and rebooted, i.e. a library update.

Any help much appreciated

Sys Info
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
Kernel is: 4.15.0-151-generic
Kodi (19.1 (19.1.0) Git:20210509-85e05228b4)

littlejeem

That log shows you using the old (but still installed) XML-based scraper, not the Python-based one.  If you want to use the new scraper, you need to update your sources to use it.  If you upgraded from Kodi 18, Kodi 19 kept your scraper choices, so even though TMDb TV Shows is the default scraper, it doesn't replace your old choices.  Also, it looks like you haven't updated your add-ons in a while. The log shows the Python scraper at version 1.4.0, but 1.4.6 is the current one, so you should update before making any other changes.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - littlejeem - 2021-08-03

(2021-08-03, 01:59)pkscout Wrote:
That log shows you using the old (but still installed) XML-based scraper, not the Python-based one.  If you want to use the new scraper, you need to update your sources to use it.  If you upgraded from Kodi 18, Kodi 19 kept your scraper choices, so even though TMDb TV Shows is the default scraper, it doesn't replace your old choices.  Also, it looks like you haven't updated your add-ons in a while. The log shows the Python scraper at version 1.4.0, but 1.4.6 is the current one, so you should update before making any other changes.

Thank you @pkscout, I had no idea I was running something out of date. I'll have a look and see if I can figure out what I need to do.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - schumi2004 - 2021-08-03

Today I started from scratch with my TV show collection and switched to the default Matrix scraper (I'm on Kodi 19.1).
I noticed that 3 shows didn't scrape correctly and only produced the following error in the log for every episode in the series:

....... online, but we have no episode guide. Check your tvshow.nfo and make sure the <episodeguide> tag is in place.

The affected shows (3 out of 49) were:

The Walking Dead
Game of Thrones
Supernatural


None of the others gave any issues. I switched back to The TVDB (new) for these 3 shows and it retrieved the information correctly.
I also checked TMDB for these specific shows but did not see any issues with them. I'm also not using any NFO files, as the error message suggested.

Any suggestions?


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - Karellen - 2021-08-04

(2021-08-03, 23:38)schumi2004 Wrote: ....... online, but we have no episode guide. Check your tvshow.nfo and make sure the <episodeguide> tag is in place.
Need the full Debug Log which captures you scraping those shows.
But the message certainly suggests nfo files or the wrong episode guide URL in your database, especially as it works when you switch over to TVDB indicating you have a TVDB episodeguide URL.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - schumi2004 - 2021-08-04

(2021-08-04, 00:05)Karellen Wrote:
(2021-08-03, 23:38)schumi2004 Wrote: ....... online, but we have no episode guide. Check your tvshow.nfo and make sure the <episodeguide> tag is in place.
Need the full Debug Log which captures you scraping those shows.
But the message certainly suggests nfo files or the wrong episode guide URL in your database, especially as it works when you switch over to TVDB indicating you have a TVDB episodeguide URL.
Here you go: cizajimada.kodi (paste)


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - Karellen - 2021-08-04

(2021-08-04, 09:24)schumi2004 Wrote: Here you go: cizajimada.kodi (paste)
Thanks. I can see the issue.
Line 1735 of your log, you will see Undefined MySQL error: Code (1406). It is related to this problem... https://github.com/xbmc/xbmc/issues/15768

The new python scrapers are meant to overcome this issue by limiting the artwork, but with 15 seasons in Supernatural, it looks like the scraper needs a bit of tweaking for these larger shows.

Game of Thrones is known to suffer this problem. I have not come across it with The Walking Dead, but looking at the number of seasons and the massive amount of artwork, it will be the same issue. Neither of these shows is in your log.

Your choices are:
1. Implement the solution detailed in that Issue report. One method is here... https://github.com/xbmc/xbmc/issues/15768#issuecomment-564677726
2. This is an issue that @pkscout will need to look at to tweak the artwork scraping, so you could wait for a fix, but I have no idea how quickly that will happen
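
For context, MySQL error 1406 is "Data too long for column", and a MySQL TEXT column holds at most 65,535 bytes. A minimal sketch of checking whether a block of artwork text would fit is shown below. The `<thumb>` entry is a made-up placeholder, since the actual stored text is not shown in this thread, and `fits_in_text_column` is a hypothetical helper, not part of Kodi or the scraper.

```python
MYSQL_TEXT_MAX_BYTES = 65_535  # maximum size of a MySQL TEXT column


def fits_in_text_column(value, limit=MYSQL_TEXT_MAX_BYTES):
    """Return True if value, encoded as UTF-8, fits in a MySQL TEXT column."""
    return len(value.encode("utf-8")) <= limit


# Hypothetical artwork entry; the real stored format is an assumption here.
entry = '<thumb aspect="poster">https://example.invalid/poster.jpg</thumb>'

small_show = entry * 100    # a show with little artwork
large_show = entry * 2000   # a show with many seasons and lots of artwork

small_ok = fits_in_text_column(small_show)
large_ok = fits_in_text_column(large_show)
```

In this sketch the small show fits and the large one does not, which is the situation that would trigger the error on insert.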


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - schumi2004 - 2021-08-04

(2021-08-04, 10:19)Karellen Wrote:
(2021-08-04, 09:24)schumi2004 Wrote: Here you go: cizajimada.kodi (paste)
Thanks. I can see the issue.
Line 1735 of your log, you will see Undefined MySQL error: Code (1406). It is related to this problem... https://github.com/xbmc/xbmc/issues/15768

The new python scrapers are meant to overcome this issue by limiting the artwork, but with 15 seasons in Supernatural, it looks like the scraper needs a bit of tweaking for these larger shows.

Game of Thrones is known to suffer this problem. I have not come across it with The Walking Dead, but looking at the number of seasons and the massive amount of artwork, it will be the same issue. Neither of these shows is in your log.

Your choices are:
1. Implement the solution detailed in that Issue report. One method is here... https://github.com/xbmc/xbmc/issues/15768#issuecomment-564677726
2. This is an issue that @pkscout will need to look at to tweak the artwork scraping, so you could wait for a fix, but I have no idea how quickly that will happen

Hi, thanks for your swift reply.

You're correct that those other shows are not in the log, I thought if 1 show has it then others will probably be the same.
I checked the number of seasons etc. and they are indeed pretty large:

Supernatural would be the largest, with 15 seasons and approximately 20 eps each.
GOT with 8 seasons and 10 eps each except for s07 with 7eps and s08 with 6eps (at least in my archive)
The Walking Dead with 10 seasons and also almost 20 eps each generates indeed a lot of artwork.

Thanks for the feedback and workarounds.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-08-05

(2021-08-04, 12:53)schumi2004 Wrote:
Hi, thanks for your swift reply.

You're correct that those other shows are not in the log, I thought if 1 show has it then others will probably be the same.
I checked the number of seasons etc. and they are indeed pretty large:

Supernatural would be the largest, with 15 seasons and approximately 20 eps each.
GOT with 8 seasons and 10 eps each except for s07 with 7eps and s08 with 6eps (at least in my archive)
The Walking Dead with 10 seasons and also almost 20 eps each generates indeed a lot of artwork.

Thanks for the feedback and workarounds.
Well, that's bizarre.  We tested and tested and tested this to make sure the scraper never generated anything close to 65K characters.  But Supernatural clocks in at 80K. Looking at the text that's being saved, I think there might have been a small change with 19.1 to how the image data are stored that means each image creates a larger blob of text. I can't be 100% sure, but the image references definitely look different to me.  Anyway, I reduced the max images from 350 to 250, and Supernatural clocked in at 56K characters.  I've done the pull request for the update, so once someone else on the team reviews it, the update will go out.  Hopefully not more than a few days.
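
As a rough back-of-the-envelope check of those figures, the sketch below works out the text size per image and how many images would fit under the 65,535-byte MySQL TEXT limit. It treats "80K" and "56K" as 80,000 and 56,000, assumes the image cap was actually reached in both cases, and assumes one character per byte, none of which is confirmed in the thread.

```python
TEXT_LIMIT = 65_535  # MySQL TEXT column limit in bytes


def chars_per_image(total_chars, image_count):
    """Average number of stored characters per image."""
    return total_chars / image_count


def max_images_within(limit, per_image):
    """Largest whole number of images whose text fits within limit."""
    return int(limit // per_image)


before = chars_per_image(80_000, 350)  # old cap, Supernatural at about 80K
after = chars_per_image(56_000, 250)   # new cap, Supernatural at about 56K

# Use the larger per-image figure to stay on the safe side.
ceiling = max_images_within(TEXT_LIMIT, max(before, after))
```

Under these assumptions each image costs roughly 224 to 229 characters, so the limit would be hit somewhere around 286 images, and a cap of 250 leaves some headroom.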


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - littlejeem - 2021-08-07

(2021-08-03, 01:59)pkscout Wrote:
That log shows you using the old (but still installed) XML-based scraper, not the Python-based one.  If you want to use the new scraper, you need to update your sources to use it.  If you upgraded from Kodi 18, Kodi 19 kept your scraper choices, so even though TMDb TV Shows is the default scraper, it doesn't replace your old choices.  Also, it looks like you haven't updated your add-ons in a while. The log shows the Python scraper at version 1.4.0, but 1.4.6 is the current one, so you should update before making any other changes.
Hi @pkscout that worked spot on. Many thanks for the assist