Kodi Community Forum
Release TMDb TV Show scraper (Python - Default Matrix Scraper) - Printable Version

+----- Thread: Release TMDb TV Show scraper (Python - Default Matrix Scraper) (/showthread.php?tid=357232)



RE: TheMovieDB Python - TV Show scraper - pkscout - 2020-10-27

(2020-10-27, 19:18)ramis52 Wrote: Hi,

Maybe this is the wrong section, but I have a question about how the scraper and Kodi Matrix handle images. I noticed that when I use Matrix with the new Python scrapers, the free storage on my Fire TV Cube keeps shrinking. I used the following settings in the scraper and in the Kodi Matrix artwork section:

scraper: clearart, clearlogo, poster
matrix: custom, clearart, clearlogo

After scraping Law & Order: SVU (20 seasons) and Blue Bloods (10 seasons), the thumbnail folder was already 700 MB. Did I miss a setting? This doesn't happen on Leia.

regards
r
First, you might want to check and make sure you have the most current version of the scraper (v.1.3.0). That version eliminates all the art settings in favor of using the art settings from Matrix core.  That won't affect the rest of the answer I'm about to give, but since you mentioned two sets of settings, I thought I'd mention it.

That image cache folder isn't just for scrapes; it's for all artwork, so the size of the cache depends on lots of things, including TV shows, movies, skin images, other add-ons, etc.  And once something is cached, it stays for quite a while, so unless you started with an absolutely clean install of Matrix (including deleting all your settings) and then scraped ONLY those two shows, it's hard to say that the 700 MB is all from those two shows.

Having said all that, here's how the caching works.  The scraper provides Kodi with URLs to the various art, and then Kodi caches the default one for each image type at the time of the show scrape. For something like SVU given your settings, that means 1 show poster, 1 show fanart (I'm pretty sure you can't disable fanart without disabling all art, but I'm still learning the new Matrix behavior), 1 show clearart, 1 show clearlogo, and then 20 season posters.  If you open the art selector, Kodi also caches a small thumbnail of every available image for that section and then caches the larger version of whatever you select. So if you are changing art a lot trying to find the one you like, I could see how you'd get to 700 MB even with a couple of shows with lots of seasons, but you'd have to be changing a fair amount of art.
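
As a rough sanity check on the numbers above, here is a minimal sketch that counts only the default images described in this post (one image per show-level art type plus one poster per season). The function name and the list of art types are illustrative assumptions, not part of the scraper or Kodi, and the count ignores anything else that may land in the cache.

```python
def default_art_count(seasons, show_art_types=("poster", "fanart", "clearart", "clearlogo")):
    """Count images cached at scrape time: one per show-level art type, plus one poster per season."""
    return len(show_art_types) + seasons


svu = default_art_count(20)          # Law & Order: SVU, 20 seasons
blue_bloods = default_art_count(10)  # Blue Bloods, 10 seasons
total = svu + blue_bloods

# If the whole 700 MB came from these defaults alone, each image would have to
# average this many megabytes.
average_mb = 700 / total
print(svu, blue_bloods, total, round(average_mb, 1))
```

Under these assumptions the two shows account for 38 default images, so the defaults alone are unlikely to explain a 700 MB cache.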


RE: TheMovieDB Python - TV Show scraper - ramis52 - 2020-10-30

Hi,

Thanks for the comment and the explanation of the caching :)

I started with a clean install on a Fire TV Stick a few minutes ago. I used the newest scraper and the newest Matrix nightly version. No skin or other add-ons are installed besides this scraper. This time I set "use fanart.tv download" to true and used the basic artwork settings in Media.

After scraping the two series mentioned above, I checked the thumbnail folder and it was that big again. Then I checked some of the images inside and noticed that many actor pictures have a size of ~400 KB. The same pictures in Leia have a size of ~16 KB.

Maybe there is something wrong with how Matrix or the scraper handles the "Download Actor thumbnails" setting in the Media settings?


RE: TheMovieDB Python - TV Show scraper - pkscout - 2020-10-30

I looked through some of the actor images on TMDb for those two shows, and at least a few of them are very large (like 1200 x 1800).  The new Python scraper provides a URL for the original image, so when large images like that are uploaded to TMDb, Kodi caches them at that size.  I suspect that if I just change the scraper to provide a URL for a smaller image, someone will complain about the quality of the images.  I may look at adding a setting to provide "smaller" actor images for people with space-constrained devices, but I need to figure out whether to just provide one smaller image option or something like "small, medium, large, original."  I'm guessing the older scraper on Leia provides a URL for a smaller image, which is why you're seeing the difference.  And at 16 KB, it is a *really* small image.
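
To illustrate the "small, medium, large, original" idea, here is a minimal sketch of how such a setting could map onto TMDb image URLs. It is not the scraper's actual code: the setting values, the function name, and the example file path are hypothetical, and the size segments are the ones TMDb's configuration endpoint is understood to list for profile images, so they should be checked against the live API before being relied on.

```python
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p/"

# Hypothetical setting values mapped to TMDb profile-image size segments
# (assumed values; verify against TMDb's /configuration response).
ACTOR_IMAGE_SIZES = {
    "small": "w45",
    "medium": "w185",
    "large": "h632",
    "original": "original",
}


def actor_image_url(file_path, size_setting="original"):
    """Build an actor image URL; unknown settings fall back to the original image."""
    size = ACTOR_IMAGE_SIZES.get(size_setting, "original")
    return TMDB_IMAGE_BASE + size + file_path


# "/abc123.jpg" is a made-up path used only for illustration.
print(actor_image_url("/abc123.jpg", "medium"))
print(actor_image_url("/abc123.jpg"))
```

Because Kodi caches whatever the URL returns, only the size segment needs to change to reduce the cached file size.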


RE: TheMovieDB Python - TV Show scraper - ramis52 - 2020-10-31

Hi,

Thanks for considering an option for "small, medium, large". I switched over to path substitution and am now using my Pi-hole device for the thumbnails, so I have plenty of space available ;)

Another thing I think should be looked at is the poster download. A lot of posters are in Russian after the initial scrape (1000+ series, ~120 have Russian posters). I have no problem changing them, but maybe there could be an option so that the preferred language is downloaded first, with English as the fallback. Another thing with posters: I think the scraper adds posters without a title by default. Correct me if I'm wrong. I like posters with show titles a lot more. Maybe this could be an option in the scraper settings too.

Thanks a lot for your good work on the TV and movie scrapers.


RE: TheMovieDB Python - TV Show scraper - pkscout - 2020-11-01

Here's the logic for posters.  If the primary poster on TMDb has no language, it is listed as the default poster.  If not, then the first poster that matches your language is used as the default. Otherwise the first English poster is used as the default.  Because this is the default scraper, we try and keep the options and settings to a minimum, and since you can easily change posters, I'm probably not going to add any options for changing that logic.
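
The selection order described above can be sketched roughly as follows. This is an illustration rather than the scraper's real implementation: the function name is made up, the "primary" poster is assumed to be the first entry in the list, each poster is assumed to be a dict with the `iso_639_1` and `file_path` keys used in TMDb's image data, and returning `None` when nothing matches is a placeholder for behavior the post does not specify.

```python
def pick_default_poster(posters, language, fallback="en"):
    """Pick the default poster following the order described in the post."""
    if not posters:
        return None
    primary = posters[0]  # assumption: the first entry is TMDb's primary poster
    if not primary.get("iso_639_1"):
        # A primary poster with no language is used as the default.
        return primary
    # Otherwise take the first poster in the user's language, then the first English one.
    for wanted in (language, fallback):
        for poster in posters:
            if poster.get("iso_639_1") == wanted:
                return poster
    return None  # assumption: the post does not say what happens here


# Made-up sample data for illustration only.
sample = [
    {"iso_639_1": "ru", "file_path": "/ru.jpg"},
    {"iso_639_1": "en", "file_path": "/en.jpg"},
    {"iso_639_1": "de", "file_path": "/de.jpg"},
]
print(pick_default_poster(sample, "de")["file_path"])
print(pick_default_poster(sample, "fr")["file_path"])
```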


RE: TheMovieDB Python - TV Show scraper - roidy - 2020-11-14

@pkscout TheMovieDB Python TV Show scraper seems to be broken at the moment. Lots of errors as shown in the log and only about a quarter of my library seems to scrape.

The original XML scraper has also completely stopped working, so I'm guessing this is some problem or change with TMDb's API.

https://pastebin.com/PZdFygjw


RE: TheMovieDB Python - TV Show scraper - Hebotsuki - 2020-11-14

It's not working for me either. According to Travis from TheMovieDB, some new fields have been added to the credits to support translated person names in the future. That's probably what broke the scraper. I guess the scraper developers are already aware of the problem.


RE: TheMovieDB Python - TV Show scraper - roidy - 2020-11-14

@pkscout Just grabbed your changes from github and can confirm that it fixes the issue. Thanks.


RE: TheMovieDB Python - TV Show scraper - Karellen - 2020-11-14

@roidy

Do you mind providing another log, taken with the update installed, that captures you scraping a TV show? Thanks.


RE: TheMovieDB Python - TV Show scraper - pkscout - 2020-11-15

(2020-11-14, 20:58)Hebotsuki Wrote: It's not working for me either. According to Travis from TheMovieDB, some new fields have been added to the credits to support translated person names in the future. That's probably what broke the scraper. I guess the scraper developers are already aware of the problem.
Just FYI that there is an update available from the repo now for this. If you don't get it automatically, you can force check for updates, and that should download it.


RE: TheMovieDB Python - TV Show scraper - MeC!as - 2020-11-15

I just tried the "TMDb TV Shows" scraper 1.1.18 and nothing is scraped at all. If I go to the files (within Kodi) and just try to scrape that episode to the database, an error message pops up saying "No information found!". That is the same behaviour as with "The Movie Database" scraper 3.5.5. I also believe it has something to do with API changes, and maybe the recent changes to the "TMDB common scraper" v3.2.x.


RE: TheMovieDB Python - TV Show scraper - Karellen - 2020-11-15

(2020-11-15, 08:52)MeC!as Wrote: I just tried the "TMDb TV Shows" scraper 1.1.18 and nothing is scraped at all.
All seems to be working ok here.

(2020-11-15, 08:52)MeC!as Wrote: If I go to the files (within Kodi) and just try to scrape that episode to the database, an error message pops up saying "No information found!".
That might be your problem. You don't scrape an episode, you scrape the tv show, which then finds the episodes and scrapes them.


RE: TheMovieDB Python - TV Show scraper - MeC!as - 2020-11-15

Thanks, you are right. The Python scraper works; I apparently used an example with umlauts (mysteriöse Mordfälle), and that one did not work. I have now renamed all files to English (the preferred language in the scraper settings is German), and it now scrapes the English titles. So the scraper ignores the German titles, since it cannot read them, but the fallback works fine. I can live with it for now.

EDIT:
The episode titles are in German, the issue with the umlauts is apparently just with the TV show title.
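
The thread does not say what actually causes the error with umlauts, so the following is only a generic sketch of how a non-ASCII show title is usually percent-encoded as UTF-8 before it is placed in a search URL. The function name is made up, and the `query` and `language` parameter names are assumptions for illustration, not taken from the scraper's code.

```python
from urllib.parse import urlencode


def build_search_query(title, language):
    """Encode a show title and language as a URL query string (UTF-8 percent-encoding)."""
    return urlencode({"query": title, "language": language})


print(build_search_query("mysteriöse Mordfälle", "de"))
```

If a title is passed along as raw bytes or in a different encoding at some step, the search can fail even though the same show is found under its English name, which would be one possible (unconfirmed) explanation for the behavior described here.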


RE: TheMovieDB Python - TV Show scraper - Karellen - 2020-11-15

(2020-11-15, 09:24)MeC!as Wrote: The Python scraper works, I apparently used an example with umlauts (mysteriöse Mordfälle). This one did not work.
Thanks. I have tried and it errors for me also. I thought we fixed this.

@pkscout can anything be done about this? Line 1254: https://paste.kodi.tv/ususoyanor.kodi


RE: TheMovieDB Python - TV Show scraper - roidy - 2020-11-15

@pkscout Seems I'm now getting a different error on some TV Shows

Line 1371 - https://paste.kodi.tv/uwocupacat.kodi