Kodi Community Forum
Release TMDb TV Show scraper (Python - Default Matrix Scraper)


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - Jondar - 2021-03-26

Hi!

I'm almost finished in my process of migrating from TVDB to TMDB TV Show Python.

I'm having problems with the show "Bleach", using this episode_group URL:

https://www.themoviedb.org/tv/30984-bleach/episode_group/5b54a34dc3a3680b6803c43d

That URL is the only line in the tvshow.nfo file, for parsing. All video files are named according to that particular listing.

Kodi pops up an "Error: check the log..." message and continues on. It finds the show but only scans the first "season", ignoring the other files.

A check of the debug log indicates an "*** Unhandled exception detected: <class 'IndexError'> list index out of range ***" error

Debug log, that I hope is helpful: https://pastebin.pl/view/851f75ac

I'm running Kodi 19.0 on Windows 10, 20H2, with version 1.4.2 of the scraper.

Thanks in advance,

~Jondar


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-03-26

Thanks for the detailed report and debug log.  It looks like that episode group has a season with no episodes in it (season 16), and the scraper doesn't handle seasons with no episodes.  I have corrected the code to account for that and was able to scrape all 15 seasons of Bleach on my test setup.  I will push an update to the main Kodi repo today.  It generally takes a couple days to get changes approved and pushed out. Unless you turned auto updating off, you should get the update automatically (it'll be version 1.4.3).
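
For readers wondering what this kind of failure looks like, here is a minimal sketch. It is not the scraper's actual code: the data layout (a `groups` list whose entries carry an `order` and an `episodes` list) and the function name `first_episode_per_season` are assumptions made for illustration. Indexing the first episode of an empty season raises the same `IndexError` seen in the log, and skipping empty seasons avoids it.

```python
def first_episode_per_season(episode_group):
    """Map each season's order number to its first episode, skipping empty seasons."""
    result = {}
    for season in episode_group.get("groups", []):
        episodes = season.get("episodes") or []
        if not episodes:
            # An unguarded episodes[0] here would raise
            # "IndexError: list index out of range".
            continue
        result[season["order"]] = episodes[0]
    return result


# Hypothetical data, shaped loosely like an episode group with one empty season.
sample_group = {
    "groups": [
        {"order": 1, "episodes": [{"name": "Episode A"}]},
        {"order": 16, "episodes": []},
    ]
}
print(first_episode_per_season(sample_group))
```

The guard simply ignores the empty season, so the remaining seasons are still processed.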


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - Jondar - 2021-03-27

Thanks for the quick fix and update! I'm also glad that my report and log were helpful!

About an hour after I'd posted, I looked at the log again, and had a suspicion that the empty "season" was the culprit. As a test, I'd used another episode group for Bleach (that was almost identical) and it worked. So, I'm glad that my suspicion was correct.

Thanks again.

~Jondar


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - aitte - 2021-03-27

Thanks for updating this plugin and modernizing it.

Something has become very broken in its language handling, though. It worked in Kodi 18 but is broken in Kodi 19.

I spent an hour trying every combination of settings (keep original titles on/off, and certification country, mainly) and trying both "The Movie Database" and "TMDb TV Shows" scrapers in Kodi 19 but it won't behave properly.

Between each test, I deleted the TV Shows source and wiped my library to get a fresh retry.

I have set the scraper to preferred language: sv-SE. (Swedish)

It doesn't download episode titles, summaries or fanart anymore. It only downloads the show poster and main show title.

Strangely enough each episode gets a generic Swedish title like "Avsnitt 3" (Episode 3) but not the actual sv-SE titles from the website.

Without further ado, here is the show I am trying. The proper episode titles are things like Japan and Hawaii. And there is supposed to be art for each episode (the scraper had that in Kodi 18).

https://www.themoviedb.org/tv/47463-en-stark-resa-med-morgan-ola-conny/season/6/episode/1/changes

So yeah all of this worked in previous Kodi 18. Something broke after that.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-03-27

The TMDb folks have been changing the way the API returns language results, and with the state of things right now there is nothing I can reasonably do to make things work. There is a todo item on their project list to send back fallback language information when you request a show (like the website does), but there is no timetable for delivering that. I'm sorry that there's not better information than that. This also affects Kodi 18, so going back won't fix anything for you.

As some background, the API used to return empty fields if the language selected didn't have a translation. That made it pretty easy to tell what needed fallback so that you could grab the English version and fill in the gaps. Then they changed it so that a few things (like episode names) got generic episode names in English (like Episode 1, etc).  OK, not great, but you could still kind of tell that something needed fallback if it had a generic title starting with "Episode." Then they changed it so that the generic information is in the selected language.  And that's where things broke.  There is no reasonable way to determine if something is a valid title or not without having the correct translation for every single generic entry in every single language to compare against.  So we wait.
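
The detection problem described above can be sketched roughly as follows. This is only an illustration, not the scraper's code: the function name `needs_fallback` and the exact placeholder pattern are assumptions. The check catches empty fields and English placeholders such as "Episode 3", but a localized placeholder such as "Avsnitt 3" looks like any other title.

```python
import re

# Matches placeholders such as "Episode 3". This only works while the API
# returns its generic titles in English.
ENGLISH_PLACEHOLDER = re.compile(r"^Episode \d+$")


def needs_fallback(title):
    """Return True if the title is empty or looks like an English placeholder."""
    title = (title or "").strip()
    if not title:
        return True
    return bool(ENGLISH_PLACEHOLDER.match(title))


print(needs_fallback(""))           # empty field (the oldest API behaviour)
print(needs_fallback("Episode 3"))  # English placeholder (the later behaviour)
print(needs_fallback("Avsnitt 3"))  # localized placeholder slips through
print(needs_fallback("Japan"))      # a real title
```

Catching "Avsnitt 3" as well would require a placeholder pattern for every language the API can return, which is the translation table mentioned above.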


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - aitte - 2021-03-27

Hello, thank you so much for the detailed explanation. That makes sense and is an unfortunate situation. I can understand how it would be nearly impossible to detect blank metadata and the need for a query in English. I hope they fix their API soon. I found the tracker item for this lack of fallback in API queries:

https://trello.com/c/UV1IGYfN/3-add-support-to-fallback-translation-queries

I wonder one thing though. My show example HAS sv-SE metadata because it is a Swedish show. But the scraper doesn't download that even though it is set to prefer sv-SE. It just gets generic Swedish titles. Not the actual Swedish metadata stored on this site:

https://www.themoviedb.org/tv/47463-en-stark-resa-med-morgan-ola-conny/season/6/episode/1/changes

I almost wonder if it is some other issue too, such as the plugin asking for the wrong language tag when asking for the Swedish metadata?

It is bizarre to me that TMDb returns generic Swedish titles and no metadata even though the site literally contains proper titles and metadata in Swedish. Seems like an API error too.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-03-28

Well, this was a bizarre one.  Apparently that show has two entries. On the web site, both look like they are populated, but when one of them is called via the API, there is very little metadata.  And of course the one the API returned first is the one without the metadata.  I think if you add a tvshow.nfo file to the directory and point it to:

https://www.themoviedb.org/tv/47463-en-stark-resa-med-morgan-ola-conny

You will find that you get better data.  Without that, the API returns this version of the show:

https://www.themoviedb.org/tv/114824

As I said, the second one looks fine on the web site but has little API data. The alternative is to refresh the data for the show (including all episodes) and then select the second entry when prompted for a show.
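
As a rough illustration of why a tvshow.nfo containing only a URL pins the show, here is a sketch of pulling the numeric show ID (and an episode group ID, if present) out of such a URL. The regular expression and the function name `parse_tvshow_nfo` are assumptions for this example and not necessarily how the scraper parses the file.

```python
import re

# Captures the numeric show ID after /tv/ and, optionally, an episode group ID.
TMDB_TV_URL = re.compile(
    r"themoviedb\.org/tv/(\d+)[^/\s]*(?:/episode_group/([0-9a-f]+))?"
)


def parse_tvshow_nfo(text):
    """Return (show_id, episode_group_id) from nfo text, or None if no URL is found."""
    match = TMDB_TV_URL.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_tvshow_nfo(
    "https://www.themoviedb.org/tv/47463-en-stark-resa-med-morgan-ola-conny"
))
print(parse_tvshow_nfo(
    "https://www.themoviedb.org/tv/30984-bleach/episode_group/5b54a34dc3a3680b6803c43d"
))
```

With an explicit ID there is no title search involved, so the duplicate entry can't be picked first.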


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - aitte - 2021-03-28

Oh, thank you so much for solving that mystery and for the workaround! I see that some toddler created an empty duplicate of the show on December 18, 2020. I will have to talk to the TMDb admins about that. :)


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - aitte - 2021-03-28

In fact looking at the changes, it looks like a total fool has found a fully translated Norwegian entry for the show, hijacked it, renamed it all back to Swedish with empty descriptions. Oh dear. Definitely gonna talk to the admin there about this. 😂

https://www.themoviedb.org/tv/114824-ut-pa-tur-med-morgan-og-ola-conny/changes


RE: TMDB TV Show scraper - roidy - 2021-04-05

(2021-03-16, 13:17)roidy Wrote: https://github.com/xbmc/xbmc/blob/e99b7ddc3681e35022e2d1510f373342d34d5e79/addons/metadata.tvshows.themoviedb.org.python/libs/data_utils.py#L239-L241

Any chance we could get a clean Studio value without the country added?

If a skin wants to display the country as well then it should really be up to the skin to add it, thanks.

@pkscout Sorry for bringing this issue up again, but I just ran into a secondary problem with the country being added to the end of the studio: resource studio image packs no longer work, as they all rely on the studio name being clean with no country code.

I know I added the option to remove the country in a pull request, but maybe the default should be no country added. Otherwise anybody using this scraper with the default settings will no longer get studio icons.
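
To make the icon problem concrete, here is a small sketch. Everything in it is hypothetical: the "Name (CC)" label format, the `format_studio` and `find_icon` helpers, and the example studio are assumptions for illustration, not the scraper's or any image pack's actual code.

```python
def format_studio(name, country_code="", append_country=False):
    """Return the studio label; the country is appended only when requested."""
    if append_country and country_code:
        return "{} ({})".format(name, country_code)
    return name


# Hypothetical icon lookup keyed by the clean studio name.
STUDIO_ICONS = {"Example Network": "Example Network.png"}


def find_icon(studio_label):
    """Return the icon file for a studio label, or None if there is no match."""
    return STUDIO_ICONS.get(studio_label)


print(find_icon(format_studio("Example Network", "US")))
print(find_icon(format_studio("Example Network", "US", append_country=True)))
```

The lookup only succeeds for the clean name, which is why a default of no country keeps studio icons working.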


RE: TMDB TV Show scraper - pkscout - 2021-04-05

Since it appears there is less of a standard way of doing this than I thought, I don't really have an opinion on this. I'd like the combination to still be an option, but if you'd like to submit a PR so that separate country and studio codes are the default, I'm fine with that. And at this point you only need to do Matrix. I'm not backporting anything to Leia unless it's to fix a major problem.


RE: TMDB TV Show scraper - roidy - 2021-04-05

Ok, done, thanks.

https://github.com/xbmc/metadata.tvshows.themoviedb.org.python/pull/36


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - roidy - 2021-04-07

@pkscout For some reason Kodi doesn't support the tagline c03 database entry for TV shows, so could we use the plotoutline entry for this instead, please? At the moment plotoutline is just being set to the same value as plot.

Forget that, tv shows don't even support the plotoutline database field even though the scraper sets it.


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - roidy - 2021-04-07

@pkscout 

Quick question: is a show's status updated on a library update?


RE: TMDb TV Show scraper (Python - Default Matrix Scraper) - pkscout - 2021-04-07

I don't understand the question. What status are you talking about?