Release Universal Movie Scraper - Kodi Community Forum (https://forum.kodi.tv) - Forum: Movie Scrapers (https://forum.kodi.tv/forumdisplay.php?fid=302) - Thread: Release Universal Movie Scraper (/showthread.php?tid=129821)
RE: Universal Movie Scraper - olympia - 2021-06-15

(2021-06-14, 20:27)PatK Wrote: Left is universal, right is TMDB

I definitely need to have a look at the country, but for the rest, I think it is hit & miss, depending on whether the scraper lands on the old or the new layout.

The previous scraper version scraped the old layout. It was hit & miss: if the scraper landed on the NEW layout it failed, but if it landed on the OLD one it was OK. With the new scraper version it is the opposite. It is still hit & miss: if the scraper lands on the NEW layout it is OK, but on the OLD one it fails.

I switched over to the new layout because there were some reports that the previous scraper (supporting the old layout) was failing 90% of the time, which was somewhat confirmed by my tests yesterday. ...but the situation might be different today...

Sorry guys, I am not willing to support both layouts, because that would be a tremendous effort and also very difficult and time consuming to maintain.

RE: Universal Movie Scraper - olympia - 2021-06-15

(2021-06-15, 07:53)T-LO Wrote:
(2021-06-14, 11:20)olympia Wrote: Can you all try scraper version v5.5.0 please and report back?

To be honest, I don't even understand this... You report a crash for a Kodi version for which this scraper cannot even be installed. UMS v5.5.0 is maintained in the Leia scraper repo on purpose.

RE: Universal Movie Scraper - T-LO - 2021-06-15

(2021-06-15, 10:19)olympia Wrote:
(2021-06-15, 07:53)T-LO Wrote:
(2021-06-14, 11:20)olympia Wrote: Can you all try scraper version v5.5.0 please and report back?
To be honest, I don't even understand this...

Oh sorry, I got an automatic update yesterday and I thought it came from you. I just checked and I am on 2.9.13. This is something else then, right?

RE: Universal Movie Scraper - olympia - 2021-06-15

(2021-06-15, 14:49)T-LO Wrote: I just checked and I am on 2.9.13.

Yes, 2.9.13 is the latest UMS version for Kodi Jarvis and it is EOL, thus won't get any fixes.
RE: Universal Movie Scraper - olympia - 2021-06-15

I just pushed imdb.common scraper version v3.2.1 with the intention of supporting both the old and the new layout (I got fed up with the testing; it resulted in busy, hard-to-follow code, but it should work). Let me know your results once you've updated.

RE: Universal Movie Scraper - Zippy79 - 2021-06-15

(2021-06-15, 16:58)olympia Wrote: I just pushed imdb.common scraper version v3.2.1 with the intention of supporting both old and new layout (I've got fed up with the testing - resulted in busy, hard to follow code, but should work).

Thank you for the update. There is now some intermittent weirdness going on. The country of origin is sometimes scraped correctly and sometimes it is blank. When the country is scraped properly, the genre comes back with repeated entries, e.g. "Action \ Adventure \ Action \ Adventure". When the country is not scraped, the genre is correct. The original title field returns the Japanese title 100% of the time.

RE: Universal Movie Scraper - olympia - 2021-06-15

Can someone help me with which skin is showing the country?

RE: Universal Movie Scraper - olympia - 2021-06-15

(2021-06-15, 17:53)Zippy79 Wrote: The original title field returns the Japanese title 100% of the time.

This I cannot reproduce.

RE: Universal Movie Scraper - Zippy79 - 2021-06-16

(2021-06-15, 22:06)olympia Wrote: Can someone help me with which skin is showing the country?

I'm not sure I follow. I'm just using the default Estuary skin; you can see the country on the movie information page. If you keep refreshing the movie, it will sometimes switch from scraped to not scraped and vice versa (it can take a lot of refreshes to change, but it will eventually). I also verify whether it has been scraped by looking at column c21 in the movies table.

EDIT: I've figured out why the country of origin is sometimes not scraped. This is a snippet of the HTML from the new IMDB site:
and this is a snippet of the HTML from the old IMDB site:
Note the lack of a forward slash after the word title. This means that the regex doesn't match on the old site layout, so the country does not get scraped.

RE: Universal Movie Scraper - Zippy79 - 2021-06-16

(2021-06-15, 22:06)olympia Wrote:
(2021-06-15, 17:53)Zippy79 Wrote: The original title field returns the Japanese title 100% of the time.

I am in Japan, so it's possible that IMDB is doing something based on the country you access it from. However, before the site changes the scraper always returned the correct original title for me, and never just the Japanese title for every movie.

I'm trying to step through the scraper code to see why it's doing this, but it's pretty difficult to decipher. It's the ParseIMDBAKATitles function that scrapes the original title, right?

EDIT: So it's this regex that picks up the original title:
I've confirmed that when the browser language is set to English this regex returns "Raiders of the Lost Ark", and when the browser language is set to Japanese it returns "Reidâsu/Ushinawareta âku". So somehow IMDB is feeding the scraper the Japanese version of the page instead of the English version (based on IP address, I assume).

RE: Universal Movie Scraper - Zippy79 - 2021-06-16

(2021-06-16, 02:52)Zippy79 Wrote:
(2021-06-15, 22:06)olympia Wrote:
(2021-06-15, 17:53)Zippy79 Wrote: The original title field returns the Japanese title 100% of the time.
I am in Japan so it's possible that IMDB is doing something based on the country you access it from. However, before the site changes the scraper always used to return the correct original title for me and never just the Japanese title for every movie.

Replacing all instances of
Code:
|accept-language=en-us/
with
Code:
/|accept-language=en-us
(i.e. moving the forward slash to before the pipe) seems to fix this issue.

RE: Universal Movie Scraper - olympia - 2021-06-16

Very nice catches @Zippy79, they helped me to track down the issues. Especially the one with accept-language. I am not sure I would've spotted this myself.

Try imdb.common v3.2.2

RE: Universal Movie Scraper - WeirdH - 2021-06-16

(2021-06-15, 16:58)olympia Wrote: I just pushed imdb.common scraper version v3.2.1 with the intention of supporting both old and new layout (I've got fed up with the testing - resulted in busy, hard to follow code, but should work).

Ratings are scraped correctly with 3.2.1 again, great work. Thanks for the fixes.

RE: Universal Movie Scraper - Zippy79 - 2021-06-17

(2021-06-16, 18:45)olympia Wrote: Very nice catches @Zippy79, helped me to track down the issues.

Thanks for the update. All fields are looking good except original title: there are four instances of "|accept-language=en-us/" that need changing in metadata.universal/universal.xml. With those changed, the original title is scraped correctly.

EDIT: I spoke too soon.
This is an HTML snippet from the old IMDB layout:
and this is from the new layout:
Note the difference in the quotes around og:title. (Also, I think the old site layout is factually incorrect: the original name was just Raiders of the Lost Ark, but that's beside the point.)

RE: Universal Movie Scraper - olympia - 2021-06-17

Right, thanks for testing! I just pushed UMS version v5.5.1, please see if it is any better.
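For anyone following along, the two layout differences reported above (an optional forward slash after the word title, and different quotes around og:title) can in principle be tolerated by a single pattern each, and the repeated genres can be collapsed afterwards. The following is only a rough sketch in Python, not the scraper's actual XML regexes. The HTML samples, the patterns and the function names are all made up for illustration, since the real IMDB markup is not reproduced in this thread.

```python
import re

# Hypothetical markup, for illustration only (not the real IMDB HTML).
NEW_LAYOUT = '<a href="/search/title/?country_of_origin=US">United States</a>'
OLD_LAYOUT = '<a href="/search/title?country_of_origin=us">USA</a>'
DOUBLE_QUOTED = '<meta property="og:title" content="Raiders of the Lost Ark (1981)"/>'
SINGLE_QUOTED = "<meta property='og:title' content='Raiders of the Lost Ark (1981)'/>"

# "/?" makes the slash after "title" optional, so both layouts match.
COUNTRY_RE = re.compile(
    r'/search/title/?\?country_of_origin=[^"]+"[^>]*>([^<]+)<',
    re.IGNORECASE,
)

# (["']) captures whichever quote opens the attribute value and the
# backreference (\1, \2) requires the same quote to close it.
OG_TITLE_RE = re.compile(
    r'''<meta\s+property=(["'])og:title\1\s+content=(["'])(.*?)\2''',
    re.IGNORECASE,
)


def scrape_country(html):
    """Return the country link text, or None if the pattern does not match."""
    match = COUNTRY_RE.search(html)
    return match.group(1) if match else None


def scrape_og_title(html):
    """Return the og:title content regardless of the quote style used."""
    match = OG_TITLE_RE.search(html)
    return match.group(3) if match else None


def dedupe(items):
    """Drop repeated entries while keeping the first-seen order."""
    return list(dict.fromkeys(items))


if __name__ == "__main__":
    print(scrape_country(NEW_LAYOUT), "|", scrape_country(OLD_LAYOUT))
    print(scrape_og_title(DOUBLE_QUOTED), "|", scrape_og_title(SINGLE_QUOTED))
    print(" \\ ".join(dedupe(["Action", "Adventure", "Action", "Adventure"])))
```

The idea carries over to the scraper's own regexes only in spirit: an optional character or a quote character class costs little, whereas keeping separate patterns per layout is what makes the code hard to follow.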