Posts: 639
Joined: Feb 2004
Reputation:
0
Clumsy
Team-XBMC Forum Moderator
If this gets added, I vote for "disabled by default but optional". I personally wouldn't want data like that to be sent around the net associated with an IP address, although I can see how it could reduce server load for TMDb.
Posts: 1,545
Joined: Oct 2008
Reputation:
31
fekker
Posting Freak
While it might reduce the load on the tmdb server, it raises a few concerns:
a digital fingerprint of sorts: ip + hash of a specific version of a media file
dvd backups won't match unless they were done with the same app in the same manner
the many, many different versions of movie files out there
I'd have to agree: please make it off by default, with an option to remove / disable it.
The imdb id is a unique id that already serves a similar function without a specific link to the media file itself. There's the tmdb id as well, which could be added to the information about the media and parsed just like the imdb id.
Posts: 4,061
Joined: Oct 2007
Reputation:
90
zag
Team-Kodi Member
Just to reassure people: usage of the API is anonymous and, as far as I know, no IP addresses are stored. Only the application API key is transferred.
A hash will only be stored after 4 submissions of the same key, so it should cover all the popular internet releases and 1-click ripping solutions.
The open subtitle directory has been using this method for a while, and there have been no problems like the ones you describe that I know of. TMDb is using the same algorithm for cross compatibility.
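For anyone wondering what this kind of hash looks like in practice, here is a minimal sketch of the scheme as OpenSubtitles has publicly documented it: the file size plus a 64-bit sum over the first and last 64 KiB of the file. The thread itself does not spell out the algorithm, so treat the details as an assumption; the function name `movie_hash` is made up for illustration.

```python
import os
import struct

CHUNK_SIZE = 64 * 1024          # bytes read from each end of the file
MASK_64 = 0xFFFFFFFFFFFFFFFF    # keep the running sum within 64 bits


def _sum_words(chunk: bytes) -> int:
    """Sum a chunk as little-endian unsigned 64-bit words."""
    count = len(chunk) // 8
    return sum(struct.unpack("<%dQ" % count, chunk[:count * 8]))


def movie_hash(path: str) -> str:
    """Return a 16-digit hex hash (hypothetical helper, assumed algorithm)."""
    size = os.path.getsize(path)
    if size < CHUNK_SIZE * 2:
        raise ValueError("file too small to hash this way")
    with open(path, "rb") as f:
        head = f.read(CHUNK_SIZE)
        f.seek(size - CHUNK_SIZE)
        tail = f.read(CHUNK_SIZE)
    total = (size + _sum_words(head) + _sum_words(tail)) & MASK_64
    return "%016x" % total
```

Under this assumption only the file size and the two 64 KiB ends feed into the result, which is why it is cheap to compute even for multi-gigabyte files.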
Posts: 139
Joined: Aug 2008
Reputation:
1
I'm not doing this to help load, it's the holy grail of guaranteed title matching... by removing the fuzzy file name searches, we can greatly increase the chance of the movies being picked up properly.
Opensubtitles has been doing this for a while now, and yes, we are using their technology to make sure we're all doing it the same way. thetvdb.com has also said they'll be doing it as well. It's got nothing but benefits (I can't think of a single negative really) so it's a no brainer for us.
I've already been chatting with the XBMC guys, and it's on the table to be discussed. They have not given me a firm yes or no either way (yet), but it is interesting to hear all of your thoughts.
I don't store any personal information, so you don't need to worry about anything like that. It's completely anonymous. The only things uploaded are the hash of the file and the imdb_id, and that's it.
Imagine not caring what or how your files are named, and XBMC, just like magic, knowing what titles they are. Doesn't that sound pretty awesome?
Posts: 4,997
Joined: May 2004
Reputation:
12
I don't see the advantage here. You'll need a new hash for every version of every encoder and every container. Not to mention that tags (which are a far better idea) would completely ruin everything. Analyzing the raw encoded file data is useless. You need to devise a fingerprint based on the decoded content, similar to the musicbrainz ID. The musicbrainz ID may even be usable here w/o modification (assuming it's FLOSS), though it will probably take ages to calculate. You would still need a hash for every audio stream of every edition of the film, but surely that's far fewer permutations than with the current proposal.
Posts: 5,292
Joined: Jun 2006
Reputation:
62
Jezz_X
Team-XBMC Skinner
I'd like to point out that the only people who really benefit from this are people who download the same movie as everyone else, much like opensubtitles.org uses scene release names for the subtitles. This has little to no benefit to people who actually make copies of the dvds they own, because, like everyone else said, depending on what program you use to rip the file, what codec you use and so on, the hash will always be different.
Posts: 26,215
Joined: Oct 2003
Reputation:
187
We're still discussing this internally, so note that this is my opinion only.
I can see why this idea may be useful for the subtitles people, as the subs people download are designed for a particular encode of the movie, and may not work with other encodes of that same movie (timing issues).
However, I see no real benefit in the case of looking up metadata in terms of improving efficacy:
1. It's fairly clear that the primary beneficiaries are those who obtain "scene" releases, which brings implications of copyright infringement, and in turn possible privacy implications.
2. All such releases already have .nfo files with the download (else the "scene" rejects them), which I believe already have the imdb id in them. Thus, from XBMC's perspective, the gains appear minimal.
3. Clearly the 'hash' is computed based on the encoding, not based on content, so it's not particularly useful for identifying an item anyway (multiple 'hash's for each movie). The IMDb id (or tmdb id for that matter) appears better for this purpose, as it is unique to the movie, not the encoding.
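To illustrate point 2, here is a rough sketch of pulling an IMDb id out of the text of a .nfo file. It assumes the id appears in the usual `tt` plus digits form, for example inside an imdb.com URL; the function name `find_imdb_id` is hypothetical and this is not how XBMC itself necessarily does it.

```python
import re
from typing import Optional

# Assumption: IMDb title ids look like "tt" followed by at least 7 digits.
IMDB_ID_RE = re.compile(r"\btt\d{7,}\b")


def find_imdb_id(nfo_text: str) -> Optional[str]:
    """Return the first IMDb-style id found in the text, or None."""
    match = IMDB_ID_RE.search(nfo_text)
    return match.group(0) if match else None
```

The id found this way identifies the movie itself, so it stays the same no matter how the file was encoded.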
Cheers,
Jonathan
Posts: 1
Joined: Sep 2009
Reputation:
0
These hashes are "bit-for-bit", which means that even the slightest difference between two files will result in a different hash. This would include "legit" downloads, as each purchase would result in a different DRM wrapped file. In my opinion the hashing is absolutely useless in all cases except for torrent downloaders. I personally rip all my own dvds to ISOs, stripping out all the extra stuff (menus, special features, etc.) and leaving just the main title. Hashing those would be virtually useless, as others would have to do the exact same thing. And even then, I'm not so sure it would generate the exact same file. I ripped a dvd a long time ago and then re-ripped it just the other day the exact same way using the same app, and the hashes were different.
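A quick sketch of the "bit-for-bit" point, assuming a digest that covers the entire file. SHA-1 from Python's `hashlib` is used here purely as a stand-in; it is not claimed to be the hash discussed in this thread.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex digest over every byte of the input (SHA-1 as a stand-in)."""
    return hashlib.sha1(data).hexdigest()


original = bytes(1024)            # 1 KiB of zero bytes standing in for a file
flipped = bytearray(original)
flipped[512] ^= 0x01              # flip a single bit in the middle

print(digest(original) == digest(bytes(flipped)))  # False
```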
- Josh
Posts: 4,997
Joined: May 2004
Reputation:
12
If someone seriously wants hashing, take a look at libofa and improve it to work well with movies. It generates an audio fingerprint, but is intended for music and only accepts 135s of samples. This likely won't be enough to be accurate, due to the possibility of silent opening credits or the same intro song in a film.
Posts: 15
Joined: Nov 2008
Reputation:
0
2009-09-25, 19:34
hi
I think doing a hash search will be a waste. Instead, themoviedb could have a group of fields called ID-Cross-Reference, with field names like
imdb
ofdb
...
...
That way we can use only one scraper for TMDb, and missing info can be fetched from other sources pointed to by TMDb
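As a rough sketch of what such a record could look like to a scraper, assuming a plain dictionary layout. The key names, the placeholder values and the helper `external_id` are all made up for illustration and are not part of any real TMDb response.

```python
from typing import Any, Dict, Optional

# Hypothetical record; every value here is a placeholder, not real data.
movie_record: Dict[str, Any] = {
    "title": "Example Movie",
    "id_cross_reference": {
        "imdb": "tt0000000",
        "ofdb": None,  # no id known for this source yet
    },
}


def external_id(record: Dict[str, Any], source: str) -> Optional[str]:
    """Look up the id for another site, or None if it is not listed."""
    return record.get("id_cross_reference", {}).get(source)
```

A scraper could then call `external_id(movie_record, "imdb")` and only query the other site when an id is actually present.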
G