Posts: 1,506
Joined: Nov 2013
@ronie i wonder about the semantics for nfo files. i do not think it's wise to pass the whole thing as a parameter, which is what would happen if i kept the current logic - you potentially run out of argv space rather quickly.
as i see it we have two choices;
1) core resolves the path to the nfo file and this is passed to the add-on
or
2) we leave it up to the scraper to identify the nfo file and only pass entity path (problematic for artists).
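Option 1 could look roughly like this. It is only a minimal sketch: the `nfo` parameter name and the `build_args` / `read_nfo` helpers are hypothetical and not part of the current scraper API. The point is that only the resolved path travels through the argument string, and the add-on opens the file itself.

```python
from urllib.parse import parse_qs, urlencode


def build_args(action, nfo_path):
    """Core side (hypothetical): pass only the resolved path, never the contents."""
    return "?" + urlencode({"action": action, "nfo": nfo_path})


def read_nfo(arg_string):
    """Add-on side (hypothetical): parse the arguments and load the file itself."""
    params = parse_qs(arg_string.lstrip("?"))
    nfo_path = params["nfo"][0]
    with open(nfo_path, encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

The argument string stays about as long as the path, no matter how large the nfo file is.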
Posts: 4,545
Joined: Jun 2015
Reputation:
269
2017-02-15, 16:58
(This post was last modified: 2017-02-15, 17:22 by DaveBlake.)
EDITED as I answered my own questions, so came up with more.
Q: Do we use album artist mbid(s) when doing an album search at Musicbrainz if we have them but not the album mbid?
A: No.
But would Musicbrainz search support this?
Q: Do we use artist names individually when the album is a collaboration e.g. multiple album artist names, or just the album artist description string (which may not have the names in the same order or syntax)? For example:
"Orchestral works" "Georg Friedrich Händel; The English Concert, Trevor Pinnock"
"Riding with the King" " Eric Clapton & B. B. King"
A: we use the album artist description string.
But would Musicbrainz search support using individual names, for better accuracy?
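For what it's worth, the MusicBrainz search syntax is Lucene-based and, as far as I know, release searches accept an `arid` (artist MBID) field as well as `artist` name terms. Below is a minimal sketch of how such a query string could be put together, using MBIDs when we have them and individual names otherwise. `build_release_query` is a hypothetical helper, nothing is sent to the server, and the field names should be checked against the MusicBrainz search documentation before relying on them.

```python
def _quote(term):
    """Wrap a term in double quotes, escaping backslashes and embedded quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_release_query(album, artist_mbids=(), artist_names=()):
    """Build a Lucene-style release query, preferring artist MBIDs over names."""
    parts = ["release:" + _quote(album)]
    if artist_mbids:
        parts += ["arid:" + mbid for mbid in artist_mbids]
    else:
        parts += ["artist:" + _quote(name) for name in artist_names]
    return " AND ".join(parts)
```

For the collaboration example, passing the names individually gives `release:"Riding with the King" AND artist:"Eric Clapton" AND artist:"B. B. King"`, which does not depend on the order or separator used in the album artist string.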
Posts: 4,545
Joined: Jun 2015
Reputation:
269
Found that niggle over how automatic scraping works!
I actually don’t like the way that automatic scraping happens as we add each album.
Scraping albums and artists after the tag processing for all the song files in the library has been done, rather than scraping each album as it is added, would produce more accurate results.
Music is often tagged to a mixed standard: some albums have mbid tags, some don't. Once all the music files have been scanned, the tag processing will have set the mbid for an artist even if it was only present on one song. That song may not be on the first album by that artist, so automatic scraping as it stands would look up the artist by name alone and fetch details (possibly the wrong ones), only for tag scanning of the next album to provide the mbid. Result: the mbid is held along with the wrong artist data.
Doing the scraping after all the scanning would also mean that we could use the artist mbids on an album search even if that album’s song files didn’t have any mbid tags. Again better accuracy.
Speed is also an issue. Get all the tags scanned and all the artists, albums and songs into the library first and usable, then take time polling the servers for the additional information. We could even retry after server timeouts.
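To illustrate the ordering I mean, here is a minimal sketch under made-up assumptions: songs are plain dicts with hypothetical `artist` and `artist_mbid` keys, and no server is contacted. Pass 1 gathers every mbid found anywhere in the library, and only then does pass 2 decide how each artist should be looked up.

```python
def collect_artist_mbids(songs):
    """Pass 1: read every song's tags and remember any MBID seen per artist name."""
    mbids = {}
    for song in songs:
        mbid = song.get("artist_mbid")
        if mbid:
            mbids.setdefault(song["artist"], mbid)
    return mbids


def plan_scrapes(songs):
    """Pass 2: per artist, look up by MBID if any song had one, else fall back to name."""
    mbids = collect_artist_mbids(songs)
    plan = {}
    for song in songs:
        artist = song["artist"]
        if artist not in plan:
            plan[artist] = ("mbid", mbids[artist]) if artist in mbids else ("name", artist)
    return plan
```

With this ordering an artist whose first album has no mbid tags is still looked up by mbid, as long as any later song carries one.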
Posts: 1,506
Joined: Nov 2013
it's the scraper doing the extraction ronie, kodi core has no idea which urls are supported, that's up to the scraper to decide. we simply passed the *contents* of the nfo file through the relevant scraper function. hence the problem with mapping to the current API, because the entry point is handed the nfo contents.
currently the artist path is taken as the deepest common path for all songs by an artist. it works well for an artist/album type directory layout, but it won't work very well in general. but that's what my music collection had so that's what i wrote the code to do
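for reference, the "deepest common path" idea boils down to something like the sketch below. this is not the actual core code (which is C++), just an illustration in python; `artist_path` is a made-up name and paths are treated as posix-style strings.

```python
import posixpath


def artist_path(song_paths):
    """Return the deepest directory shared by all of an artist's song files."""
    folders = [posixpath.dirname(path) for path in song_paths]
    return posixpath.commonpath(folders)
```

with songs in `/music/Artist/Album1` and `/music/Artist/Album2` this gives `/music/Artist`. add one track from `/music/Compilations/Hits` and it collapses to `/music`, and an artist with a single album gets the album folder rather than an artist folder, which is why it doesn't work well in general.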
Posts: 1,506
Joined: Nov 2013
yup i know about the flaws but i don't do compilation albums, they are evil ;P
the thing you are planning is actually sort of already available through the library export / import functionality. some refinement around this and you have your stuff. i do not think it's worth complicating it beyond that, no need to store such a path in db imo.
Posts: 8,283
Joined: Jul 2014
Hi ronie and others,
As briefly discussed on Github, I'd like to share my ideas for the scrapers.
Currently it works as follows:
1. The Kodi scraper handles all basic information, including reading tags, folder structure and a few properties from online sources.
2. All kinds of addons provide additional information, e.g. artwork or additional metadata: addons like the cdart manager and my own skin helper scripts.
Both the Kodi scraper and the addons use online sources inefficiently: the online sources are scraped again even if the info has already been scraped once.
The above is true not only for the music library but also for the video database. The difference is that the video database supports scraping additional artwork directly into the art table, while the music db supports this internally but doesn't expose it in the JSON API.
End result: the Kodi database provides basic info (probably enough for most users btw) and addons provide additional info in window properties, files or whatever. So basically there are now 2 (or more) places where metadata of files can be stored.
I hope you get the point I'm making here, and yes, of course I know I am one of the devs that created the confusion by creating the skin helper addons.
This is what I have in mind for the future:
1. Make the Kodi scraper-engine the default for all scraping actions (metadata retrieval).
2. Make sure it supports multiple "modules" / sources. For example a basic scraper which is enabled by default and some additional scrapers like this:
- grab artwork from local directories and write it to the art table
- grab artwork from online sources like fanart.tv
- grab additional metadata e.g. ratings from last.fm etc.
- etc.
3. A user can enable some of the additional scrapers if they actually want that metadata to be scraped to the database.
4. All metadata is available in the Kodi database and as ListItem properties, meaning it is usable for all skins and scenarios. No ugly window properties needed, which not only overcomplicate things but also need system resources to monitor listitems in the background.
5. The Kodi default scraper is responsible for retrieving the correct IDs such as MusicBrainz IDs, IMDb IDs etc. No other addons should have to replicate this logic, as that will only cause confusion.
So basically what I'm suggesting is the possibility to have multiple "scraper-modules" for each scenario. For example a user can activate the "animated artwork" module for movies besides the default scraper. These scraper-modules will be special python modules (or C++) which can be created just the same as other kodi addons.
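To make the idea a bit more concrete, here is a minimal sketch of how a default scraper plus optional modules could be chained. Everything in it is hypothetical: the `ScraperModule` base class, the module names and the `run_scrapers` function do not exist in Kodi, and the "scraping" is faked from the item dict so the example stays self-contained.

```python
class ScraperModule:
    """Hypothetical base class: each module contributes key/value metadata for an item."""
    name = "base"

    def scrape(self, item):
        return {}


class BasicScraper(ScraperModule):
    name = "basic"

    def scrape(self, item):
        # Stand-in for identification; a real module would query an online source.
        return {"title": item["title"], "mbid": item.get("mbid", "")}


class LocalArtworkScraper(ScraperModule):
    name = "local-artwork"

    def scrape(self, item):
        # Stand-in for a directory scan; only builds the expected artwork path.
        return {"art.thumb": item["folder"] + "/folder.jpg"}


def run_scrapers(item, optional_modules, enabled):
    """Always run the default module, then only the optional ones the user enabled."""
    metadata = BasicScraper().scrape(item)
    for module in optional_modules:
        if module.name in enabled:
            metadata.update(module.scrape(item))
    return metadata
```

A user who enables nothing extra only gets the basic fields. Enabling `local-artwork` adds the `art.thumb` key to the same result, so all metadata ends up in one place.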
To achieve this level of flexibility I think there needs to be a database table (and internal logic) in the same way as how the art table works.
I can write any key/value in the art table and its accessible as listitem property.
Maybe keep the core info for a media item as separate named fields in the database structure, and have an additional table accepting key/value strings.
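Roughly what I have in mind, as a minimal sketch using an in-memory SQLite database. The `extra_info` table, its columns and the two helper functions are assumptions for illustration only, loosely modelled on how the art table is keyed by media id and media type. This is not an existing Kodi schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE extra_info ("
    " media_id INTEGER, media_type TEXT, key TEXT, value TEXT,"
    " PRIMARY KEY (media_id, media_type, key))"
)


def set_property(media_id, media_type, key, value):
    """Insert or overwrite one key/value pair for a media item."""
    conn.execute(
        "INSERT OR REPLACE INTO extra_info VALUES (?, ?, ?, ?)",
        (media_id, media_type, key, value),
    )


def get_properties(media_id, media_type):
    """Return all extra key/value pairs for a media item as a dict."""
    rows = conn.execute(
        "SELECT key, value FROM extra_info WHERE media_id = ? AND media_type = ?",
        (media_id, media_type),
    )
    return dict(rows.fetchall())
```

Any optional scraper could then write arbitrary keys without a schema change, and the core could expose whatever `get_properties` returns as listitem properties.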
I think this approach goes in much the same direction that Montelesse is taking with enhancing the scraper to support sources other than the filesystem. What I'm suggesting is that you have the default scraper (module) which identifies the files as media items. That's the basic stuff; atm this is done by scanning files on the filesystem, and maybe in the future this can be done in other ways too with Montelesse's work.
When the basic info is there, the additional/optional scrapers can provide additional metadata the user wants to have, such as ratings from other sources, additional artwork etc. etc.
I understand this is a pretty big project and needs a lot of thought and work. If you feel like this is some way forward, I will help out where I can.
For now, I have rewritten my skin helper addons (as discussed on Github with several team members) because the old one caused some issues.
As a first attempt to create something universal I created the new metadata module, in which I placed all scraping logic and caching. If for example 2 addons need metadata from TVDB, instead of grabbing the info themselves, they can use the module and retrieve the (cached) result.
This metadata module is a very rough first step, maybe to be used as an "optional/enhanced" scraper?
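The caching part works along these lines. This is a simplified sketch and not the actual module code: the `MetadataCache` class, its parameters and the in-memory dict are assumptions for illustration, and the fetch function is injected so no real online source is involved.

```python
import time


class MetadataCache:
    """Hypothetical shared cache: each (source, item id) pair is fetched at most once per TTL."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, source, item_id):
        key = (source, item_id)
        cached = self._entries.get(key)
        # Serve the stored result while it is still younger than the TTL.
        if cached is not None and self._clock() - cached[0] < self._ttl:
            return cached[1]
        result = self._fetch(source, item_id)
        self._entries[key] = (self._clock(), result)
        return result
```

If two addons ask for the same TVDB item through one shared `MetadataCache`, the fetch function runs once and the second caller gets the stored result.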
So, to get back on-topic: I like what is discussed here about giving the music scraping some more thought, and I can help out where I can, just ask. As stated above, I think this might even be taken to a higher level, enhancing all scraping logic with default and optional modules.