Problems with cleanstrings - Printable Version
Kodi Community Forum (https://forum.kodi.tv), Thread: Problems with cleanstrings (/showthread.php?tid=76044)

Problems with cleanstrings - agrajagzz9 - 2010-06-22

Hi guys,

I would really appreciate some help debugging my cleanstrings entry in advancedsettings.xml. As I understand it, cleanstrings is used to extract a useful search string from a filename for scraping, by removing unwanted characters. Let me know if I've got the wrong idea here.

My problem is that I have a lot of movie series named with the convention seriesname - ## - moviename.xxx, and high definition movies with a [HD] flag at the start of the filename. This really seems to confuse the scraper, and for obvious reasons I wanted to avoid renaming all of the files. So I was relieved when I read about advancedsettings.xml and cleanstrings.

At first I tried a more general expression, but when that didn't work I wanted to confirm that it works for one specific string. So I moved my James Bond films into a subfolder and repeatedly added and removed the folder using different cleanstrings expressions, but it hasn't worked once. I have pasted extracts of the important parts of xbmc.log below (with debugging on).

Code:
14:12:20 T:3924 M:2603417600 NOTICE: Loaded advancedsettings.xml from special://profile/advancedsettings.xml

So you can see that it correctly loads the advanced settings file. But when the search is done, the part I wanted to remove is still in the string. I used the simplest possible expression, so I can't see what I could have done wrong unless I completely misunderstood something.

Thanks in advance for your help.

- agrajagzz9 - 2010-06-23

I have tried a lot of different cleanstrings expressions, and have actually managed to get some of them to have an effect. However, it seems that instead of removing only the matched string, everything after it is removed as well. For example, using the same scenario as in my previous post, I tried (- .. -), expecting it to remove only the "- 01 -" parts, but everything after that is also removed, leaving just the string "007". I had assumed that each <regexp> item would remove only the matched parts from a filename, because the wiki doesn't give a very detailed explanation of what it is supposed to do. Can anyone confirm exactly how cleanstrings is supposed to work?

- spiff - 2010-06-23

Code:
(\[.*\])

- agrajagzz9 - 2010-06-23

spiff,

In my experience, it will not only remove anything in brackets, but also everything after the first match. For example, if a file was named "apples [oranges] bananas" you would expect it to be changed to "apples bananas" after cleaning, removing only the string in brackets. But in my case it is changed to just "apples ". Is this normal, or is it only happening to me?

- spiff - 2010-06-23

Checked the code; you are correct, that's how it behaves.

your original problem - xbmchead - 2010-06-23

agrajagzz9 Wrote: Hi guys

Let's focus on your actual problem: you really need to rename your files but don't want to invest the time required, yes? Just get one of the many GUI-based file renamer programs available for the Mac and PC, and you'll find your original problem isn't really as hard as you thought. If it were me I'd just write a Q&D Perl program to rename the files, but it doesn't sound like regex is your friend. Having a consistently named library is a worthwhile exercise. |-<:)

- agrajagzz9 - 2010-06-24

xbmchead,

I have no problem using regex renaming tools to rename my files, and in fact I have used them extensively to get my library to how it is at the moment. I happen to like how my files are named and didn't want to change that just to use xbmc.

I'm not sure what the intention was when implementing the cleanstrings feature, but it seems like it would be much more useful if it allowed you to remove strings from anywhere within the filename.
The way it works at the moment, everything to be removed has to be at the end of the filename, otherwise the useful parts are also removed. Actually, with this implementation most of the default entry is unnecessary: you could just match the first bracket and everything after it would be wiped out as well. So cleanstrings is pretty limited in what it can do.

- jmarshall - 2010-06-24

Just using the first bracket would match too much. Patches are welcome to improve it.

- agrajagzz9 - 2010-06-24

I don't have any experience with large projects, but I'm looking into it at the moment...

RE: - OndrejPopp - 2012-11-01

(2010-06-24, 00:45)jmarshall Wrote: Just using the first bracket would match too much.

How about this one? This would be consistent with the re-match "cutting" behaviour mentioned in this thread. The current implementation only cuts off the tail, so there is no way to use it to cut off the head. For example, there is no way to transform I.Planet-of-the-apes into Planet-of-the-apes by cutting off the "I." head. With the following patch this becomes possible:

Code:
diff --git a/xbmc/Util.cpp b/xbmc/Util.cpp

Now, with the following cleanstrings addition,

Code:
<video>

you get,

Code:
00:15:07 T:140122601133824 INFO: trying to match 0:[0-9IVX]+[.] on <I.Planet-of-the-apes>

You could probably make the patch shorter by using

char * CRegExp::GetReplaceString ( const char * sReplaceExp )

but I am not familiar with this library, so I have not figured out (yet) how to specify the sReplaceExp (probably something like /matchRe// ), and I have not been able to find a reference guide. I was just looking through the doxygen-generated docs...

RE: Problems with cleanstrings - dragonflight - 2012-11-04

There is a way to remove the front of a string by using cleandatetime, EXCEPT there is a bug in the cleanstrings routine that prevents it from working.
(The original match is on strFileName, but the replacement string is taken from strTitleAndYear; your code corrects this problem.)

The way cleanstrings currently works is to clean everything after a match, so I think changing it to just delete the match (as you are doing with left and right) is wrong IMO, but otherwise I changed my personal copy to do just this.

Code:
- if ((j=reTags.RegFind(strFileName.c_str())) > 0)

I didn't submit it yet, because I wanted to introduce a new mechanism where cleanstrings would return a list of possibilities, but I haven't gotten around to it yet.

mike

PS: I laughed, because I looked into this because of Planet of the Apes as well (it was the first of many!)

RE: Problems with cleanstrings - iondream - 2014-07-16

Is anyone still interested in finishing and submitting this? It would be greatly appreciated to have functional searches without updating your file names/file structure. Thanks

Related
http://forum.xbmc.org/showthread.php?tid=144315&pid=1753789#pid1753789
http://trac.xbmc.org/ticket/13977
https://github.com/xbmc/xbmc/pull/1730

RE: - meimeiriver - 2019-01-17

(2010-06-24, 00:37)agrajagzz9 Wrote: I'm not sure what the intention was when implementing the cleanstrings feature, but it seems like it would be much more useful if it allowed you to remove strings from anywhere within the filename. The way it works at the moment, everything to be removed has to be at the end of the filename, otherwise the useful parts are also removed.

Couldn't agree more! cleanstrings should just remove the first match (and not basically append a hidden .* to the regex).

Also, the documentation is wrong. "Please note that everything right of the match (at the end of the file name) is removed." should apparently be "... everything right of the match, INCLUDING the match itself, is removed". Plus "(at the end of the file name)" makes no sense either, as it's simply the match string and everything after it.
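
To make the difference discussed above concrete, here is a minimal sketch in Python (not Kodi's C++ code) that contrasts the two behaviours on the "apples [oranges] bananas" example from earlier in the thread. The function names clean_truncate and clean_match_only are hypothetical and exist only for this illustration; the first models the "cut at the match" behaviour the posters report, the second the "remove only the match" behaviour they ask for.

```python
import re


def clean_truncate(name: str, pattern: str) -> str:
    """Model of the reported behaviour: cut the name at the start of the first match."""
    m = re.search(pattern, name)
    return name[:m.start()] if m else name


def clean_match_only(name: str, pattern: str) -> str:
    """Model of the requested behaviour: delete only the first matched text."""
    return re.sub(pattern, "", name, count=1)


name = "apples [oranges] bananas"
pattern = r"\[.*\]"  # the bracket expression quoted in the thread

print(repr(clean_truncate(name, pattern)))    # 'apples '
print(repr(clean_match_only(name, pattern)))  # 'apples  bananas'
```

Note that removing only the match leaves a double space behind, so a real implementation would presumably also need to tidy whitespace afterwards.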

RE: Problems with cleanstrings - mebia - 2021-02-08

I name my movies in the same way as @agrajagzz9 and have also resorted to manually matching them. Any chance of resurrecting this feature request? It would be a perfect solution for me.

EDIT: I decided to take this on: https://github.com/xbmc/xbmc/pull/19219
I slightly modified the solution proposed by @dragonflight.
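
As an illustration of what match-only removal would mean for the naming convention discussed in this thread, the following Python sketch applies a list of patterns and deletes only the matched text. It does not describe what the linked pull request does. The file name "[HD] 007 - 01 - Dr. No", the function clean_title and both pattern lists are assumptions made up for this example; the head pattern is OndrejPopp's [0-9IVX]+[.] with a ^ anchor added here.

```python
import re

# Hypothetical patterns for the "seriesname - ## - moviename" convention.
SERIES_PATTERNS = [
    r"^\[HD\]\s*",      # leading [HD] flag
    r"\s-\s\d\d\s-\s",  # " - ## - " series index between series and title
]

# OndrejPopp's head expression, anchored at the start (the anchor is an assumption).
HEAD_PATTERNS = [r"^[0-9IVX]+[.]"]


def clean_title(name: str, patterns: list[str]) -> str:
    """Remove the first match of each pattern, then collapse leftover whitespace."""
    for pattern in patterns:
        name = re.sub(pattern, " ", name, count=1)
    return " ".join(name.split())


print(clean_title("[HD] 007 - 01 - Dr. No", SERIES_PATTERNS))  # 007 Dr. No
print(clean_title("I.Planet-of-the-apes", HEAD_PATTERNS))      # Planet-of-the-apes
```

Replacing each match with a single space and collapsing whitespace at the end avoids gluing the series name to the movie title.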