ScraperEdit for XBMC (Java)
Kodi Community Forum (https://forum.kodi.tv), Forum: Development (https://forum.kodi.tv/forumdisplay.php?fid=32), Forum: Scrapers (https://forum.kodi.tv/forumdisplay.php?fid=60)
Thread: ScraperEdit for XBMC (Java) (/showthread.php?tid=145204)

ScraperEdit for XBMC (Java) - UsagiYojimbo - 2012-11-14

22 months after Nicezia's last response on the Scraper Editor (Based on ScraperXML open source C# Library) thread, I put together a similar editor in Java. It runs under Java 1.6, but that is the whole requirements list...

Project page on SourceForge, with description, screenshots, and download links. Also available on SoftPedia.

I thought it needed a standalone topic, so I opened one.

Below is a list of changes:
Code:
v 0.1.2.66 usagi @ 2013-03-04

Known issues / planned features:
Code:
- Due to lack of documentation many features missing.

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2012-11-15

(2012-11-10, 13:35)takoi Wrote: - When using the "check scraper" menu to create new functions, the tree list is not updated until you create another one via right-click
Fixed.

(2012-11-10, 13:35)takoi Wrote: - Removing functions does not work
Fixed.

(2012-11-10, 13:35)takoi Wrote: - The "expression" box is cleared if you click on a regexp with empty expression
And why is that a problem?

(2012-11-10, 13:35)takoi Wrote: - When browsing for files, there's no way to enter a hidden folder
Fixed.

(2012-11-10, 13:35)takoi Wrote: - $INFO[language] etc, $$n and %20 in the output attribute are not substituted
The wiki pages Scrapers and HOW-TO: Write media scrapers do not mention that $$variables should be substituted in output. Neither $INFO nor %20 is mentioned. If you can give me a specification that describes these, I will implement them.

(2012-11-14, 21:33)spiff Wrote: $INFO[foo] reads the string setting from resources/settings.xml (or more likely, the user data equivalent). it is replaced by the string value prior to a regexp execution, as well as in an output string.
I will look into them... $$n is added.

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2012-11-19

New version is out...

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2012-11-28

New release is out: Added debugger....
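
Editor's sketch: the substitution spiff describes above ($INFO[foo] replaced by a setting value, and $$n replaced by a buffer, inside an output string) could look roughly like the Java below. This is only an illustration under stated assumptions, not ScraperEdit's or XBMC's actual code. The class name `OutputSubstitution`, the method `substitute`, the two maps standing in for settings.xml and the scraper buffers, the order of the two passes, and the choice to expand unknown names to an empty string are all hypothetical. The %20 handling raised by takoi is not covered, since the thread does not specify it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not taken from ScraperEdit or XBMC. */
final class OutputSubstitution {
    // $INFO[name]: a named string setting (settings.xml in a real scraper).
    private static final Pattern INFO = Pattern.compile("\\$INFO\\[([^\\]]+)\\]");
    // $$n: the content of numbered buffer n.
    private static final Pattern BUFFER = Pattern.compile("\\$\\$(\\d+)");

    /** Expands $INFO[name] first, then $$n (the order is an assumption). */
    static String substitute(String text, Map<String, String> settings,
                             Map<String, String> buffers) {
        return expand(BUFFER, expand(INFO, text, settings), buffers);
    }

    private static String expand(Pattern pattern, String text, Map<String, String> values) {
        Matcher m = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Assumption: an unknown setting or buffer expands to an empty string.
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<String, String>();
        settings.put("language", "en");
        Map<String, String> buffers = new HashMap<String, String>();
        buffers.put("4", "1999");
        // Back-references such as \1 are left alone; they belong to the regexp step.
        System.out.println(substitute("query=\\1&year=$$4&language=$INFO[language]",
                settings, buffers));
    }
}
```

Buffer numbers are kept as string keys here only so one helper can serve both patterns; a real implementation would more likely index an array of buffers.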

RE: ScraperEdit for XBMC (Java) - Daniel Malmgren - 2012-11-28

I don't know how much is finished and supposed to work (didn't find any list of known issues), but I've got some things to report:

Not much of the scraper details are fetched from my scraper. It shows framework version and date, the rest of the fields show up empty.
The Check Scraper function doesn't work at all, guess it's wip?
The Scraper Tester doesn't work, when trying to test anything at all I get [SEVERE] hu.yvs.xbmc.xml.addon.scraper.Function cannot be cast to javax.xml.bind.JAXBElement
The condition for regexps is always greyed out.

With that said, this is a nice project. I like editing my stuff in a text editor, but it's easier to look at it in a gui like this without all the html encoding :-)

/Daniel

RE: ScraperEdit for XBMC (Java) - flobbes - 2012-11-28

I like it a lot as well. Finally a scraper editor that runs under linux. The scraper tester isn't working for me either. Keep up the good work and thanks for the program so far!

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2012-11-29

(2012-11-28, 21:35)Daniel Malmgren Wrote: I don't know how much is finished and supposed to work (didn't find any list of known issues), but I've got some things to report:
It is in Alpha/Pre-Beta state. There is no list of known issues.

(2012-11-28, 21:35)Daniel Malmgren Wrote: Not much of the scraper details are fetched from my scraper. It shows framework version and date, the rest of the fields shows up empty.
Could You provide me with such a scraper? My Scrapers and the XBMC core Scrapers work well.

(2012-11-28, 21:35)Daniel Malmgren Wrote: The Check Scraper function doesn't work at all, guess it's wip?
It had some bugs, I corrected them (all, I hope).

(2012-11-28, 21:35)Daniel Malmgren Wrote: The Scraper Tester doesn't work, when trying to test anything at all I get [SEVERE] hu.yvs.xbmc.xml.addon.scraper.Function cannot be cast to javax.xml.bind.JAXBElement
This error is not related to the Tester.
It happened because You opened a Library instead of a Scraper. From the next release (v 0.1.2-50 and on), ScraperEdit supports Libraries, too.

(2012-11-28, 21:35)Daniel Malmgren Wrote: The condition for regexp's are always greyed out.
Yes, this is because I did not find any documentation about its function or use. I just saw it in ScraperEditor (the .Net/Mono app).

(2012-11-28, 21:35)Daniel Malmgren Wrote: With that said, this is a nice project. I like editing my stuff in text editor, but it's easier to look at it in a gui like this without all the html encoding :-)
Thank You!

RE: ScraperEdit for XBMC (Java) - opperpanter - 2012-12-15

Looks like a nice tool. I was trying to open imdb.xml from the xbmc scraper, but nothing is shown in ScraperEdit. Is this because imdb.xml uses other functions/names compared to the default ones like <NfoUrl> etc?

Here's the file I am talking about: https://github.com/akuiraz/xbmc-official-scrapers/blob/eden/metadata.common.imdb.com/imdb.xml

EDIT: never mind. I hadn't downloaded the file correctly from github (save as doesn't work when browsing source).

RE: ScraperEdit for XBMC (Java) - daytooner - 2012-12-17

I'm using this on a linux box (Fedora 17 - 64 bit). Some of the gui stuff doesn't seem to be working quite right. In particular, in the tester/debugger, if I click on any of the number buttons (I assume they are for the variables), then the entire app freezes. In the java terminal output, there is this message:

Quote: at hu.yvs.xbmc.scraper.tester.ScraperDebugger.scrape(ScraperDebugger.java:122)

and a gray box is left in the center of the screen. The only way to get rid of it, and the frozen app, is to kill the java process.

I don't know if the problem is with the version of java I am using - it is Iced-Tea, not the official version from Oracle. If needed, I can install that version of java and try it.

This is a great utility. Makes life just that much easier. Thx.
ken

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2012-12-20

(2012-12-17, 04:44)daytooner Wrote: In particular, in the tester/debugger, if I click on any of the number buttons (I assume they are for the variables), then the entire app freezes. In the java terminal output, there is this message: ...
Thank You! Yes, the numbered buttons should display the content of the variables. The problem seems to be in the threading of the different virtual machines used by the different Java distros... (I use Sun/Oracle JDK on Windows 32)

I put in a "hack" that I think should correct this problem. Try v0.1.2-51, and please report back whether it helped or not.

PS: Use [code] instead of [quote] for long listings, please.

RE: ScraperEdit for XBMC (Java) - beamer145 - 2013-01-18

(2012-11-29, 00:59)UsagiYojimbo Wrote:
(2012-11-28, 21:35)Daniel Malmgren Wrote: The Scraper Tester doesn't work, when trying to test anything at all I get [SEVERE] hu.yvs.xbmc.xml.addon.scraper.Function cannot be cast to javax.xml.bind.JAXBElement
This error is not related to the Tester. It happened because You opened a Library instead of a Scraper. From the next release (v 0.1.2-50 and on), ScraperEdit supports Libraries, too.

I just tried opening and debugging tmdb.xml with version 0.1.2-51 and I still have this problem (windows 7 x64). But I have not played with this tool before, so maybe I am doing something wrong:

[1] I just open tmdb.xml
[2] Select Scraper in the tree (or should I select an individual function?)
[3] I select Tools/Scraper debugger
[4] I enter something in the 'title' field (not really sure what this field is meant for, e.g. is this what will be passed in $$1 for CreateSearchUrl? (but what about $$2?), should it be a filename, a complete path, something else?)
[5] I select debug
[6] I get Initializing/Debugger Setup/Some checks and then the error.

Edit: I also do not know what you mean by the concept of 'library' here

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2013-01-21

(2013-01-18, 01:47)beamer145 Wrote: I just tried opening and debugging tmdb.xml with version 0.1.2-51 and I still have this problem (windows 7 x64)
A Library is a scraper file that has no <scraper> tag, but a <scraperfunctions> tag instead. This tag has no attributes, but can contain any functions, just like the <scraper> tag can.

Which "tmdb.xml" did you use? Is it "metadata.common.themoviedb.org/tmdb.xml" (library) or "metadata.themoviedb.org/tmdb.xml" (scraper)?

What Java version are you using?

RE: ScraperEdit for XBMC (Java) - beamer145 - 2013-01-21

It was the scraper ("metadata.themoviedb.org/tmdb.xml"). After upgrading to jdk-7u11 the problem seems solved. Thanks for the tip!

Other remarks/questions after a quick try:
- Unfortunately at the moment XBMC does a ToLower on the filename string before passing it to the scraper in $$1 for CreateSearchUrl (cs only influences the regexp, not the input string). Your app leaves the case intact.
- When I select my CreateSearchUrl and run Scrape or Debug, it seems to stop after my second nested regexp. No errors are reported and the dest buffer of this second regexp is not yet filled in (button remains grayed out). I am not sure what is going on, I suppose it should have continued all the way?

If you want to try for yourself, replace the CreateSearchUrl in the tmdb.xml scraper by the thing below. It stops after <RegExp input="$$1" output="\1\2" dest="5">, and 5 is not filled in....
Code:
<CreateSearchUrl dest="3">
  <RegExp input="$$9" output="<url>http://api.themoviedb.org/3/search/movie?api_key=57983e31fb435df4df77afb854740ea9&amp;query=\1&amp;year=$$4&amp;language=$INFO[language]</url>" dest="3">
    <RegExp input="$$1" output="\1" dest="4">
      <expression noclean="1" clear="yes">.*_([0-9][0-9][0-9][0-9])</expression>
    </RegExp>
    <RegExp input="$$1" output="\1\2" dest="5">
      <expression cs="yes" noclean="1" clear="yes">(.*)_[0-9][0-9][0-9][0-9]|(.*)</expression>
    </RegExp>
    <!-- replace underscores by spaces -->
    <RegExp input="$$5" output="\1%20" dest="6">
      <expression cs="yes" noclean="1,2,3,4,5,6,7,8,9" repeat="yes" clear="yes">([^_]*)_*</expression>
    </RegExp>
    <!-- Split eg Bx,DF,900,Conqu,F,500,Est, ,Of,PQR,Qaradi,PPP,Pse,BFG,900. Remark: explanation of the extra _ in next step -->
    <RegExp input="$$6" output="\1 \2 \3_\4 \5_\6 \7_" dest="7">
      <expression cs="yes" noclean="1,2,3,4,5,6,7,8,9" repeat="yes" clear="yes">([a-z]*)([A-Z]*)([A-Z][a-z])|([a-z]+)([A-Z]+)|([A-Za-z]*)([^A-Za-z]*)</expression>
    </RegExp>
    <!-- In the previous step, only one of the subparts (x, y and z in x|y|z) of the regexp matched;
         the 2 others will have remained empty and introduced " _" after the new string.
         With the extra _ we can reliably detect the unwanted spaces and remove them, because the results
         of each run should be glued together and not separated by spaces resulting from the subparts
         that did not match.
         Note that there were no _ when we started because we stripped all of them already, so it is safe
         to reintroduce them.
         Note: aarrgg each of the submatches (eg \1) is also stripped of leading/trailing whitespace before
         it is put in the output buffer (spaces which we need to preserve here), but luckily this can be
         disabled with noclean.
    -->
    <RegExp input="$$7" output="\1\2" dest="8">
      <expression cs="yes" noclean="1,2,3,4,5,6,7,8,9" repeat="yes" clear="yes">([^_]*) _|([^_]*)_</expression>
    </RegExp>
    <!-- Replace remaining spaces by %20 for url compatibility -->
    <RegExp input="$$8" output="\1%20" dest="9">
      <expression cs="yes" noclean="8" repeat="yes" clear="yes">[ ]*([^ ]+)</expression>
    </RegExp>
    <expression noclean="9" />
  </RegExp>
</CreateSearchUrl>

(This one converts MovieNameInCamelCase_YEAR formatted folders to %20 separated words. It works, but unfortunately you need to rebuild XBMC with the unwanted ToLower operation on the file name commented out for it to work. This is not a problem for your scraper debugger.)

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2013-01-23

(2013-01-21, 21:30)beamer145 Wrote: After upgrading to jdk-7u11 the problem seems solved.
What Java version did you use before the upgrade?

Yes, XBMC cleans up the media filename to create a search string from it. (It removes words like DVD, DC, Rip, DivX, etc., and converts to lower case.) ScraperEdit does nothing like this, deliberately.

RE: ScraperEdit for XBMC (Java) - UsagiYojimbo - 2013-01-28

(2013-01-23, 10:46)UsagiYojimbo Wrote:
(2013-01-21, 21:30)beamer145 Wrote: After upgrading to jdk-7u11 the problem seems solved.
What Java version did you use before the upgrade?

Just ran a test on the JRE of JDK 1.6 (Win, 64 bits), and got the problem.
Code:
java version "1.6.0_21"

I would check this incompatibility issue between JREs 1.6 and 1.7...
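
Editor's sketch: earlier in the thread a Library was described as a scraper file whose root is <scraperfunctions> rather than <scraper>. A tool could tell the two apart by looking at the root element before deciding how to load the file. The Java below is a minimal sketch of that check using the JDK's DOM parser. The class `ScraperFileKind`, the enum `Kind` and the method `classify` are hypothetical names invented for this illustration. It does not show how ScraperEdit actually loads files (the thread only tells us it uses JAXB), and it has no bearing on the JRE 1.6 / 1.7 difference discussed above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical helper; not part of ScraperEdit. */
final class ScraperFileKind {
    enum Kind { SCRAPER, LIBRARY, UNKNOWN }

    /** Classifies scraper XML text by the name of its root element. */
    static Kind classify(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            String root = doc.getDocumentElement().getNodeName();
            if ("scraper".equals(root)) {
                return Kind.SCRAPER;          // regular scraper file
            }
            if ("scraperfunctions".equals(root)) {
                return Kind.LIBRARY;          // shared function library
            }
            return Kind.UNKNOWN;              // some other XML document
        } catch (Exception e) {
            // Malformed XML or parser setup failure; reported as an unchecked exception.
            throw new IllegalArgumentException("Not parseable as XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("<scraperfunctions><GetTrailer dest=\"5\"/></scraperfunctions>"));
    }
}
```

Checking the root element up front would let an editor report "this is a library" instead of surfacing a cast error from deeper in the loading code.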