Hi Spiff,
this is not so easy for me, ouch...
In the GetDetails function of the scraper I have:
Code:
<RegExp input="$$2" output="<url function="GetIMDBPlot">$$3plotsummary</url>" dest="5+">
    <expression/>
</RegExp>
This runs fine and calls my IMDB function (in the common directory):
Code:
<?xml version="1.0" encoding="utf-8"?>
<scraper framework="1,1" date="2010-06-12" name="IMDB Functions" content="movies" language="de">
    <include>ofdb_de.xml</include>
    <GetIMDBPlot dest="5">
        <RegExp input="$$3" output="<details>\1</details>" dest="5">
            <RegExp input="$$1" output="\1" dest="2">
                <expression clear="yes"><div id="swiki.2.1">\n\n([^\n]+)</expression>
            </RegExp>
            <RegExp input="$$2" output="<plot>\1</plot>" dest="3">
                <expression>(.+)</expression>
            </RegExp>
            <RegExp conditional="getofdbplot" input="$$1" output="\1" dest="4">
                <expression><link rel="canonical" href="http://www.imdb.de/title/([t0-9]*)</expression>
            </RegExp>
            <RegExp conditional="getofdbplot" input="$$2" output="<url function="GetOFDBURL">http://www.imdb.de/title/$$4/</url>" dest="3">
                <expression>^$</expression>
            </RegExp>
            <expression noclean="1"/>
        </RegExp>
    </GetIMDBPlot>
</scraper>
OK, if no plot is found there, the scraper should fall back to the OFDB site:
Code:
<?xml version="1.0" encoding="utf-8"?>
<scraper framework="11" date="2010-06-12" name="OFDB Functions" content="movies" language="de">
    <GetOFDBURL dest="5">
        <!--<url function="GetOFDBLink">http://www.ofdb.de/view.php?SText=\1&Kat=IMDb&page=suchergebnis</url>-->
        <RegExp input="$$1" output="<plot>OFDB Function</plot>" dest="5">
            <expression><link rel="canonical" href="http://www.imdb.de/title/([t0-9]*)</expression>
        </RegExp>
    </GetOFDBURL>
    <GetOFDBLink dest="5">
        <RegExp input="$$1" output="<url function="GetOFDBOutTagline">http://www.ofdb.de/\1</url>" dest="5">
            <expression><br>1. <a href=".*?([^"]+)</expression>
        </RegExp>
    </GetOFDBLink>
    <GetOFDBOutTagline dest="5">
        <RegExp input="$$1" output="<details><outline>\1</outline><tagline>\1</tagline><plot>\1</plot></details>" dest="5">
            <expression><b>Inhalt:</b>([^<]+)</expression>
        </RegExp>
        <RegExp input="$$1" output="<url function="GetOFDBPlot">http://www.ofdb.de/plot/\1</url>" dest="5+">
            <expression><a href="plot/([^"]+)</expression>
        </RegExp>
    </GetOFDBOutTagline>
    <GetOFDBPlot dest="5">
        <RegExp input="$$3" output="<details>\1</details>" dest="5+">
            <RegExp input="$$1" output="\1" dest="2">
                <expression noclean="1">Eine Inhaltsangabe von(.*)Zur &Uuml;bersichtsseite des Films</expression>
            </RegExp>
            <RegExp input="$$2" output="<plot>\1</plot>" dest="3">
                <expression noclean="1"><br>([^<]+)(?:</font>)</expression>
            </RegExp>
            <expression noclean="1"/>
        </RegExp>
    </GetOFDBPlot>
</scraper>
As far as I can see, the OFDB function has a problem or is never called. It is not a typo in the conditional attributes: if I delete them, the problem still exists.
On paper and in my head it works: one URL after the other is fetched and checked by the scraper.
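To make the "on paper" expectation concrete, here is a rough Python sketch of the buffer flow I expect from GetIMDBPlot. It is only an illustration, not how XBMC actually evaluates scrapers: the function name get_imdb_plot, the buf dictionary, the sample page and the id tt0000000 are all made up. It assumes the nested RegExp elements run in order and that an unmatched expression with clear="yes" leaves its destination buffer empty.

```python
import re


def get_imdb_plot(page: str, getofdbplot: bool = True) -> str:
    """Mimic the buffer flow expected from GetIMDBPlot (illustration only)."""
    buf = {1: page, 2: "", 3: "", 4: ""}

    # Step 1: copy the plot text into buffer 2; it stays empty if nothing matches.
    m = re.search(r'<div id="swiki.2.1">\n\n([^\n]+)', buf[1])
    buf[2] = m.group(1) if m else ""

    # Step 2: wrap a found plot in <plot> tags and store it in buffer 3.
    m = re.search(r"(.+)", buf[2])
    if m:
        buf[3] = "<plot>" + m.group(1) + "</plot>"

    if getofdbplot:  # stands in for conditional="getofdbplot"
        # Step 3: remember the IMDb id from the canonical link in buffer 4.
        m = re.search(
            r'<link rel="canonical" href="http://www.imdb.de/title/([t0-9]*)',
            buf[1],
        )
        if m:
            buf[4] = m.group(1)
        # Step 4: ^$ only matches an empty buffer 2, i.e. when no plot was found.
        if re.search(r"^$", buf[2]):
            buf[3] = (
                '<url function="GetOFDBURL">http://www.imdb.de/title/'
                + buf[4]
                + "/</url>"
            )

    # Outer RegExp: wrap whatever ended up in buffer 3 in <details>.
    return "<details>" + buf[3] + "</details>"


# Made-up sample page without a plot block: the OFDB hand-over is expected.
canonical = '<link rel="canonical" href="http://www.imdb.de/title/tt0000000/" />'
print(get_imdb_plot(canonical))
```

With a plot block present, the sketch returns only the plot element; without one, it returns the url element that should trigger GetOFDBURL. That second case is the step that never seems to happen in the real scraper.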
What is going wrong?
Regards,
Eisbahn