
kimp93 · 2009-02-27, 11:58

I have a problem matching a Unicode (Korean) string that is surrounded by lots of tabs and spaces.

Code:
<strong>등급</strong></dt>

<dd>

                                                                                                                                         청소년관람불가(한국)            </dd>

What I am trying to get is the words between <dd> and </dd>.

Code:
<RegExp input="$$7" output="&lt;mpaa&gt;\1&lt;/mpaa&gt;" dest="8+">

     <RegExp input="$$1" output="\1" dest="7">

               <expression noclean="1">&lt;strong&gt;등급&lt;/strong&gt;&lt;/dt&gt;[^&gt;]*&gt;(.[^&lt;]*)&lt;/dd&gt;</expression>

     </RegExp>

     <expression trim="1"></expression>

</RegExp>

With this, I can get whatever is between <dd> and </dd>.
The problem is that I cannot get rid of the whitespace around the words.

I tried it without "noclean", with "trim", and with /s and /t, none of which helps.
If I use /b, it gets rid of the whole string. The regex engine does not seem to support /p. I looked at PCRE, and it says that support for /p is optional.

Please guide me on this.

kimp93 · 2009-03-06, 00:30

Never mind. I solved the problem.
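
The thread does not record what the fix was. One common way to drop the padding is to let \s* consume the whitespace on both sides of the capture group and make the group itself lazy, so it ends at the last non-space character. The sketch below only illustrates that idea with Python's re module, not with the scraper's own regex engine; the sample HTML is modeled on the snippet in the first post with the whitespace shortened, and the names extract_rating and PATTERN are made up for this example.

Code:
```python
import re

# Sample HTML modeled on the snippet in the first post (padding shortened).
html = (
    "<strong>등급</strong></dt>\n"
    "\n"
    "<dd>\n"
    "\n"
    "\t\t          청소년관람불가(한국)            </dd>"
)

# \s* outside the group swallows tabs, spaces and newlines on both sides.
# The lazy [^<]*? grows only as far as needed, so trailing spaces are left
# for the second \s* instead of ending up inside the capture.
PATTERN = re.compile(r"<strong>등급</strong></dt>\s*<dd>\s*([^<]*?)\s*</dd>")


def extract_rating(page):
    """Return the text between <dd> and </dd> without padding, or None."""
    match = PATTERN.search(page)
    return match.group(1) if match else None


rating = extract_rating(html)
print(rating)
```

In a scraper XML file the same idea would presumably be written with escaped brackets, for example &lt;dd&gt;\s*([^&lt;]*?)\s*&lt;/dd&gt;. That assumes the engine accepts \s and lazy quantifiers, which PCRE does, but it has not been tested against the scraper itself.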