A couple of things. I'm a Python novice, but I think this is right.
In gui.py, this looks backwards:
Code:
if (artist.find( "\'" )) or (song.find( "\'", )):
    quote = False
else:
    quote = True
it should probably be
Code:
if ("'" in artist) or ("'" in song):
    # find() returns -1 (truthy) when there is no match, so test with "in" instead
    quote = True
else:
    quote = False
Right?
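A quick sketch of why str.find() is awkward inside an if test: it returns the index of the match, or -1 when there is no match, so "not found" is truthy and a match at position 0 is falsy. The helper name has_apostrophe is made up for illustration and is not something in gui.py.

```python
def has_apostrophe(text):
    # Hypothetical helper (not part of gui.py): True if text contains an apostrophe.
    return "'" in text

# str.find() gives an index, or -1 when nothing matches.
print("River".find("'"))         # -1, which an if test treats as true
print("'Til Tuesday".find("'"))  # 0, which an if test treats as false

artist, song = "XTC", "River of Orchids"
quote = has_apostrophe(artist) or has_apostrophe(song)
print(quote)
```

With the membership test, quote only becomes True when one of the two strings really contains an apostrophe.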
Better still, I had more luck ignoring that bit and instead modifying the lyricwiki_format function in the LyricsFetcher class to this:
Code:
def lyricwiki_format(self, text):
    titleCase = lambda value: re.sub("([a-z])'([A-Z])", lambda m: m.group(0).lower(), value.title())
    return urllib.quote(str(unicode(titleCase(text))))
I got that lambda magic from http://bugs.python.org/issue7008. It's not perfect, but it catches a lot. I also modified the get_lyrics_thread function to use only lyricwiki_format, and hopefully improved the detection of failed searches. Here's my full function:
Code:
def get_lyrics_thread(self, artist, title, quote):
    # Note: the quote argument is not used in the body any more.
    try:
        url = "http://lyricwiki.org/index.php?title=%s:%s&fmt=js" % (self.lyricwiki_format(artist), self.lyricwiki_format(title))
        song_search = urllib.urlopen(url).read()
        # This text is treated as the marker for a failed search.
        if song_search.find("Click here to start this page!") >= 0:
            error = "Lyrics not found"
            return error
        song_title = song_search.split("<title>")[1].split("</title>")[0]
        print song_title
        song_clean_title = self.unescape(song_title.replace(" Lyrics - LyricWiki - Music lyrics from songs and albums",""))
        print "Title:[" + song_clean_title+"]"
        # Fetch the edit view of the page and take the wiki source from its textarea.
        lyricpage = urllib.urlopen("http://lyricwiki.org/index.php?title=%s&action=edit" % (urllib.quote(song_clean_title),)).read()
        print ("http://lyricwiki.org/index.php?title=%s&action=edit" % (urllib.quote(song_clean_title),))
        content = re.split("<textarea[^>]*>", lyricpage)[1].split("</textarea>")[0]
        # Follow a wiki redirect to the target page.
        if content.startswith("#REDIRECT [["):
            addr = "http://lyricwiki.org/index.php?title=%s&action=edit" % urllib.quote(content.split("[[")[1].split("]]")[0])
            content = urllib.urlopen(addr).read()
        # Try the <lyrics> tag first, then fall back to <lyric>.
        try:
            lyrics = content.split("<lyrics>")[1].split("</lyrics>")[0]
        except:
            lyrics = content.split("<lyric>")[1].split("</lyric>")[0]
        return lyrics
    except:
        error = "Fetching lyrics failed"
        return error
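As an aside, the nested try/except for the two tag spellings could be pulled out into a small helper that can be tried without hitting the network. This is only a sketch: extract_lyrics is a made-up name, and unlike the function above it returns None instead of raising when neither tag is present, so the caller would have to check for that.

```python
def extract_lyrics(content):
    # Hypothetical helper: return the text between <lyrics>...</lyrics>,
    # falling back to <lyric>...</lyric>; None if neither pair is found.
    for tag in ("lyrics", "lyric"):
        open_tag = "<%s>" % tag
        close_tag = "</%s>" % tag
        start = content.find(open_tag)
        if start == -1:
            continue  # this spelling is not on the page, try the next one
        start += len(open_tag)
        end = content.find(close_tag, start)
        if end != -1:
            return content[start:end]
    return None

sample = "junk <lyric>Take a look at me now</lyric> more junk"
print(extract_lyrics(sample))
```

Checking the return value for None would give one more way to report "Lyrics not found" instead of the generic failure message.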
The modified function hasn't been tested very rigorously yet, but it's now finding lyrics that it wasn't before. Mostly it seems better because it now title-cases everything, so simple cases like "XTC:River of Orchids" are working.
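To see what the title-casing step does on its own, here is a minimal standalone sketch. It assumes Python 3, where urllib.parse.quote stands in for the urllib.quote call used in the post, and it drops the self argument and the str(unicode(...)) conversion, so it is a port for illustration rather than the code from the player.

```python
import re
from urllib.parse import quote

def title_case(value):
    # str.title() also capitalises the letter after an apostrophe ("Don'T");
    # the regex finds each lower'Upper pair and lowercases it again.
    return re.sub("([a-z])'([A-Z])", lambda m: m.group(0).lower(), value.title())

def lyricwiki_format(text):
    # Assumption: Python 3 port of the method above, as a plain function.
    return quote(title_case(text))

print(title_case("don't stop"))
print(lyricwiki_format("river of orchids"))
```

One thing to keep in mind is that str.title() lowercases the rest of each word as well, so an all-caps name like "XTC" comes out as "Xtc"; whether the site still resolves that form is not something this sketch checks.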