Ok, I've scraped all the questions from the movie quizzes and cleaned them up a tiny bit. You can download the CSVs at
http://rapidshare.com/files/215030762/TriviaCSV.zip .
There are three different files:
trivia_master_clean.csv - This file has double quotes and will load correctly in Excel.
trivia_master_SingleQuotes.csv - This file has the double quotes replaced with single quotes. Rows that contain a comma are still enclosed in double quotes, so the file should still load fine in Excel.
trivia_master_noQuotes.csv - This file contains no double quotes. (Not sure this one will be useful at all, though, since rows that contain commas will mess things up.)
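To show why the no-quotes file is risky, here is a rough Python sketch using only the standard csv module. The sample row and its two-column layout (question, answer) are made up for illustration and are not taken from the actual files.

```python
import csv
import io

# Hypothetical row (not from the real data): the question text contains commas.
row = ["Who said, 'Here's looking at you, kid'?", "Humphrey Bogart"]

# Quoted output: csv.writer wraps any field containing a comma in double quotes.
quoted = io.StringIO()
csv.writer(quoted).writerow(row)

# Unquoted output: fields joined with bare commas, similar to the noQuotes file.
unquoted = ",".join(row)

# Read both versions back with the default CSV reader.
parsed_quoted = next(csv.reader(io.StringIO(quoted.getvalue())))
parsed_unquoted = next(csv.reader(io.StringIO(unquoted)))

print(len(parsed_quoted))    # 2 columns, same as the original row
print(len(parsed_unquoted))  # 4 columns, the question got split at its commas
```

So if you load the no-quotes file, any row with a comma in the question will probably end up with extra columns, while the other two files should round-trip fine.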
Looks like the total came out to 854 questions, so that should keep you busy for a while. I just did a quick glance over the data to make sure everything appeared OK and lined up correctly. I haven't actually read many of the questions, so I'll leave any proofreading to you if you care to do it, although the data does look pretty clean.
As for the script, I've uploaded it here, although I'm not sure how much good it will do anyone, as it really can only scrape the data that's in the CSV. It might be more useful for people if I just included the CSV files. It's a simple .NET application using the HtmlAgilityPack (I went with .NET because that's what I'm currently using at work), with no comments and not much of an interface. I just hacked it together in some free time at work.
I've poked around the other quizzes a bit, and getting the questions isn't hard; it's getting the answers that looks tough. It looks like the page makes an AJAX call to their server to return the correct answer. I might mess around with it some more later if I get time, but I think the 854 questions we've got so far are a good start.
Let me know if there's anything else I can do to help. I like figuring out how to get the data; it's all the proofing I find a bit tedious.