December 03, 2005

You're not paid to make me think!

In this house we watch a lot of movies. It's funny, really; I can't remember the last time Jacob and I watched a TV show, but movies and video games account for between 5 and 10 hours of a typical week for us. Hence, GreenCine.

GreenCine, for those not familiar with it, is a Netflix clone with a slightly different selection: more independent movies, more anime, more noir. At this point our rental queue comprises some 200 titles, half of which neither of us has actually heard of. You see, GreenCine has this interesting feature known as Lists. Any GreenCine member or staffer can put together a linked list of movies with some common feature, give it a title like Love Stories About Losers, and put it up for viewing by other members. Jacob went on a list binge when we first signed up, and added half of the Noir and Crime lists to our queue. Then he went on and did the same with Significant Early Films and Important Science Fiction. Jacob Has An Interest In Cinema.

I, on the other hand, Just Want To Be Entertained Goddamnit. It's a recurring problem with our membership; I'm already burned out on the genres that Jacob likes the most, so I greet those plasticky green envelopes in the mail with considerable skepticism. They might have likable characters and fun, engaging stories, or they might have bitter detectives and grim, serious dialogue. They might be Waking Ned Devine, or they might be Rififi. Sometimes the fear of the latter is enough to stop me from opening the envelopes at all; I leave them on the coffee table and hide in bed with a book until Jacob comes home and opens them. Or I open them, swear under my breath, and then add fifty movies from the Black Comedy list all in a rush. It might be hard-boiled from here to March, but by god, when April rolls around it's all going to be old naked Irish men on motorcycles.

If we used Netflix instead, we could have multiple queues and solve the problem easily. Then again, Netflix doesn't have 500 depressing 1950s crime movies, so Jacob wouldn't have anything to rent anyway. That's hardly helpful. I guess there's nothing for it but to go for extremes and fill the top of the list with Monty Python movies when he's not looking.

"Oh, darn, sweetie, they sent us slapstick comedy instead of moral ambiguity. Must have been some kind of mistake, but, you know, since we don't have anything better to watch... why don't you make some popcorn?"

Posted by dianna at December 3, 2005 06:51 PM
Comments

I subscribe to Netflix (although, thanks to the end of the semester and impending finals, I've been sitting on three movies for a month now). When I first joined I went on a binge, adding every movie they had that I had ever wanted to see. I wound up with a 500+ entry queue. The problem was that I had a generalized desire to watch movies, rather than a specific desire to watch a given kind of movie, but I had added the movies in moods and phases, so I started out with, say, screwball comedies, then added classics, then sci-fi, etc. Being short of attention span, I craved a variety in my viewing experience that my queue, as constituted, could not provide. And, at 500+ entries, sorting them according to the order I wanted to watch them would have been a task so enormous as to be infeasible.

The solution I hit upon was to write a macro program that, with a key stroke, would go through every text box on Netflix's queue screen, delete the number there, and insert a random number between 1 and 999 (The biggest number you can put in their text box). I thus randomized my queue, ensuring that the genres of my movies would be nicely blended. It also meant that I had no idea what would be coming next when I returned a movie, since I made up a rule that I couldn't ever look at my queue once randomized. The punishment for unlawful knowledge of the queue's ordering was re-randomization of the queue (hence the importance of having a macro that could randomize the queue at a keystroke).

It strikes me that randomizing your queue would be a way to create more parity in movie distribution. Assuming both Jacob and you have equal numbers of movies in the queue, once randomized the next movie you get from the queue is equally likely to be a Jacob movie as it is to be a Dianna movies.

Posted by: Zach S. at December 3, 2005 08:27 PM

It's kind of amazing what these kids who know how to work these "computers" can do. Zach's solution is amazingly elegant and so much better than my feeble attempts to manually rerandomize my 280-movie-long queue every time I add or delete a bunch of stuff or have to move things around. If I could figure out how to make the proper punchcard for such a thing, I'd be thrilled.

My other major problem, which there's kind of no good way around, is that I use my Netflix both for Stuff I Want To See (a fairly equitable mix of fluff that demonstrates my Desire To Be Cheaply Entertained and Films that demonstrate my Interest In Cinema) and Stuff I Gotta See For Academic Purposes. So every time I get my queue all nicely set up, I have to start bumping the stuff I want to see down to make room at the top for, say, 12 Civil War documentaries. So all the stuff I Want To See ends up sitting at the bottom of the queue, and the stuff I Need To See sits in my room, while I go over to Hollywood Video and pay a bunch of money to rent some actual DVDs I actually want to watch and can get right now.

Di, I'm sympathetic to your position & I think that Zach's right about list parity. I think you should make 3 categories: 1) Stuff Dianna wants to watch, 2) Stuff Jacob wants to watch, and 3) Stuff neither of you really wants to watch, and re-order your queue 1-2-3-1-2-3-etc. That way, you'll take turns being bored, and then every third movie you'll sync up on the experience.

Posted by: katie at December 4, 2005 03:28 PM

I have to confess, per Katie's comment, that last weekend we looked at our current GreenCine selections (Alphaville, a depressing 1960s French sci-fi movie, and Atomic Cafe, a depressing history of the atomic era through contemporaneous news and propaganda footage) and unanimously decided to go to Reel Video and rent Charlie and the Chocolate Factory and Red Dwarf Season 3 instead. At one stroke we raised our monthly expenditure on movies by 50%.

So, Zach, about this program. Can I get it from you somehow? If GreenCine's queue page works a great deal like Netflix's, will the program still work? Oh please oh please? It's already so hard-boiled around here that I've started eating myself for breakfast with a little bit of salt.

Posted by: Dianna at December 4, 2005 04:53 PM

Hum. I actually lost the program, along with everything else on my old computer, when I formated the old computer's hard drive and sold it to the Used Computer Store in Berkeley.

However: I remember the gist of how the program ran. I've been meaning to re-create the program, and I need a break from studying, so I can probably get something working tonight.

It would be helpful if you could tell me what internet browsers you have available. I absolutely know it works in Opera, and I absolutely know it doesn't work in Internet Explorer. I'm not sure about Netscape or Firefox or whatever it is the kids today are using.

I also can't guarantee it'll work in Greencine, if the queue is layed out differently than Netflix.

Actually, you can solve both problems at once if you conduct an experiment for me and report the results: In your browser, go to the Greencine queue. Select the first text box for entering the queue number. Now hit tab until you reach the next text box, and count the number of times you have to hit tab to get there. Repeat, say 10 times and make sure it's the same number of tabs each time. If it is, everything should be all right. Just tell me how many tabs it takes to move between boxes. If it requires a variable number of tabs to get to the next box (as it does with Netflix in Internet Explorer), there'll be trouble. See if any other browsers have a constant number of tabs.

Posted by: Zach S. at December 4, 2005 05:58 PM

We're a Firefox household, but unfortunately (due to the GreenCine queue layout), it takes a variable number of tabs. Almost every time it takes four, but sometimes (if a movie in the queue isn't associated with a list) it takes three. So as soon as you hit one of those non-list movies, everything would get screwed up.

Posted by: Jacob at December 4, 2005 10:17 PM

Harumph.

Alright, I downloaded Firefox and it has the same problem with the Netflix queue as Internet Explorer. I tried joining GreenCine on the free program, in the hopes that it would let me see a queue, but no dice. I've been trying to upgrade to the 1-at-a-time plan so I could get a queue, but GreenCine's server is being silly and refusing to process certain requests (credit card billing and movie database searches, from what I've seen).

So: I have a hunch that the macro would work with GreenCine, but only if you use the Opera browser. The key is that, whereas Internet Explorer and Firefox go to the next textbox, button, or link when you hit Tab, Opera goes to the next textbox or button. So, if the GreenCine queue layout is analagous to Netflix's, the macro should work if you're using the Opera browser (available at www.opera.com)

I will try again with Greencine tomorrow and see if I can't get a version of the macro tweaked to work with it.

Posted by: Zach S. at December 4, 2005 10:54 PM

Program done... works in GreenCine... must use Opera... Get it here. Now sleeeeeeeeeeeep....

Posted by: Zach S. at December 5, 2005 04:13 AM

Success! A blinding revelation struck my brain as I was falling to sleep last night, and it provided the key insight needed to complete Queue 2! Queue 2, while not nearly so elegant as its older sibling, works in Internet Explorer and Firefox.

Posted by: Zach S. at December 5, 2005 11:35 AM

Holy sweet bejesus. Zach, remind me to issue you programming challenges more often.

Thank you! I will rush home and try out Queue 2 as soon as I am able. Actually, maybe I'd better rush home, add more movies that I want to see, and then try out Queue 2. I'm way behind on sheer numbers here.

Posted by: Dianna at December 5, 2005 03:08 PM

You have no idea how much I want to write a long, boring monologue on the trials and travails I had in writing these programs, the interesting problems they posed and their shortcomings as currently constituted (For instance, the numbers they generate are pretty random, but I'd kinda prefer they were randomer. As it is, they'll end up preserving some of the order in your list. But getting truly random numbers would take a far more complex program than I am capable of. I have an idea how I'd do it, but no idea how to implement it. Kludgey solution: Run the randomizer, click update, and repeat a time or two and chances are any original order will be completely destroyed).

And this is probably the outward limit of my programming capability; there's a reason I came within a single point of failing CS61A.

Posted by: Zach S. at December 5, 2005 03:37 PM

Zach, I just thought I'd let you know that I've just capitalized on your helpfulness to my sister by pirating the IE version that you came up with, and this thing rocks. IE + Netflix = worked like a charm. I inadvertently saw some of my reordered queue when I went in to put the series discs back in order (can't watch episode 5 before episode 3, although Netflix itself has fucked that up on more than one occasion) and the random queue is way better than my often tightly-organized monstrosity. You're a genius!

Posted by: katie at December 14, 2005 09:08 PM

Aw, shucks. 'Tweren't nothin'. It's not as random as I'd like, and I wish it didn't do that nonsense at the end, but it's the best I could do with my meager programming skills.

When I first started Netflix, I would go to Random.org and have it generate a non-repeating sequence of numbers as long as my queue, then input the numbers manually. By the time my queue reached about 200 entries this became quite an ordeal. Moreover, I'd end up seeing which movies were where in the queue and that screwed up my whole not-knowing-what's-coming thing. So I spent an evening putting my initial version of the program together, and there you go.

Posted by: Zach S. at December 14, 2005 09:51 PM

The not-knowing-what's-coming thing is a huge benefit. My new DVD is more like a birthday present than something that I know is coming and I *groan* have to do.

I did note that your program seems to really like the number 58. Seemed like all my rentals were vying for that spot. This is where I start getting really iffy on things like probability: does it treat each number generation (in each text box) as a wholly new thing, any number between 1 and 999, regardless of what's come before? So if I got 6 58's, what're the odds of that?

I was forever messed up by a high school friend of mine who told me, when we were studying probability in math class, "Look. Don't believe what the book says, everything always has a 1 in 2 chance of happening. I mean, it does or it doesn't, right? That's 1 out of 2." I know it's wrong, but I've never been able to get rid of that idea.

Posted by: katie at December 14, 2005 11:40 PM

Yeah, I've heard that (the 1 in 2 thing). The problem, of course, is that while it's true in a vague sense, and while it would make probability and statistics classes a whole lot easier, it isn't actually very useful. "Gallup has released a prediction that either George W. Bush will be reelected, or he won't. Another finding of the study: either Ralph Nader will be elected, or he won't."

There's a more detailed explanation in the comments to the post where I put the program up, but it actually doesn't generate a number between 1 and 999; it generates a number between 1 and X, where X is the number you input for the length of your queue. This is because GreenCine ignores all new queue values that are higher than the total number of entries in your queue, so if you have 50 items in your queue and you give one of the entries a value of 51, Greencine leaves that entry exactly where it was before, even if it was, say, the first item in the queue. This isn't a problem in Netflix.

As such, you're going to get a lot of repeat numbers, because the chances of generating, say, 58 random numbers and having them all be unique are incredibly small, and they get smaller as the quantity of numbers increases.

Also, it may be that the random number generator built into the program I used tends to get 58 more than it should; I'm lead to understand that truly random number generation is really hard, so the random numbers you get from a simple computer program are merely pseudo-random.

Of course, it's also possible that you just got a run of 58s. The thing that tends to throw people off about randomness is that when each number generation is independent, as they are here, the next number is entirely independent of the last number. So it's theoretically possible to generate 58 58s and still be random. The chances of that happening are very low, but it's still possible.

Posted by: Zach S. at December 15, 2005 07:36 AM

Ooh! But the odds of getting 58 58s are exactly the same as the odds of any other given string of 58 numbers!

Wait. Now I'm confusing myself. If you're calculating the odds of getting 58 58s all in a row, do they wind up comparable to the odds of any other set of numbers (e.g. getting one each of every number between 1 and 58 in any order) or of any other sequence of numbers (e.g. the numbers 1 to 58 in order)? It must be the latter. What is that number -- 1 over 58 to the power of 58? If I'm using my little Windows calculator correctly then it's something like 5 x 10^-103. That's one over 200 googol. I think.

As I was saying, though, it's no less likely than getting your 58-movie queue back in precisely the same order as you had before, or rather, it wouldn't be any less likely if the random number generator were truly random. Whew.

Posted by: Dianna at December 15, 2005 09:12 AM

Hum. There're formulae that would help us here, but I learned them in my Statistics class 3 1/2 years ago so I don't remember them. They involved factorials.

But in any case, I know the chances of getting 58 58s are much smaller than the chances of getting each of the numbers 1-58 without repeats. Start by taking the total number of possible combinations, which as you point out above is 58 to the 58th power. Of those possible combinations, only one satisfies the criterion of being 58 58s. On the other hand, there are numerous possible sets of numbers that satisfy the criterion of being each of the numbers 1-58. To give two, you could have 1-58 in order, or you could have 58-1. There's formula for figuring out exactly how many possible combinations there are such that each of the numbes 1-58 appear only once, but I completely forget it.

So, yes, the odds of getting 58 58s are the same as getting any other exact sequence of numbers, but not the same as any other set of numbers regardless of order

Posted by: Zach S. at December 15, 2005 10:42 AM

Let's see if I remember this right...
Chance of 58 58s = 1/58^58 (5E-103)
Number of possible combinations of 1 to 58 = 58! (factorial). So the chance of getting a particular one of them is 1/58! (4E-79).

Posted by: Jacob at December 15, 2005 10:54 AM

Wait; wouldn't it be the number of combinations that satisfy the 1-58 criteria divided by the number of total combinations? That is, 58!/58^58?

Posted by: Zach S. at December 15, 2005 11:05 AM

Hmmm.... It depends on what we're looking for.

1/58! is the chance of picking a particular sequence of numbers out of a set of sequences in which all the integers 1-58 are found but not repeated.

58!/58^58 is the chance of a random number generator spitting out any sequence of numbers 1-58 that don't repeat themselves (there are 58! such sequences out of 58^58 possible).

1/58^58 is the chance of a random number generator picking a set of numbers 1-58 twice (guaranteed to pick one the first time, then 1/58^58 to get it the second time).

Posted by: Jacob at December 15, 2005 11:26 AM

Wait...

I'm totally entering into a math world that I don't understand, so I'm not 100% sure how to formulate this question, but:

Say there are 58 places in the queue, and the number generator is generating values between 1 and 58.

The first time it generates a value, it knows there are 58 places to fill, and that it can fill them with any value from 1 to 58. So far so good.

Now depending on how it works, it seems like it could approach the second selection in one of two ways: First, it could consider that one of the 58 places has been filled, and that there are therefore 57 places left. Let's say that it picked the value "58" the first time. If it's thinking this way, the second time, wouldn't it need to take that value off the table and pick a value from 1 to 57? (Otherwise it runs the risk of picking a value greater than the number of remaining spaces, and we know that it can't do that?)

But it's clearly not thinking that way, because it can return "58" multiple times. So it's treating every new value generation as if it's filling 1 out of the 58 total spots, not the 57 or 56 or 55 etc. remaining spots.

My question is this: if it's a factorial, wouldn't that only be true if it's thinking the first way (with a diminishing number of possible spots)? It seems like the way it's actually thinking, it's not 58!, but 58 every time (I don't know how to represent that).

Oh christ, this is what happens when humanities people try to ask math questions.

Posted by: katie at December 15, 2005 01:02 PM

Alright, here's the trick. Assuming all possibilities are equally likely, the probability of getting some value equals the number of possible outcomes in which that value is reached divided by the total number of possible outcomes.

Take rolling dice. If you have two fair dice, there's only one way for them to roll a result that adds up to 2: 1 and 1. Similarly, there's only one way to get 12: 6 and 6. However, there are 6 ways to get 7: 6 and 1, 2 and 5, 3 and 4, 4 and 3, 5 and 2, and 6 and 1.

To figure out the number of possible outcomes for a series of independent randomly generated numbers, you raise the number of possibilities for one generation by the number of trials. So for the dice, there are 6 possible outcomes for a die roll, which you raise to the power of 2, for two rolls. So 36 possibilities.

Thus, the chances of rolling 2 when you roll a pair of dice are 1/36 (one possible outcome that adds up to 2 divided by 36 total possible outcomes). The chances of rolling a 7 are 6/36 (6 possible ways of rolling a 7 divided by 36 possible outcomes).

For our problem, there are 58 to the power of 58 possible outcomes (58 possible numbers generated with each trial, raised to the power of 58 trials). So 58^58 is the denominator.

The numerator is where the factorial comes in. We want to know how many ways there are that you could generate a number between 1 and 58 58 times and get one of each number between 1 and 58. For example, one way would be 1, 2, 3, ..., 58, in order. Or it could be 1, 3, 2, 4, 5, ..., 58. So to figure out how many possible ways you could get one of each number between 1 and 58, you need to figure out how many permutations there are of the numbers 1 through 58. That is, how many ways is it possible to arrange all the numbers between 1 and 58?

For mathematical reasons I don't understand, the number of possible permutations for X numbers is X!. For instance, take the numbers 1, 2, and 3. How many ways can you arrange them?
1, 2, 3
1, 3, 2
2, 1, 3
2, 3, 1
3, 1, 2
3, 2, 1
6 ways, which is 3!. You get the same result with the numbers 1-4, 1-5, and all the way up to 1-58.

So that's where the factorial comes in. It tells us how many possible ways the number generator could spit out a set of numbers that happens to contain each of the numbers from 1-58. That tells you 1. how many ways your queue could be re-ordered in the end, and 2. gives us the numerator for the probability of the number generator giving you a set of 58 unique numbers between 1 and 58. The end probability of the latter is 58! (number of permutations of 1-58) / 58^58 (total number of possibilities the number generator could spit out).

Posted by: Zach S. at December 15, 2005 01:46 PM

So, to answer your question, that's where the factorial comes from. But you are right that 58! would be the number possible outcomes if the random numbers were not generated independently, that is, if each time a number was drawn it was thrown out of the possibilities for future draws.

Posted by: Zach S. at December 15, 2005 01:50 PM
Cementhorizon