Level 21

Moderators: UncleTimmy, mjpieters

<<

UncleTimmy

Posts: 118

Joined: Thu May 05, 2005 3:44 pm

Post Wed May 11, 2005 10:30 pm

Re: Chi-square

jtauber wrote:zip, gzip and bzip2 all fail to compress it. That's pretty tightly packed :-)


Bingo! Compression and macro statistical randomness go together like Python and joy :wink:. I have to quit for the night, so it's up to you to examine the 500 places this might lead :wink:.
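For anyone who wants to repeat that compressibility check, here is a minimal sketch, not the code anyone in this thread actually ran. It assumes the standard library `zlib` and `bz2` modules as stand-ins for gzip and bzip2, and `compression_ratios` is a made-up helper name. Bytes from `os.urandom` stand in for the level's file, which isn't reproduced here.

```python
import bz2
import os
import zlib


def compression_ratios(data: bytes) -> dict:
    """Return compressed_size / original_size for two stdlib codecs."""
    return {
        "zlib": len(zlib.compress(data, 9)) / len(data),
        "bz2": len(bz2.compress(data, 9)) / len(data),
    }


# Stand-in inputs (assumptions for illustration, not the puzzle data).
random_like = os.urandom(100_000)   # should not shrink
repetitive = b"python" * 20_000     # should shrink a lot

print(compression_ratios(random_like))
print(compression_ratios(repetitive))
```

A ratio at or slightly above 1.0 means the codec found nothing to squeeze, which is what you'd expect from data that is already compressed or otherwise looks random.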
<<

dfsmith

Posts: 21

Joined: Fri May 06, 2005 10:50 pm

Location: Silicon Valley

Post Wed May 11, 2005 10:33 pm

Re: Chi-square

UncleTimmy wrote:The chi-square calc is fine -- it's about 304 with 255 d.f. This is larger than the mean, but about 1.8% of truly random sequences would exceed that chi-square stat.


I see your problem: there is only 1 d.f. (is it a 255 or not a 255).
This gets me a chi2 of 24.0 (23.9+0.1); i.e., about 1:1000000. (Not far off my quick calculation.)
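A rough sketch of that two-bin arithmetic, using hypothetical names (`two_bin_chi2`, `tail_1df`) and made-up counts, since the real byte counts aren't quoted in this thread. For one degree of freedom the upper-tail probability is exactly erfc(sqrt(x/2)), so the standard library is enough.

```python
import math


def two_bin_chi2(n, k, p=1 / 256):
    """Chi-square terms for 'is it a 255 or not': n bytes total, k of them 0xff."""
    expected_hit = n * p
    expected_miss = n * (1 - p)
    hit_term = (k - expected_hit) ** 2 / expected_hit
    miss_term = ((n - k) - expected_miss) ** 2 / expected_miss
    return hit_term, miss_term


def tail_1df(x):
    """P(chi-square with 1 d.f. > x)."""
    return math.erfc(math.sqrt(x / 2))


# Made-up counts for illustration only (not the level's real numbers).
hit, miss = two_bin_chi2(n=2560, k=0)
print(hit, miss, hit + miss)

# Tail probability for the chi2 of 24.0 quoted above.
print(tail_1df(24.0))
```

The "not a 255" term is always 1/255 of the "is a 255" term, which is consistent with the 23.9 + 0.1 split.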
<<

jtauber

Posts: 47

Joined: Sun May 08, 2005 10:20 pm

Post Wed May 11, 2005 10:44 pm

Re: Chi-square

UncleTimmy wrote:I have to quit for the night, so it's up to you to examine the 500 places this might lead :wink:.


Actually, it might only lead to 7 places. Or one place 7 times :wink:
<<

UncleTimmy

Posts: 118

Joined: Thu May 05, 2005 3:44 pm

Post Wed May 11, 2005 10:55 pm

Re: Chi-square

dfsmith wrote:
UncleTimmy wrote:The chi-square calc is fine -- it's about 304 with 255 d.f. This is larger than the mean, but about 1.8% of truly random sequences would exceed that chi-square stat.


I see your problem: there is only 1 d.f. (is it a 255 or not a 255).
This gets me a chi2 of 24.0 (23.9+0.1); i.e., about 1:1000000. (Not far off my quick calculation.)


Goodness no. This is a 256-bin problem. I never cared about 0xff specifically; that's just one of the 256 bins being analyzed simultaneously. The expected count in each bin (if the distribution across bins were uniformly random) is

  Code:
e = len(thefile)/256.0


The chi-square stat is
  Code:
sum((bincount[i] - e)**2 / e for i in range(256))


If this isn't clear, look on the web for explanations of how to use chi-square to test, e.g., whether dice are loaded. Then picture a 256-side die :wink:
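Putting the two snippets above together, a self-contained sketch might look like the following. The names `chi_square_256` and `approx_tail` are made up for illustration, and the tail probability uses the Wilson-Hilferty normal approximation, which is a substitute for a proper chi-square CDF and not necessarily how the 1.8% figure was obtained. Bytes from `os.urandom` stand in for the level's file.

```python
import math
import os
from collections import Counter


def chi_square_256(thefile: bytes) -> float:
    """256-bin chi-square stat against a uniform byte distribution.

    Assumes thefile is non-empty.
    """
    e = len(thefile) / 256.0          # expected count per bin
    bincount = Counter(thefile)       # byte values never seen count as 0
    return sum((bincount[i] - e) ** 2 / e for i in range(256))


def approx_tail(x: float, df: int) -> float:
    """Approximate P(chi-square with df d.f. > x) via Wilson-Hilferty."""
    a = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - a)) / math.sqrt(a)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Stand-in data (an assumption; the puzzle file isn't reproduced here).
sample = os.urandom(1 << 20)
stat = chi_square_256(sample)
print(stat, approx_tail(stat, 255))

# Approximate tail for a stat of 304 with 255 d.f.
print(approx_tail(304, 255))
```

With 256 bins there are 255 degrees of freedom, so the mean of the stat under the uniform hypothesis is 255.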
<<

jtauber

Posts: 47

Joined: Sun May 08, 2005 10:20 pm

Post Wed May 11, 2005 10:59 pm

One way then another way then the first way then the second way. Back and forth. 7 times then 4 times then 11 times then 4 times...

After 6 of these pairs I end up with something I can no longer do anything to. Unless I do it b...
<<

UncleTimmy

Posts: 118

Joined: Thu May 05, 2005 3:44 pm

Post Wed May 11, 2005 11:05 pm

Re: Chi-square

dfsmith wrote:I see your problem: there is only 1 d.f. (is it a 255 or not a 255) ...


Ah! I think I see the real :wink: problem here. My original "chi square" message was not a response to your message about the distribution of 255, but you thought it was. It looks like we were writing those messages at about the same time, and I didn't see yours until after I posted mine. So I haven't been making any claims about your analysis of the distribution of 255. I'm making an independent claim of a different kind. The oddball rarity of 255 specifically does indeed boost the 256-bin chi-square stat, but not enough to make the overall distribution extremely unlikely.
<<

jtauber

Posts: 47

Joined: Sun May 08, 2005 10:20 pm

Post Wed May 11, 2005 11:17 pm

jtauber wrote:One way then another way then the first way then the second way. Back and forth. 7 times then 4 times then 11 times then 4 times...

After 6 of these pairs I end up with something I can no longer do anything to. Unless I do it b...


Forwards and backwards, forwards and backwards. One way then another way then the first way then the second way.

It was exhausting but I finally hit the bottom.
<<

dfsmith

Posts: 21

Joined: Fri May 06, 2005 10:50 pm

Location: Silicon Valley

Post Wed May 11, 2005 11:20 pm

Re: Chi-square

UncleTimmy wrote:but not enough to make the overall distribution extremely unlikely.


...unless of course, all the other values are slightly closer to the expected than they ought to be. (The problem with chi square---it's a sum, and downplays outliers. It would be fun to try a geometric sum.) Anyway, this has been a fun rant! Let's have another one at level 22.
<<

johnlr

Posts: 41

Joined: Thu May 05, 2005 4:17 pm

Location: kiwi in aussie

Post Wed May 11, 2005 11:23 pm

This file is much bigger than any others we have seen; even the sound file was much smaller. I think it's too big to be a pictoanswer... maybe it's a steno and the answer is just a small part
:?
<<

jtauber

Posts: 47

Joined: Sun May 08, 2005 10:20 pm

Post Wed May 11, 2005 11:25 pm

btw, i think it's obvious what game he played as a kid.
<<

UncleTimmy

Posts: 118

Joined: Thu May 05, 2005 3:44 pm

Post Wed May 11, 2005 11:34 pm

jtauber wrote:btw, i think it's obvious what game he played as a kid.


Me too -- although it was obvious long before this level :wink:.
<<

jtauber

Posts: 47

Joined: Sun May 08, 2005 10:20 pm

Post Wed May 11, 2005 11:36 pm

so, is anyone else stuck on where to look for their l__s?
<<

dfsmith

Posts: 21

Joined: Fri May 06, 2005 10:50 pm

Location: Silicon Valley

Post Thu May 12, 2005 2:02 am

Dang. Didn't take any l__s. Have to go make some again.
<<

jtauber

Posts: 47

Joined: Sun May 08, 2005 10:20 pm

Post Thu May 12, 2005 2:05 am

dfsmith wrote:Dang. Didn't take any l__s. Have to go make some again.


If you're talking about what I think you're talking about, I took those but am not sure what the pattern is or how to use it.
<<

jtauber

Posts: 47

Joined: Sun May 08, 2005 10:20 pm

Post Thu May 12, 2005 2:12 am

Not sure if it's the sequence [7, 7, 12...]
or the more detailed [[(7, 4), (11, 4), (7, 9)...], ...]