Last edited on 1997-12-30 03:41:48 by stolfi
This map shows the relative frequency of occurrences of certain word patterns in each block of text. Listed here are all patterns with at least 10 occurrences in the whole sample text. Further explanations can be found on the introduction page.
The relative frequency of a pattern w in a block B is approximately the count N(w,B) of occurrences of w in B, divided by the total occurrence count N(w) of w in the text, and scaled to the range [0..99]. A "." is printed if the pattern does not occur in the block.
More precisely, the relative frequency is computed as 99*(N(w,B)+1)/(N(w)+M) where M is the number of blocks (32). This formula attempts to reduce the effects of sampling noise.
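The formula above can be sketched in Python as follows. This is only a minimal illustration, not the script used to generate the map: the function names `relative_frequency` and `map_entry` are hypothetical, and truncating the value to an integer is an assumption, since the text does not say how the result is rounded.

```python
def relative_frequency(count_in_block, total_count, num_blocks=32):
    """Smoothed relative frequency 99*(N(w,B)+1)/(N(w)+M), as a float.

    count_in_block -- N(w,B), occurrences of pattern w in block B
    total_count    -- N(w), occurrences of w in the whole text
    num_blocks     -- M, the number of blocks (32 in this map)
    """
    return 99 * (count_in_block + 1) / (total_count + num_blocks)


def map_entry(count_in_block, total_count, num_blocks=32):
    """Return the map cell for one pattern and block as a string.

    A "." marks a pattern that does not occur in the block.
    Truncation to an integer is an assumption of this sketch.
    """
    if count_in_block == 0:
        return "."
    return str(int(relative_frequency(count_in_block, total_count, num_blocks)))
```

For example, with M = 32, a pattern that occurs 10 times in the whole text and 5 times in one block gets 99*6/42, about 14.1. Because the +1 terms add up to M over all blocks, the unrounded values for a given pattern always sum to 99.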
There are three versions of the map, differing only in the way the entries are sorted:
The purpose of the last item above is to highlight the similarity between the distribution of a word with an o- or y- prefix and that of the same word without the prefix.