MUSINGS
Concerning relative vowel and consonant frequencies in the OSPD3,
and conclusions that may be drawn about rack balance therefrom.
=============================================================================

This is not a formal research piece. It is easy enough to compile
statistics on letter frequencies in the OSPD3 by computer, but drawing
useful conclusions from these statistics is another matter altogether.
With that disclaimer out of the way, let us begin.

Let us call the proportion of vowels to total letters in a given word list,
expressed as a percentage, V%. For all the words in the OSPD3, V% = 38.56;
that is, vowels make up 38.56% of all the letters in all the words in the
OSPD3. By comparison, V% for the 1254 words newly added to the OSPD3 is
45.32. The newly added words have a noticeably higher proportion of vowels
than the typical run of words already present there.

Taking V% for a supplementary list of 133,282 words longer than eight
letters, we get 41.73. A tentative conclusion is that lists of very short
words and very long words have a higher V% than lists of intermediate-length
words. This can be tested. Calculating V% for the OSPD3 by word length, we
get the following figures:

     2-letter words:  53.57
     3-letter words:  41.99
     4-letter words:  38.92
     5-letter words:  38.66
     6-letter words:  39.14
     7-letter words:  38.75
     8-letter words:  38.91
     9-letter words:  37.30
    10-letter words:  38.82
    11-letter words and above: too few words in OSPD3 to analyze
                               meaningfully.

It appears that V% does indeed settle down to a figure in the range of
about 38 to 39 for 4- to 10-letter words.

What does this mean in terms of rack balance and playing strategy in a
real-world Scrabble (tm) game? It would appear that a balanced rack
(7 letters) should have about 3 vowels and 4 consonants (43%, the closest
approach to a V% of 38-39). Of course, this is scant consolation if you have
a "balanced" rack of VWXZUUU.
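
For anyone who wants to repeat the arithmetic above on a word list of their
own, here is a minimal sketch in Python. It is not the program used to
produce the figures in this piece. The function names are made up for
illustration, and the vowel set is an assumption: the text does not say
whether Y was counted as a vowel when V% was computed, so the set is left
as a parameter.

```python
from collections import defaultdict

# Assumption: A, E, I, O, U only. Pass vowels=frozenset("aeiouy") to count Y.
VOWELS = frozenset("aeiou")


def v_percent(words, vowels=VOWELS):
    """Return vowels as a percentage of all letters in the given words."""
    words = list(words)  # allow generators; we iterate more than once
    total = sum(len(w) for w in words)
    if total == 0:
        raise ValueError("no letters to count")
    vowel_count = sum(ch in vowels for w in words for ch in w.lower())
    return 100.0 * vowel_count / total


def v_percent_by_length(words, vowels=VOWELS):
    """Return a dict mapping each word length to the V% of that group."""
    groups = defaultdict(list)
    for w in words:
        groups[len(w)].append(w)
    return {n: v_percent(ws, vowels) for n, ws in sorted(groups.items())}


def closest_vowel_count(target_pct, rack_size=7):
    """Return the vowel count whose share of the rack is nearest target_pct."""
    return min(range(rack_size + 1),
               key=lambda v: abs(100.0 * v / rack_size - target_pct))
```

Calling closest_vowel_count(38.5) returns 3, which matches the 3-vowel,
4-consonant rack suggested above: 3/7 is about 43%, while 2/7 is only
about 29%.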
The trick is having the *right* consonants and vowels; the relative
proportion is less critical.

=============================================================================

Words in the English language, and in the OSPD, are "random"* in the sense
defined by the mathematician John Casti. This means that words cannot be
reliably constructed by a formula or algorithm. For example, given the set
of consonants, C{ b, c, d, f, g, h ... } and the set of vowels,
V{ a, e, i, o, u, y }, try to find a method of creating English words, say
by taking 3 from set C and 2 from set V. This approximates the V% found
above. Most of the "words" formed by trial and error under this 3-to-2 rule
will be strings of letters not found in any English language dictionary,
nor in the OSPD3: non-words, in other words.

Casti defines a "random" number as a real number whose shortest
representation is itself. By the same token, I would say a "random"* word
is likewise one whose simplest representation is itself. Therefore, =all=
the words in the English language, and the OSPD, are "random". There is no
mathematical formula for constructing words in any spoken / written
language. This gives natural human languages their richness, complexity,
diversity, and unpredictability. Ain't language wonderful, hon!

footnote:
--------
*You could also make a case that words are "chaotic" rather than "random",
that is, falling into a pattern, but not one that is predictable or
computable.

=============================================================================

Addendum:
A couple of interesting "imbalanced" vowel / consonant word lists.
By increasing length, champion "imbalanced" words:

HIGH-CONSONANT LIST
    crwth
    crwths
    tsktsks
    borschts
    strengths
    throttling
    abstractest
    backdropping
    scratchbrush (not in OSPD3)

HIGH-VOWEL LIST
    aalii
    euouae (not in OSPD3)
    yautia
    ouguiya
    aboideau
    zoogloeae
    homoiousia (not in OSPD3)
    squeegeeing
    housesitting

=============================================================================

Scrabble and OSPD are trademarks of the Milton Bradley Co., Inc.

The above musings are the product of the demented mind of the author of the
SCRABLST, WAK, and WORDY packages.

M\Cooper
PO Box 237
St. David, AZ 85630-0237
------------------------------------------------
E-mail: thegrendel@theriver.com
Web:    http://personal.riverusers.com/~thegrendel/