4
Dictionary
Downsampled from larger versions
The dictionaries I am using have been downsampled from a large dictionary of approximately 25,000 words, each with a frequency greater than one, taken from a corpus of over 6 million words.  To downsample, you more or less reconstitute the corpus from the word frequencies and then resample a smaller number of words from it.  This process preserves the frequency distribution.  If this is not done, various problems arise, such as inconsistencies when the dictionary is trained and correction results that aren't comparable across dictionary sizes.
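
As a rough illustration of the reconstitute-and-resample idea, here is a minimal Python sketch. It is not the code used to build these dictionaries. The function name `downsample_dictionary`, its parameters, and the toy word counts are all hypothetical, and it assumes the dictionary is available as a plain word-to-count mapping and that sampling is done without replacement.

```python
import random
from collections import Counter


def downsample_dictionary(freqs, n_tokens, seed=0):
    """Build a smaller frequency dictionary by sampling n_tokens tokens.

    freqs maps each word to its count in the original corpus (assumed format).
    """
    # Reconstitute the corpus: one list entry per occurrence of each word.
    corpus = [word for word, count in freqs.items() for _ in range(count)]
    if n_tokens > len(corpus):
        raise ValueError("cannot sample more tokens than the corpus contains")

    # Draw tokens uniformly without replacement, so each word is picked
    # in proportion to its original frequency.
    rng = random.Random(seed)
    sample = rng.sample(corpus, n_tokens)

    # Count the sampled tokens to get the downsampled dictionary.
    return dict(Counter(sample))


if __name__ == "__main__":
    # Toy counts for illustration only, not real corpus data.
    toy = {"the": 600, "of": 300, "cat": 90, "zygote": 10}
    print(downsample_dictionary(toy, 100, seed=1))
```

Because tokens rather than dictionary entries are sampled, rare words drop out of the smaller dictionary naturally while the relative frequencies of common words stay close to the original. For a corpus of several million tokens, a flat token list still fits in memory, though a weighted draw over the counts would avoid building it.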