Measure the lexical diversity and vocabulary depth of your writing with TTR, hapax legomena, and a composite richness score.
Type-Token Ratio (TTR) is the number of unique words (types) divided by the total number of words (tokens) in your text. A TTR of 70% means the text has 70 distinct words for every 100 words written. Higher TTR indicates a more diverse vocabulary, but TTR naturally decreases as texts get longer because words inevitably repeat, so it is best compared between texts of similar length.
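As a rough illustration, TTR can be computed in a few lines of Python. This is a minimal sketch, not the tool's actual implementation: the `type_token_ratio` name and the tokenization rule (lowercase, letters with an optional internal apostrophe) are assumptions made for the example.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return unique words divided by total words, or 0.0 for empty text."""
    # Assumed tokenization: lowercase words, allowing one internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    # set() keeps one copy of each distinct word (the types).
    return len(set(tokens)) / len(tokens)
```

For "the cat sat on the mat" there are 6 tokens and 5 types ("the" repeats), so the function returns 5/6, or about 83%.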
Hapax legomena (from the Greek for "said once"; singular: hapax legomenon) are words that appear exactly once in your text. A higher proportion of hapax words suggests a richer, more varied vocabulary. In typical English prose, about 40-60% of unique words are hapax legomena.
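The hapax proportion can be sketched the same way, by counting how often each word occurs. The `hapax_ratio` name and the tokenization rule are assumptions for this example; the ratio is taken over unique words, matching the 40-60% figure above.

```python
import re
from collections import Counter


def hapax_ratio(text: str) -> float:
    """Return the share of unique words that occur exactly once."""
    # Assumed tokenization: lowercase words, allowing one internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = [word for word, n in counts.items() if n == 1]
    return len(hapaxes) / len(counts)
```

In "the cat sat on the mat" there are 5 unique words, and 4 of them ("cat", "sat", "on", "mat") occur once, so the ratio is 0.8.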
Lexical density is the ratio of content words (nouns, verbs, adjectives, adverbs) to total words. Function words such as "the", "is", and "and" do not count as content words. Academic writing typically has higher lexical density (55-65%), while casual speech is lower (40-50%).
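A precise lexical density measure needs part-of-speech tagging. The sketch below approximates it by treating every word outside a small function-word list as a content word. Both the `FUNCTION_WORDS` list and the `lexical_density` name are hypothetical, and the list is deliberately short, so results will overestimate density on real text.

```python
import re

# Hypothetical, deliberately small function-word list; a real analyzer
# would use a fuller stop list or a part-of-speech tagger.
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "if", "of", "in", "on", "at",
    "to", "for", "with", "by", "from", "is", "are", "was", "were", "be",
    "been", "am", "it", "this", "that", "these", "those", "i", "you",
    "he", "she", "we", "they", "not", "as", "do", "does", "did",
    "have", "has", "had",
})


def lexical_density(text: str) -> float:
    """Return content words divided by total words, or 0.0 for empty text."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)
```

For "The cat is on the mat" only "cat" and "mat" are content words, giving 2 out of 6 tokens, or about 33%.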
Read widely across different genres and subjects. Use a thesaurus to find synonyms for commonly repeated words. Practice writing with word variety in mind. Study domain-specific vocabulary for your topic. Avoid filler words and cliches that add no new meaning.
The composite vocabulary score (0-100) is a weighted blend of TTR (40%), hapax legomena ratio (30%), and lexical density (30%). The scale accounts for text length, as longer texts naturally have lower TTR. The result is normalized and mapped to four categories: Basic, Standard, Rich, and Very Rich.
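The weighted blend can be illustrated as follows. Only the weights (40/30/30) and the four category names come from the description above. The text-length adjustment is not specified here, so this sketch omits it, and the category cutoffs (25, 50, 75) are hypothetical values chosen purely for illustration.

```python
# Weights as described: TTR 40%, hapax ratio 30%, lexical density 30%.
WEIGHTS = {"ttr": 0.40, "hapax": 0.30, "density": 0.30}

# Hypothetical cutoffs; the real category boundaries are not documented here.
CATEGORY_CUTOFFS = [(25.0, "Basic"), (50.0, "Standard"), (75.0, "Rich")]


def composite_score(ttr: float, hapax: float, density: float) -> float:
    """Blend three ratios (each in 0..1) into a 0-100 score.

    No text-length normalization is applied in this sketch.
    """
    for value in (ttr, hapax, density):
        if not 0.0 <= value <= 1.0:
            raise ValueError("each input must be a ratio between 0 and 1")
    blend = (WEIGHTS["ttr"] * ttr
             + WEIGHTS["hapax"] * hapax
             + WEIGHTS["density"] * density)
    return 100.0 * blend


def category(score: float) -> str:
    """Map a 0-100 score to a label using the assumed cutoffs."""
    for upper, label in CATEGORY_CUTOFFS:
        if score < upper:
            return label
    return "Very Rich"
```

With a TTR of 70%, a hapax ratio of 50%, and a lexical density of 60%, the unadjusted blend is 0.4 × 70 + 0.3 × 50 + 0.3 × 60 = 28 + 15 + 18 = 61, which falls in "Rich" under the assumed cutoffs.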