
Letter Frequency in English

The reason ETAOIN SHRDLU is a phrase at all.


In 1938, Alfred Butts sat down with the front page of the New York Times and counted every letter. His count gave him the tile distribution for Scrabble (12 E's, 4 S's, 1 Q) and inadvertently produced one of the most cited letter-frequency tables in English.

Letter frequency isn't just historical trivia. It's the backbone of Scrabble strategy, Wordle openers, cryptogram solving, keyboard layouts and even natural-language compression. This guide covers the data, the different ways to measure it, and the practical uses.

The basic ranking

Ranked by percentage of letters in written English (across a broad newspaper / novel corpus):

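| Rank | Letter | Frequency (%) |
|---|---|---|
| 1 | E | 12.70 |
| 2 | T | 9.06 |
| 3 | A | 8.17 |
| 4 | O | 7.51 |
| 5 | I | 6.97 |
| 6 | N | 6.75 |
| 7 | S | 6.33 |
| 8 | H | 6.09 |
| 9 | R | 5.99 |
| 10 | D | 4.25 |
| 11 | L | 4.03 |
| 12 | C | 2.78 |
| 13 | U | 2.76 |
| 14 | M | 2.41 |
| 15 | W | 2.36 |
| 16 | F | 2.23 |
| 17 | G | 2.02 |
| 18 | Y | 1.97 |
| 19 | P | 1.93 |
| 20 | B | 1.49 |
| 21 | V | 0.98 |
| 22 | K | 0.77 |
| 23 | J | 0.15 |
| 24 | X | 0.15 |
| 25 | Q | 0.10 |
| 26 | Z | 0.07 |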
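Source: standard English corpus averages. The figures match the table widely reproduced from Robert Lewand's Cryptological Mathematics; Peter Norvig's Google Books letter-frequency study reports slightly different values (E at 12.49%, for example).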

The top 8 letters (ETAOIN SH) account for about 64% of all written English. The bottom 6 (VKJXQZ) together account for just over 2%.
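The coverage figures are plain sums over the table, and the same counting can be applied to any text. The following is a minimal Python sketch, not a reference implementation: the names `TOP_8`, `BOTTOM_6`, `letter_percentages` and `coverage` are made up for this example, and the percentages are copied from the table above.

```python
from collections import Counter
import string

# Percentages copied from the table above for the eight most common
# and six least common letters (the other letters are omitted here).
TOP_8 = {"E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51,
         "I": 6.97, "N": 6.75, "S": 6.33, "H": 6.09}
BOTTOM_6 = {"V": 0.98, "K": 0.77, "J": 0.15, "X": 0.15, "Q": 0.10, "Z": 0.07}


def letter_percentages(text: str) -> dict[str, float]:
    """Return each A-Z letter's share of all letters in `text`, in percent."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    # most_common() yields letters from most to least frequent.
    return {letter: 100 * n / total for letter, n in counts.most_common()}


def coverage(percentages: dict[str, float], letters: str) -> float:
    """Sum the shares of the given letters (missing letters count as 0)."""
    return sum(percentages.get(letter, 0.0) for letter in letters)


if __name__ == "__main__":
    print(f"Top 8 (ETAOINSH): {coverage(TOP_8, 'ETAOINSH'):.2f}%")
    print(f"Bottom 6 (VKJXQZ): {coverage(BOTTOM_6, 'VKJXQZ'):.2f}%")
    sample = "The quick brown fox jumps over the lazy dog"
    for letter, pct in list(letter_percentages(sample).items())[:3]:
        print(letter, f"{pct:.1f}%")
```

A short sample such as a pangram will not reproduce the table; stable percentages need a corpus of many thousands of letters.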

Frequency in dictionaries vs. running text

There's a subtle but important distinction. ‘E’ tops running text partly because ‘THE’ is so common. In dictionary word lists (each word counted once), the ranking shifts: ‘S’ climbs because so many words start with S, while ‘E’ keeps first place with a smaller share.

| Metric | Top 5 letters |
|---|---|
| Running text | E, T, A, O, I |
| Dictionary word list | E, S, I, A, R |
| First letter of words | T, A, O, I, S |
| Last letter of words | E, S, T, D, N |
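The four metrics differ only in what gets counted. Here is a minimal Python sketch of that difference, assuming a plain-text input; `words_of`, `top_letters` and `four_counts` are hypothetical helper names chosen for this example, and the tokenizer is deliberately crude (an apostrophe splits a word in two).

```python
from collections import Counter


def words_of(text: str) -> list[str]:
    """Lowercase the text and split it into runs of alphabetic characters."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return cleaned.split()


def top_letters(counter: Counter, n: int = 5) -> list[str]:
    """Return the n most common letters, ties broken alphabetically."""
    ranked = sorted(counter.items(), key=lambda item: (-item[1], item[0]))
    return [letter for letter, _ in ranked[:n]]


def four_counts(text: str) -> dict[str, Counter]:
    """Count letters four ways, mirroring the rows of the table above."""
    tokens = words_of(text)      # every occurrence of every word
    vocabulary = set(tokens)     # each distinct word counted once
    return {
        "running_text": Counter("".join(tokens)),
        "word_list": Counter("".join(vocabulary)),
        "first_letter": Counter(word[0] for word in tokens),
        "last_letter": Counter(word[-1] for word in tokens),
    }


if __name__ == "__main__":
    counts = four_counts("The cat and the hat and the bat")
    for metric, counter in counts.items():
        print(metric, top_letters(counter, 3))
```

In this toy sentence, ‘E’ is counted three times in running text (once per ‘THE’) but only once in the word list, which is the effect described above in miniature.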

Scrabble tile distribution

Butts' 1938 count produced a tile distribution that closely mirrors running-text frequency, with adjustments for playability. There are 12 E's (most common letter) and only 1 Q and 1 Z (rarest).

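| Letter | Tiles (each) | Point value |
|---|---|---|
| E | 12 | 1 |
| A / I | 9 | 1 |
| O | 8 | 1 |
| N / R / T | 6 | 1 |
| L / S / U / D | 4 | 1–2 |
| G | 3 | 2 |
| B / C / M / P / F / H / V / W / Y | 2 | 3–4 |
| K / J / X | 1 | 5–8 |
| Q / Z | 1 | 10 |
| Blank | 2 | 0 |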
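The distribution is easy to sanity-check in code. Below is a minimal Python sketch that encodes the table as a dictionary. The per-letter values inside the grouped rows (for example D at 2 points, K at 5, J and X at 8) follow the standard English-language set, and the `?` key for blanks and the function names are choices made for this example only.

```python
# letter -> (number of tiles, point value), following the table above.
# "?" stands for a blank tile; that symbol is an assumption of this sketch.
TILES = {
    "E": (12, 1), "A": (9, 1), "I": (9, 1), "O": (8, 1),
    "N": (6, 1), "R": (6, 1), "T": (6, 1),
    "L": (4, 1), "S": (4, 1), "U": (4, 1), "D": (4, 2),
    "G": (3, 2),
    "B": (2, 3), "C": (2, 3), "M": (2, 3), "P": (2, 3),
    "F": (2, 4), "H": (2, 4), "V": (2, 4), "W": (2, 4), "Y": (2, 4),
    "K": (1, 5), "J": (1, 8), "X": (1, 8),
    "Q": (1, 10), "Z": (1, 10),
    "?": (2, 0),
}


def total_tiles() -> int:
    """Number of tiles in the bag, blanks included."""
    return sum(count for count, _ in TILES.values())


def total_points() -> int:
    """Sum of the face values of every tile in the bag."""
    return sum(count * points for count, points in TILES.values())


def tile_share(letter: str) -> float:
    """Percentage of the bag taken up by one letter."""
    return 100 * TILES[letter.upper()][0] / total_tiles()


def word_score(word: str) -> int:
    """Face value of a word, ignoring premium squares and blanks."""
    return sum(TILES[ch][1] for ch in word.upper())


if __name__ == "__main__":
    print("tiles:", total_tiles(), "points:", total_points())
    print("E share of bag:", tile_share("E"), "%")
    print("QUIZ scores", word_score("QUIZ"))
```

Comparing `tile_share` with the running-text table shows how closely the bag tracks English: E is 12% of the tiles and 12.70% of running text, while S is held to 4% of the tiles against 6.33% of text.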

Positional frequencies matter for Wordle

Wordle rewards not just common letters but common letters in common positions. ‘S’ is very common as a first letter but uncommon as a last letter, because the Wordle answer list excludes the plurals that would otherwise put it there. ‘E’ dominates the last position.

| Position | Top 3 letters (Wordle answer list) |
|---|---|
| 1st | S, C, B |
| 2nd | A, O, R |
| 3rd | A, I, O |
| 4th | E, N, S |
| 5th | E, Y, T |
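A positional table like the one above can be built by counting letters column by column. The following Python sketch shows one way to do that and to score a guess against the counts. `SAMPLE_WORDS` is a tiny made-up list for illustration, not the Wordle answer list, and `positional_counts` and `positional_score` are hypothetical names.

```python
from collections import Counter

# Illustrative stand-in only; substitute a real five-letter word list.
SAMPLE_WORDS = ["slate", "crane", "stone", "shine", "brave", "candy", "story"]


def positional_counts(words: list[str], length: int = 5) -> list[Counter]:
    """Return one Counter per position, counting the letters seen there."""
    counts = [Counter() for _ in range(length)]
    for word in words:
        word = word.lower()
        if len(word) != length:
            continue  # skip words of the wrong length
        for position, letter in enumerate(word):
            counts[position][letter] += 1
    return counts


def positional_score(guess: str, counts: list[Counter]) -> int:
    """Sum, over positions, how many listed words share the guess's letter there."""
    return sum(counter[letter] for letter, counter in zip(guess.lower(), counts))


if __name__ == "__main__":
    counts = positional_counts(SAMPLE_WORDS)
    for position, counter in enumerate(counts, start=1):
        print(position, [letter for letter, _ in counter.most_common(3)])
    for guess in ("slate", "candy"):
        print(guess, positional_score(guess, counts))
```

This score only rewards exact-position matches (green tiles); a fuller opener ranking would also weigh letters that are present in the wrong position and penalize repeated letters.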

Uses beyond word games

  • Cryptography: frequency analysis is the earliest known technique for breaking substitution ciphers, described by Al-Kindi in the 9th century.
  • Compression: Huffman coding uses symbol frequencies to assign shorter codes to more common letters.
  • Keyboard design: Dvorak's layout was designed around English letter frequencies, placing the most common letters on the home row.
  • Optical character recognition: priors on letter frequency improve OCR confidence.
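To make the compression item concrete, here is a minimal Python sketch of Huffman code lengths computed from letter counts. It is a textbook construction under simple assumptions (only the letters a to z are counted, and only code lengths are returned rather than the bit patterns); the function names are made up for this example.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(weights: dict[str, float]) -> dict[str, int]:
    """Return the Huffman code length in bits for each symbol."""
    if len(weights) == 1:
        return {symbol: 1 for symbol in weights}
    tie_breaker = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tie_breaker), [symbol]) for symbol, w in weights.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(weights, 0)
    while len(heap) > 1:
        # Merge the two lightest groups; every symbol inside them
        # moves one level deeper in the code tree.
        w1, _, group1 = heapq.heappop(heap)
        w2, _, group2 = heapq.heappop(heap)
        for symbol in group1 + group2:
            lengths[symbol] += 1
        heapq.heappush(heap, (w1 + w2, next(tie_breaker), group1 + group2))
    return lengths


def letter_code_lengths(text: str) -> dict[str, int]:
    """Count the letters a-z in `text` and build Huffman code lengths for them."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return huffman_code_lengths(dict(counts))


def average_bits(weights: dict[str, float], lengths: dict[str, int]) -> float:
    """Weighted mean code length in bits per symbol."""
    total = sum(weights.values())
    return sum(weights[s] * lengths[s] for s in weights) / total


if __name__ == "__main__":
    text = "letter frequency is the backbone of compression and cryptogram solving"
    lengths = letter_code_lengths(text)
    for letter in sorted(lengths, key=lambda s: (lengths[s], s)):
        print(letter, lengths[letter])
```

Letters that occur often in the input end up with the shortest codes, which is why a skewed distribution like English compresses below the five bits per letter a fixed-width code for 26 symbols would need.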

Summary

  • ETAOIN SHRDLU: these 12 letters cover ~80% of English text.
  • ‘E’ leads running text; ‘T’ leads word-initial letters in running text, while ‘S’ leads the first position in the Wordle answer list.
  • Wordle rewards common letters in common positions, not just common letters.
  • Alfred Butts' 1938 letter-counting exercise still shapes Scrabble tile bags.

Frequently asked questions

Why is E the most common letter?

Vowels are frequent generally, and ‘E’ appears both in many of the most common short words (‘THE’, ‘HE’, ‘SHE’, ‘BE’) and in common endings (‘-ED’, ‘-ES’, ‘-ER’).

Does frequency change between British and American English?

Slightly. British spellings like ‘-OUR’ (colour, favour) push U counts up marginally compared with American ‘-OR’.

How was ETAOIN SHRDLU chosen?

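It's roughly the 12 most common letters in English text, arranged in descending frequency order. The phrase itself comes from the Linotype keyboard, whose first two columns read ETAOIN and SHRDLU from top to bottom; operators ran a finger down those columns to fill out a spoiled line, and the result sometimes made it into print. The order is approximate: in the table above, C edges out U for 12th place.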

Why is Q worth 10 in Scrabble?

Not just rarity. Q almost always needs a U to be playable, which makes it a difficult tile even beyond its rarity of 1 in 100 (one Q in a 100-tile bag).
