🔠 Missing words in the Mozilla English dictionary

Copyright © 2022 Teal Dulcet

Missing words in the Mozilla (Firefox and Thunderbird) American English (en-US) language spellchecking dictionary. The code for this page and the scripts to generate the data are 100% open source and on GitHub.

The Mozilla English dictionary is based on the SCOWL dictionary, which is also used by Chromium, LibreOffice and many other open source projects. Unfortunately, it is missing many common words found in proprietary dictionaries. I created this page to help Mozilla developers and other Mozillians systematically find missing words that should be included in the Mozilla dictionary and other spellchecking dictionaries.

🙋 If you see one or more words on this page that you believe should be included in the Mozilla en-US dictionary, please consider adding a comment to Bug 1811451 with the word(s) and Wiktionary and/or Wikipedia links, as well as a link to another online dictionary (e.g. Merriam-Webster or Oxford) if possible. Please feel free to also create an issue if there are any improvements to this website or the data that would help you find more missing words.

Some of the words below may be offensive or otherwise unsuitable for inclusion in the Mozilla dictionary, but there are still a lot of good candidates to consider…

Options

What types of words to show?
* The hyphen and apostrophe are always allowed.

ℹ️ This page may briefly slow down after changing these options, and it may take up to a minute to fully load, especially when showing a large number of words.

Wiktionary

Uses the English Wiktionary dictionary data. It is created from the Wiktionary dumps, which are converted to a machine-readable format by kaikki.org using their open source Wiktextract tool. See the Wiktextract paper for more information.

I wrote a script to convert this raw JSON Lines data into a simple TSV file and then remove all the words already in the Mozilla dictionary. Words and forms with any whitespace characters are excluded, as are all words from several part-of-speech categories, including names and phrases. British, Canadian and Australian English spellings/variants are also excluded, as are words with certain tags, including obsolete and misspelling. Words that become identical after normalization (removing all non-alphanumeric characters and converting to lowercase) are put on the same row.
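The filtering and grouping steps described above can be illustrated with a minimal Python sketch. This is not the actual script behind this page. The field names ("word", "pos", "senses", "tags") are assumed from the Wiktextract output, the exclusion sets are illustrative placeholders, and dropping an entry when any one of its senses carries an excluded tag is a simplifying assumption. The function names are hypothetical.

```python
import json
from collections import defaultdict

# Assumed, illustrative exclusion lists; the real script's lists are not given here.
EXCLUDED_POS = {"name", "phrase"}
EXCLUDED_TAGS = {"obsolete", "misspelling"}


def normalize(word):
    """Lowercase a word and drop every non-alphanumeric character."""
    return "".join(ch for ch in word if ch.isalnum()).lower()


def missing_words(jsonl_lines, known_words):
    """Group candidate words by normalized key, skipping words already known."""
    rows = defaultdict(list)
    for line in jsonl_lines:
        if not line.strip():
            continue
        entry = json.loads(line)
        word = entry.get("word", "")
        # Skip empty words and words containing any whitespace.
        if not word or any(ch.isspace() for ch in word):
            continue
        if entry.get("pos") in EXCLUDED_POS:
            continue
        # Collect the tags from every sense of the entry.
        tags = {t for sense in entry.get("senses", []) for t in sense.get("tags", [])}
        # Simplifying assumption: one excluded tag on any sense drops the entry.
        if tags & EXCLUDED_TAGS:
            continue
        # Skip words that the existing dictionary already contains.
        if word in known_words:
            continue
        key = normalize(word)
        if word not in rows[key]:
            rows[key].append(word)
    return rows


def to_tsv(rows):
    """Emit one tab-separated row per normalized key, sorted by key."""
    return "\n".join("\t".join(words) for _, words in sorted(rows.items()))
```

With this sketch, "e-mail" and "Email" would share one row, because both normalize to "email", while an entry such as "ice cream" would be dropped for containing whitespace.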

This data is automatically updated monthly to reflect changes made to the Mozilla dictionary and to Wiktionary. If users notice any errors in the data, they should correct them directly on the Wiktionary website, and the corrections will automatically be included on this page in the next monthly update.

Wiktionary is licensed under both the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0) and the GNU Free Documentation License (GFDL).

⬇️ Download the full TSV file: Wiktionary words.tsv

Other dictionaries

Other dictionaries for consideration, ordered from high to low quality:

  1. Ispell small and medium American English dictionaries - 13,997 words, see Bug 1811451 comment 2.
  2. LibreOffice Technical dictionary - 269 words, see Bug 1808872 comment 30.
  3. Chromium/Chrome en-US dictionary - 412 words, see Bug 1808872 comment 29.
  4. Google Ngram American English 1-grams data - top 100,000 words, see Bug 1808872 comment 28.

These dictionaries change infrequently, so the resulting lists do not need to be updated automatically. As a result, they might still include some words that have already been added to the Mozilla dictionary.

🙋 Please let me know if you know of any other free and open source high-quality American English dictionaries or wordlists.