If we are voting on missing letters, I want thorn (þ). My understanding is that thorn is one of the rarer sounds in the world's languages, and it deserves to get its own letter back.
On the topic of screwball spelling, there is this video essay on silent letters. The fun takeaway for me was that a lot of silent letters were never pronounced. When some of the first dictionaries were being produced and the spellings decided on, the compilers chose to introduce silent letters to indicate the origin of the word. The b in "debt" is there because the word comes from the Latin debitum, but it was not spelled that way until the 1500s; prior to that it was "dette".
It would be nice to revive all of the Old English letters (well, except for the wynn Ƿ because it's so easily confused with P).
"æ" even has the same obvious sound value still, it can even keep the name "ash"
"þ" and "ð" for the two th-sounds (the former unvoiced, the latter voiced).
And if "ᵹ" is readmitted specifically in its affricate capacity, i.e. for "g" in "gem" etc., then "j" could instead be used with the value it has in French, i.e. used to spell the middle consonant of words like "measure" - this is one of the few English phonemes that doesn't have a definitive letter associated with it right now.
Always use "k" for that sound and repurpose c/q/x for something else, and we could ditch the digraphs completely.
I don't know how many of you need to read this but no one is stopping you from doing any of this. Putting aside cultural friction and being thought a bunch of quacks, you could just start doing these things. There's no real authority telling anyone that they can't despite what some people believe. Just do it!
It's in quantum superposition. Actually speaking it collapses the wavefunction around that particular speaker, although some prefer to believe instead that reality forks. ~
By the "g as in graphics" logic for "gif", for "jpeg" we have "j as in joint", "p as in photographic", "e as in experts", and "g as in group", so "je-feck"?
i am inclined to agree on all points, and while the second parenthetical would be useful in Contemporary English, i cannot help but note that "þ" and "ð" in Old English both represented the same phoneme (unlike in, say, Icelandic). all of the OE fricatives (/s/, /f/, /þ/) had predictable phonetic voicing depending on environment (voiced intervocalically, unvoiced elsewhere), but no phonemic distinction. in extant MSS, <þ> and <ð> are completely interchangeable, mere stylistic variants of writing the phoneme, and even texts written by a single scribe will often have the same word written with both letters. some older MSS will use a plain <d> for the voiced sound (and a plain <b> for the voiced /f/). i've seen at least one MS that used <th>. in Latinized versions of OE names, it's not uncommon to see <th> or <d>, but Classical OE spelling didn't generally distinguish the voiced or unvoiced.
even in Contemporary English, there are very few minimal pairs between /þ/ and /ð/ ("thigh" vs. "thy" comes to mind, but not much else apart from rare noun/verb combos like "loath" vs. "loathe"). it could be argued that we don't really need both, but the (surely obvious by now) pedant in me desires both. saying "boð" rather than "boþ" doesn't strictly change the meaning, and is not likely to cause confusion, but it sure sounds off!
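For anyone who wants to play with the Old English voicing rule mentioned above (voiced between vowels, voiceless elsewhere), here is a rough sketch. The vowel set, the `VOICED` table, and the function name `realize` are my own assumptions for illustration, and the rule is deliberately simplified to what the comment states; it is not a full account of OE phonology. Input is expected to spell the dental fricative with þ.

```python
# Simplified sketch of predictable fricative voicing, as described in the comment:
# a fricative is voiced only when it sits between two vowels.
VOWELS = set("aeiouyæ")                    # assumed vowel letters (no length marks)
VOICED = {"f": "v", "s": "z", "þ": "ð"}    # phoneme letter -> voiced realization


def realize(word: str) -> str:
    """Return a rough phonetic spelling with intervocalic voicing applied."""
    out = []
    for i, ch in enumerate(word):
        between_vowels = (
            0 < i < len(word) - 1
            and word[i - 1] in VOWELS
            and word[i + 1] in VOWELS
        )
        out.append(VOICED[ch] if ch in VOICED and between_vowels else ch)
    return "".join(out)


print(realize("ofer"))    # f between vowels is voiced: over
print(realize("baþian"))  # þ between vowels is voiced: baðian
print(realize("þæt"))     # word-initial þ stays voiceless: þæt
```

Because the output is fully determined by the environment, a single letter per fricative is enough for OE, which fits the observation that scribes used þ and ð interchangeably.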
Well, we get those two letters "for free" anyway, and the phonemes are distinct in modern English, so let's use them!
"ð" also has the nice property of being sufficiently similar in looks to "d" that someone reading it and not familiar with the letter might pronounce it as "d" and thereby make a decent approximation. Unfortunately that's not the case with "þ" though. If we ignored the historical letters and instead just picked something reasonably sensible, I'd probably go with "đ" and "ŧ" for this reason.
i've had people mistake "ð" for an "o" with a diacritic, but presumably that's a typeface problem. whatever we choose, i insist that we drop the names and call them "eþþ" and "ðee", by analogy with our other fricatives ;)
The voiceless alveolar lateral fricative[1] is another fun one. The only European languages that have it are Welsh and Icelandic, where it's simply written "ll" and "hl". In some medieval texts it had a dedicated letter "Ỻ" or "ỻ".
Here's a fun thing you can do with AI, in this case ChatGPT:
"I speak a number of languages, but I'm not a linguist. How could I get a list of sounds that aren't included in the languages I speak? Would be interesting to hear what sounds I'm not used to hearing"
<Long blurb about what a great idea I had, and an intro to how to do it>
If you’d like, you could just tell me the languages you speak, and I can give you a first-pass list of major sound types you’re probably missing (like clicks, ejectives, tones, pharyngeals, etc.).
Would you like me to sketch a concrete example with your languages?
"Sure, here's my list: <my list>"
And then it gives a list of sounds you know, and sounds you don't. Pretty cool. It even has links to a sound map site so you can hear them.
If you like, I could generate a concrete list of IPA symbols that are not in <my list> — a sort of “negative phoneme inventory”. Would you like me to build that for you?
"Yes please. Also what additional language would cover the most ground?"
One language that covers the most new ground
Amharic (Ethiopian Semitic) is a superb single add-on because it gives you, in one go:
• Ejectives (tʼ, kʼ, qʼ, sʼ/tsʼ),
• Pharyngeal consonants (ħ, ʕ),
• Uvular stop (q), and
• True gemination (contrastive long consonants).
If you want a two-language combo to “max out” the world’s rarities, add Zulu (or Xhosa) for clicks alongside Amharic.
About silent letters, French did the same, but one of the unintended consequences was to change pronunciation of many words when reading became common in the XIXth century. For instance "admirer, admiration" had a silent d before ~1830, which is now pronounced by everyone. Ditto the "ir" termination of verbs, the "r" was silent, that's why old songs have strange rhymes such as "Compère Guilleri" rhyming with "te lairas-tu mourir", and "les lilas sont fleuris" rhyming with "qu'il fait bon dormir".
This is just a feature of Castilian Spanish: https://en.wikipedia.org/wiki/Phonological_history_of_Spanis...
\th\ only occurs naturally in like 5% of the thousands of human languages that have ever existed. Just because those languages are some of the most widely-spoken ones worldwide does not make the sound a commonly-occurring one in a meaningful phonological sense.
I know Japanese does not have a th sound, and I don't think Chinese or most other Asian languages have it, but I am less sure about that. Unfortunately I lack the data needed to substantiate my claim.
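  -- Share of each language's sounds that are th-like (PostgreSQL).
  -- Assumes a table world_dictionary(lang, ipa).
  with
  lang_sounds as (
    select
      lang,
      -- a null delimiter splits the string into single characters
      unnest(string_to_array(ipa, null)) as sound
    from world_dictionary
  ),
  totals as (
    select
      lang,
      count(sound) as sound_count
    from lang_sounds
    group by lang
  )
  select
    lang_sounds.lang,
    lang_sounds.sound,
    -- cast to numeric to avoid integer division
    count(*)::numeric / totals.sound_count as share
  from
    lang_sounds join
    totals on
      lang_sounds.lang = totals.lang
  -- note: 'θ̠' is two code points, so it cannot match a single split character
  where lang_sounds.sound in ('θ', 'ð', 'θ̠', 'z')
  group by lang_sounds.lang, lang_sounds.sound, totals.sound_count
  order by share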
https://phoible.org/parameters has the data you seek: 5% of languages in the database have eth (ð) and 4% have theta (θ). Z is not a 'th' sound and fairly common at 30% of languages, though.
> I know Japanese does not have a th sound, and I don't think chinese or most other asian languages have it, [...]
There's no single th sound in English. There's a few different sounds you get from that letter combination in different words (and in different dialects).
It’s true. English and the main variety of Spanish spoken in Spain are two of the few languages in the world that have the sound. Even most Latin American varieties of Spanish (maybe all?) do not have it.
In "distinción" Spanish, the classic pair is the words for house and hunt, "casa" and "caza" respectively. If you pronounce them the same (with an S sound), you're a seseo speaker, like (most of) Latin America. If you pronounce them with different sounds, one an S sound, the other a TH sound, you're a "distinción" speaker, and if you pronounce them both with a TH sound, it's the more uncommon ceceo accent, mostly found in Andalusia.
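The casa/caza test above is really just a three-way lookup, which can be sketched like this. The function name `classify_accent` and the "s"/"θ" input encoding are my own choices for illustration; the fourth combination is not covered by the comment, so it is left unclassified.

```python
def classify_accent(casa_sound: str, caza_sound: str) -> str:
    """Classify a speaker by the middle consonant heard in 'casa' and 'caza'.

    Each argument is "s" (S sound) or "θ" (TH sound).
    """
    pair = (casa_sound, caza_sound)
    if pair == ("s", "s"):
        return "seseo"        # both words sound the same, with S
    if pair == ("s", "θ"):
        return "distinción"   # the two words are kept apart
    if pair == ("θ", "θ"):
        return "ceceo"        # both words sound the same, with TH
    return "unclassified"     # combination not described in the thread


print(classify_accent("s", "s"))   # seseo
print(classify_accent("s", "θ"))   # distinción
print(classify_accent("θ", "θ"))   # ceceo
```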
I stand corrected. All of those seem loanwords, but you are absolutely right.
My main point is that if you hear the "th" sound (as in "za") before e or i, 99% of the time it is spelled with c, as in "cena" or "circo", and will not be "zena" or "zirco". It is a heuristic, but a very reliable one.
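As a rough sketch of that heuristic: given a word where the "th" sound is written as θ, pick c before e or i and z everywhere else. The function names and the mixed "θ plus ordinary spelling" input format are assumptions made up for this example, and as noted above the rule has exceptions.

```python
def spell_theta(following: str) -> str:
    """Guess the letter for the "th" sound from the letter that follows it."""
    return "c" if following.lower() in ("e", "i", "é", "í") else "z"


def spell(word: str) -> str:
    """Replace each θ in a rough spelling with the guessed letter."""
    out = []
    for i, ch in enumerate(word):
        following = word[i + 1] if i + 1 < len(word) else ""
        out.append(spell_theta(following) if ch == "θ" else ch)
    return "".join(out)


print(spell("θena"))     # cena
print(spell("θirco"))    # circo
print(spell("θapato"))   # zapato
print(spell("luθ"))      # luz
```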
My favourite word to troll people who are learning the language is "cerrojo" /θe'roxo/, meaning "latch" or "lock", as it contains the three most difficult consonants in the language in sequence xD
I was talking about spelling. I can clearly see how these clusters of consonants characteristic of all Slavic languages can be a pain for a beginner, no matter how you spell them.
No Cyrillic imports, thank you. Russia decided to be the bully and murderer of its closest neighbors, don't need any more russian influence, even if literally just on paper. The further one is from them the more safety and prosperity there is, in every possible way.
What I found helpful in parsing those z combinations is just replacing the z with h instead. For example, if you went up to a random monolingual English speaker and showed them "shchotechka" they could probably pronounce it reasonably well. All those z's just throw people off.
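The z-to-h trick is easy to try out mechanically. In this sketch I am assuming the original word was Polish "szczoteczka" (the comment only gives the respelled form), and I only touch a z that follows s or c, which is my reading of "z combinations"; the function name is made up.

```python
import re


def respell_z_digraphs(word: str) -> str:
    """Rewrite z as h when it directly follows s or c (sz -> sh, cz -> ch)."""
    return re.sub(r"(?<=[sc])z", "h", word)


print(respell_z_digraphs("szczoteczka"))  # shchotechka
```

A plain z at the start of a word or after a vowel is left alone, so only the digraphs change.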
> My understanding is that thorn is one of the rarer sounds in the worlds languages,
Not as rare as the lateral fricative, the "ll" in Welsh and "hl" in Southern African languages. e.g. in the place names Llandudno in Wales and Hluhluwe in KwaZulu.
And then there's the clicks in Southern African languages. Which are usually written as a Q not followed by a u. e.g. in the place name Gqeberha or the Mbaqanga music style.
> And then there's the clicks in Southern African languages. Which are usually written as a Q not followed by a u. e.g. in the place name Gqeberha or the Mbaqanga music style.
<q> is only one of the clicks in Bantu languages! The letters <c> and <x> are used for other kinds of clicks too (for dental and lateral clicks, respectively).
And then of course there are the Khoisan languages, which use a completely different set of click letters: <ʘ ǀ ǁ ǃ ǂ>.
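To make the two sets of click letters concrete, here is a small lookup sketch. The comment above gives the values for <c> and <x>; the value for <q> and the place labels for the five click symbols are the standard IPA descriptions, added by me. The helper `clicks_in` is hypothetical and only looks at letters, so it will misfire on words where c, q, or x is not a click.

```python
# Click letters in Zulu/Xhosa-style spelling, mapped to the kind of click.
BANTU_CLICKS = {"c": "dental", "q": "alveolar", "x": "lateral"}

# IPA click letters, as used for Khoisan languages.
IPA_CLICKS = {
    "ʘ": "bilabial",
    "ǀ": "dental",
    "ǁ": "lateral",
    "ǃ": "alveolar",
    "ǂ": "palatal",
}


def clicks_in(word: str, table: dict = BANTU_CLICKS) -> list:
    """List (letter, click type) for every click letter found in the word."""
    return [(ch, table[ch]) for ch in word.lower() if ch in table]


print(clicks_in("Gqeberha"))   # [('q', 'alveolar')]
print(clicks_in("Mbaqanga"))   # [('q', 'alveolar')]
print(clicks_in("Hluhluwe"))   # []
```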
I work for a small remote company with employees all over Europe; it's pretty common for most non-native English speakers to pronounce it this way. Especially noticeable with Germans.
Interesting; I don't recall hearing it pronounced like that from my Polish colleagues, so perhaps it's something about the Germanic languages specifically?
Who's judging? I'll save that for when my Swedish är lite mindre dålig! It's just a little strange that the "b" somehow transforms into a "p" when "b" is a perfectly common letter in Swedish. If they just pronounced the "b" I wouldn't have thought it at all notable.
It depends on whether you define "rare" in terms of language variety or human variety, obviously. In terms of languages, it is a relatively rare phoneme. It occurs more often as an allophone of other phonemes, but in that case the speakers may not be able to distinguish it and will struggle to reproduce it in "unusual" environments.
https://www.youtube.com/watch?v=NXVqZpHY5R8 (RobWords: Why English is full of silent letters)