Is Morse really built around the most popular letters in English?

07/22/23
Foundations of Amateur Radio

Thanks to several high-profile races we already know that sending Morse is faster than SMS. Recently I started digging into the underpinnings of Morse code to answer the question, "Can you send Morse faster than binary encoded ASCII?"

Both ASCII, the American Standard Code for Information Interchange, and Morse are techniques to encode information for electronic transmission. One is built for humans, the other for computers. To answer the question, which is faster, I set out to investigate. I'm using the 2009 ITU, or International Telecommunication Union, standard Morse for this.

Morse is said to be optimised for sending messages in English. In Morse the letter "e", represented by "dit", is the quickest to send, the next is the letter "t", "dah", followed by "i", dit-dit, "a", dit-dah, "n", dah-dit, and "m", dah-dah. The underlying idea is that communication speed is increased by making the most common letter the fastest to send, and so on.

Using a computer this is simple to test. I counted the letters of almost 400,000 words of my podcast and discovered that "e" is indeed the most common letter, the letter "t" is next, then "a", "o", and "i". Note that I said "letter". The most common character in my podcast is the "space", which in Morse takes seven dits to send. Also note that the Morse top-5 is "etian"; the letter "o" is 14th on the list in terms of speed. In my podcast it's the fourth most popular letter. Mind you, my name is "Onno", so you might think that is skewing the data. Not so much.

I turned to the combined works of Shakespeare, figuring that since they represent an older and less technical use of language, and don't feature my name, they might give a different result. The top-5 in his words are "etoai", the letter "o" is the third most popular, and "space" still leads the charge, by nearly 3 times.
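For anyone who wants to repeat this kind of count on their own text, here is a minimal sketch in Python. It is not the script used for the numbers above; the function name `top_characters` and the sample sentence are made up for illustration, and it assumes that upper and lower case count as the same letter and that only the 26 unaccented letters "a" to "z" are letters.

```python
from collections import Counter


def top_characters(text, n=5, letters_only=True):
    """Return the n most common characters in text as (char, count) pairs."""
    # Assumption: "E" and "e" are the same letter for counting purposes.
    folded = text.lower()
    if letters_only:
        # Assumption: only the unaccented letters a-z count as letters.
        folded = (c for c in folded if "a" <= c <= "z")
    return Counter(folded).most_common(n)


# Hypothetical sample text; swap in the contents of any text file.
sample = "The quick brown fox jumps over the lazy dog and then the fox naps"
print(top_characters(sample))                      # letters only
print(top_characters(sample, letters_only=False))  # includes the space
```

Calling it with `letters_only=False` is how the "space" shows up in the ranking, since it is a character but not a letter.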
I also had access to a listing of 850 job advertisements, yes, still looking, and the character distribution top-5 is "eotin", the letter "o" is the second most popular letter.

Because I can, and I'm, well, me, I converted the ITU Morse Code standard to text and counted the characters there too. The top-5 letters are "etion", but the full stop is a third more popular than the letter "e". Mind you, that might be because the people at the ITU still need to learn how to use a computer. Seriously, storing documents inside the "Program Files" directory under the ITU_Admin user, what were you thinking? I digress. The "space" is still on top, nearly six times as common as the letter "e".

As an aside, it's interesting to note that you cannot actually transmit the ITU Morse standard using standard Morse, since the document contains square brackets, a multiplication symbol, asterisks, a copyright symbol, percent signs, em-dashes, and both opening and closing quotation marks, none of which exist as valid symbols.

Back to Morse. The definition has other peculiarities. For example, the open parenthesis takes less time to send than the closing one, but you would think that they are equally common, given that they come in pairs. If you look at numbers, "5" takes the least amount of time to send, "0" the longest. In my podcast text "0" is a third more common than "1" and "9" is the least common. In Shakespeare, "9" is the most common, "8" the least, and in job listings, "0" and "2" go head-to-head, and both are four times as common as the number "7", which is the least common.

All this to say that character distribution is clearly not consistent across different texts and Morse is built around more than the popularity of letters of the alphabet. For example, the difference between the left and right parenthesis is a dah at the end. If you know one of the characters, you know the other. The numerical digits follow a logical progression: from "1" to "5" the dits replace the dahs one at a time until "5" is all dits, and from "6" to "0" the dahs return one at a time until "0" is all dahs.
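The pattern in the digits and the parentheses is regular enough to generate with a few lines of Python. This is only a sketch: the names `digit_pattern`, `duration`, and `PARENS` are invented here, and the timing assumes the usual convention that a dah lasts three dits and that the elements within a character are separated by a gap of one dit.

```python
# Durations in dit units. Assumption: standard timing, dah = 3 dits,
# with a gap of 1 dit between the elements of a single character.
DIT, DAH, GAP = 1, 3, 1

# The two parentheses differ only by a trailing dah.
PARENS = {"(": "-.--.", ")": "-.--.-"}


def digit_pattern(d):
    """Return the Morse pattern for a digit 0-9 as a string of '.' and '-'."""
    if 1 <= d <= 5:
        # "1" to "5": d leading dits, the rest dahs.
        return "." * d + "-" * (5 - d)
    # "6" to "9", then "0" (treated as 10): leading dahs, the rest dits.
    k = (d or 10) - 5
    return "-" * k + "." * (5 - k)


def duration(pattern):
    """Return the time to send one character, measured in dits."""
    elements = sum(DIT if s == "." else DAH for s in pattern)
    return elements + GAP * (len(pattern) - 1)


for d in [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]:
    p = digit_pattern(d)
    print(d, p, duration(p))

for symbol, p in PARENS.items():
    print(symbol, p, duration(p))
```

Under those timing assumptions "5" comes out at 9 dits and "0" at 19, and the closing parenthesis takes 4 dits longer than the opening one: the extra dah plus the gap before it.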
In other words, the code appears to be designed with humans in mind. There are other idios
