tag 标签: morse

相关博文
  • 热度 24
    2015-8-28 22:13
    1454 次阅读|
    0 个评论
    Continued from Comparing techniques for storing Morse code in C (Part 1) Storing symbols as bytes The overall architecture and flow of this program is very similar to the string version (you can access the entire program by clicking here ). The first difference occurs when we declare our dotsNdashes = {B01000010, // 0 = A ".-"                       B10000001, // 1 = B "-..."                       B10000101, // 2 = C "-.-."                       B01100001, // 3 = D "-.."                        :                       etc.   As we discussed in the overview, the three MSBs contain the number of dot-dash symbols (the two MSBs in the case of the punctuation characters), while the LSBs contain the symbols themselves; 0 = dot and 1 = dash, with the symbols organized in reverse order such that the first symbol appears in the LSB.   Our loop() function is identical to the previous example: void loop() {   displayThis("Hello World"); }   Our displayThis() function appears to be identical to the previous example -- it's word-for-word identical -- but there is a subtle difference regarding the way in which the compiler handles things. Consider the following statement from the displayThis() function:      flashLED(dotsNdashes );   In the case of our previous "store as strings" program, this statement calls our flashLED() function and passes it a pointer to a string of characters as an argument. By comparison, in the case of our "store as bytes" program, the argument is a single char (which we end up treating as a byte): void flashLED(byte dNdByte) {    :   // Lots of clever stuff    : } The next big difference between our two programs occurs in the body of our flashLED() function. First of all, we extract the number of symbols forming this character (the three MSBs), shift them five bits to the right, and store this value in a variable called "count":   byte count; count = (dNdByte 5) B00000111; if (count = 6) count = B00000110; Since byte is an unsigned data type, shifting it to the right should theoretically cause 0s to be shifted into the MSBs, but I don’t like to take any chances, so -- following the shift -- the " B00000111" (bitwise AND) operation forces the five MSBs to be 0s. Now, remember that the coding scheme we're using looks like the following: // 000 xxxxx 0 symbols (not used) // 001 xxxx? 1 symbol // 010 xxx?? 2 symbols // 011 xx??? 3 symbols // 100 x???? 4 symbols // 101 ????? 5 symbols // 11 ?????? 6 symbols Generally speaking, we have three count bits and five data bits. If the count indicates six symbols, however, then we have only two count bits and six data bits. In this case, the original bit 5, a copy of which now finds itself in the LSB of count, could be a 0 or a 1. This explains the statement " if (count = 6) count = B00000110;", which ensures that a count of six or seven is forced to be six, because a value of seven means that the LS bit of the count contains an unwanted data bit of 1. Now, I have to admit that I ran into a bit of a problem with the following, which is the part where we actually extract and display the dots and dashes: for (int i = 0; i count; i++) {    digitalWrite(pinLED,HIGH); // LED On   if (dNdByte B00000001 == 0) // It's a dot     delay(elementDelay / WPM); // Dot delay   else // It's a dash     delay((elementDelay * 3) / WPM); // Dash delay   dNdByte = dNdByte 1;   digitalWrite(pinLED,LOW); // LED Off   delay(elementDelay / WPM); // Inter-symbol delay } When I executed this and observed my flashing LED, I realized that I only appeared to be seeing dashes. This struck me as being a bit strange, so I stuffed a lot of Serial.print() statements into my program to determine what was going on. Here's what I saw in the Serial I/O window: As you see, all I'm generating is dashes. Eventually I focused my beady eye on the following statement:   if (dNdByte B00000001 == 0) // It's a dot     delay(elementDelay / WPM); // Dot delay   else // It's a dash     delay((elementDelay * 3) / WPM); // Dash delay In particular, I considered the "== 0" logical comparison. I couldn’t see how this could be wrong, but when I changed it to read "== 1", I then saw the following results. "Well, strike me with a kipper," I thought, "that's strange and no mistake." Now we are seeing dots and dashes, but they are inverted; i.e., a dot is displayed as a dash and vice versa. In fact, this is what we'd expect because we've inverted our test from "== 0 " to "== 1". But, in that case, why didn’t the "== 0" do what we expected in the first place? I tell you, I stared and stared at this, but nothing came to mind, so eventually I called my chum Ivan Cowie over from the next bay to take a look. Ivan suggested taking my statement:   if (dNdByte B00000001 == 0) // It's a dot And adding parentheses as follows:   if ((dNdByte B00000001) == 0) // It's a dot So we did so, and everything worked as expected. "Well, blow me down with a feather," I thought, "who would have thunk?" I personally would have assumed that a bitwise operator like , |, or ^ would have a higher precedence than a logical comparison (relational) operator like == or !=, but it appears that this is not the case (see also this list C++ operator preferences ). It took me a while to wrap my brain around what was happening here. This thing is that when we use an "if ( {condition is true} ) {do this}" type statement, then "true" equates to 1 and "false" equates to 0. If the "==" comparison is performed first, then "1 == 0" returns 0, and "dNdByte 0" also returns zero, which is seen by the "if" statement as "false", so it appears that we only ever have dashes. But what happens if we change our comparison to "1 == 1"? In this case "1 == 1" returns 1, so "dNdByte 1" will return 0 or 1 depending on the contents of the LSB, but a 1 will be seen as meaning "true", which equates to a dot, all of which explains why "== 1" does end up giving us dots and dashes, but swapped over. Phew! I tell you, I learn something new every day. So, we now have two programs offering two different ways of storing and manipulating our Morse code symbols -- how do they compare?   Comparison (code size and execution speed) The easiest comparison to make is that of code size and the amount of memory required to store any global variables, because this data is automatically reported at compile time. Not surprisingly, storing our dot-and-dash symbol data as bytes consumed less space for the global variables as illustrated in the table below. The interesting point to me is that the "storing as bytes" approach also resulted in a significant reduction in code size, even though it involves more processing when it comes to extracting the count (number of symbols) value and then extracting the dots and dashes one at a time. I'm not sure, but I'm assuming this is due to the fact that the "storing as strings" technique involves a lot of pointer de-referencing. Having said this, I really don’t have a clue, and I would welcome your thoughts on this matter. Now, as we previously noted, transmitting Morse code a human-readable speeds results in the processor sitting around twiddling its metaphorical thumbs for a large portion of the time (observe the delay() function calls scattered throughout the two programs like confetti). On this basis, we really don’t care how efficient our two approaches are. On the other hand, it's always interesting to know the nitty-gritty facts, so I decided to investigate. First of all, I commented out all of the delay() function calls, because these would totally overwhelm the amount of time spent doing the real work. Next, I declared two unsigned long variables called "time1" and "time2", and then I modified the loop() function in both programs to read as follows: void loop() {   time1 = micros();   displayThis("Hello World");   time2 = micros();   Serial.print("Time = ");   Serial.println(time2 - time1); } FYI, I'm using an Arduino Uno for this project and micros() is a built-in function provided by the Arduino IDE. This function returns the number of microseconds since the Arduino began running the current program. This number will overflow and return to zero after approximately 70 minutes, which is more than enough time for my proposes. Also, in 16 MHz Arduinos like the Uno, this function has a resolution of four microseconds (i.e., the value returned is always a multiple of four). I started by comparing my original "Hello World" strings, but then I thought that this really wasn't fair because some of the time was being spent converting lowercase characters to their uppercase equivalents, which isn’t really the point of the exercise. Thus, I also ran the comparison using strings comprising all of the alphanumeric characters:   ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 And, just to make sure, I performed a third comparison using a longer string comprising two copies of all the alphanumeric characters. The results of all these comparisons are shown in the table below. As we see, the "storing as bytes" approach ends up being faster than its "storing as strings" equivalent, even though it involves more processing when it comes to unpacking the count and symbol data. Once again, I'm assuming that this is due to the fact that the "storing as strings" technique involves a lot of pointer de-referencing. And, once again, I really don’t have a clue, and I would welcome your thoughts on this matter. Now, with regard to the other techniques folks suggested -- like the binary tree and the run-length encoding -- it would be great if you wanted to take my original "store as strings" program as the basis and modified it to use these other techniques. If you then emailed me your code at max.maxfield@ubm.com , I could run it on my Arduino and compare it against the two schemes discussed here.   In the meantime, should I go with "storing as strings" approach because this is easier to understand and maintain, or should I opt for the "storing as bytes" technique on the basis that the end user doesn’t need to know what's going on under the hood, and this method is slightly faster and -- more importantly -- uses less memory?
  • 热度 14
    2015-8-17 17:22
    2071 次阅读|
    0 个评论
    Several days ago I posted this blog describing how I wish to use a light-emitting diode LED to flash messages in Morse code. In these columns I inquired as to the best way to represent, store, and access the Morse code dots and dashes used to represent letters and numbers.   As you can see from the image above, the alphanumeric characters 'A' to 'Z' and '0' to '9' each require between one and five dot-dash symbols. What I neglected to mention was that I also wish to have two punctuation symbols at my disposal, each of which contains six symbols: ',' (comma = "--..--") and '.' (period = ".-.-.-").   Of course, you might say: "Does anyone still know Morse code these days?" Well, my chum Ivan from the next bay just wandered into my office, looked at a little proto-shield I threw together last weekend flashing away merrily on my desk and said "Oh, it's flashing 'Hello World'" (take a look at this video of the proto-shield doing its funky stuff).   Alternatively, you might ask: "Does anyone actually need to know Morse code these days." To these sad souls I would respond: "Hello! Didn’t you see the film Independence Day ?" Then, as they sheepishly sidled away, I would call out: "Go away, or I shall taunt you a second time!"   But we digress... My original columns unleashed a maelstrom of comments. Community member Cdhmanning suggested using a binary decision tree, in which each node has a character and a link to the left for dot and one to the right for dash. I'd love to try this, but I fear it's beyond my limited programming skills. In fact, in one of his comments, Betajet presented the code for this; however, I'm still trying to wrap my bumbling brain around this little scamp (the code, not Betajet).   Someone else emailed me proposing a form of run-length encoding. Sad to say, however, I couldn’t wrap my brain around that one either ("I'm a bear of little brain," as Winnie-the-Pooh would say).   Several people advocated using two bits per symbol. My chum David Ashton, for example, suggested 01=dot, 10=dash, 11=space, and 00=end. David also noted that, using this scheme, a 5-symbol code would require 20 bits (5 dots/dashes x 2 bits, plus 4 inter-symbol spaces x 2 bits, plus 1 end character x 2 bits).   Actually, just a few minutes ago whilst I was in the process of writing this column, I received an email from Djones A. Boni in Brazil, who suggested the following scheme, which is very similar to David's:   We actually have four different symbols: The dot (+1 implicit space between parts) The dash (+1 implicit space between parts) The space between letters (-1 implicit space between parts) The space between words (-1 implicit space between parts) We can encode them as 0b00, 0b01, 0b10 and 0b11, thereby getting four symbols into a byte.   Another community member, Jacob Larsen, suggested using 0 and 1 bits to represent units of light and darkness, so the character 'A' (".-") would be represented as 10111; i.e., dot (1) = LED On = one unit delay; inter-symbol space (0) = LED Off = one unit delay; and dash (111) = LED On = three unit delays. On the one hand, this has the advantage of being relatively simple; on the other hand, the numeric character '0' with five dashes ("-----") would occupy 5 x (3 + 1) = 20 bits, which seems a tad enthusiastic, as it were.   Another proposal from Cdhmanning was to store each Morse code character in an 8-bit byte. The three MSBs (most-significant bits) would be used to represent the number of bits in the character, while the five LSBs (least-significant bits) would be used to represent the dots and dashes.   Consider the character 'F' ("..-."), for example. This could be represented as 100x0010, where the 100 indicates that this character contains four dot-dash symbols and the 0010 indicates dot-dot-dash-dot. Meanwhile, the 'x' indicates that we don’t care about this character (we would set it to 0 in the code).   In fact, Cdhmanning, actually recommended that the dot-dash symbol bits be "swapped over" such that the first to be transmitted is stored in the LSB, because this will make any downstream processing easier. In this case, our 'F' character would actually be stored as 100x0100.   As I mentioned above, I also wish to have two punctuation symbols at my disposal, each of which contains six symbols: ',' (comma = "--..--") and '.' (period = ".-.-.-"). Happily, Cdhmanning's scheme can be adapted to accommodate this as shown below, where a 'x' character indicates "don't care" (we'll set these to 0 in the code) and a '?' character indicates a symbol (0 = dot, 1 = dash):   // 000 xxxxx 0 symbols (not used) // 001 xxxx? 1 symbol // 010 xxx?? 2 symbols // 011 xx??? 3 symbols // 100 x???? 4 symbols // 101 ????? 5 symbols // 11 ?????? 6 symbols   Jacob Larsen also commented: "You don't mention what you are optimizing for -- code size, code maintainability, data size, or instruction speed? I am going to assume maintainability, since Morse code generally doesn't go fast enough to need speed optimization."   Jacob makes a very good point. Transmitting Morse code at human-readable speeds means that -- most of the time -- the processor is sitting around twiddling its metaphorical thumbs, so speed optimization isn’t really an issue. On the other hand, it's always a good idea to minimize your code and data size where possible, and it's also gratifying to make your code run as efficiently and as quickly as possible, even if no one else really gives a hoot. For this particular project, I feel that understandability and maintainability are the two key elements. The reason for this will become clear when I rent the veils asunder and explain the true purpose behind this capriciously cunning contraption, but we'll leave that for a future column.   For the purposes of this exercise, I've decided to compare two different techniques: (a) Storing the Morse code symbols as strings, and (b) packing the symbols for each character into a single byte. Now read on...   Storing symbols as strings Before we plunge into the fray with gusto and abandon, let’s first discuss a few "housekeeping" issues. The thing about Morse code is that everything is relative, because some people can transmit it faster than others. As we know from our original Morse code chart, a dot equates to one element (unit) of delay (time) and a dash equates to three elements of delay.   Meanwhile, the inter-symbol (dot-dash) spacing is one element of delay; the inter-character spacing (i.e., the spacing between adjacent letters in a string like "Hello") is three elements of delay; and the inter-word spacing (i.e., the spacing between two words in a string like "Hello World") is seven elements of delay.   Baring this in mind, what does it actually mean when you hear people say that they can transmit 'n' words per minute in Morse code? This is obviously somewhat subjective, because different words have different numbers of characters; there's a heck of a difference between the number of characters in "A" and in "Antidisestablishmentarianism," for example.   For this reason, one of the "standard words" Morse code aficionados use to compare themselves against each other is PARIS. Consider the following breakdown, where {} represents the delay elements associated with a dot or a dash; {3} {3} {1} (3) = 14 A ".-"   {1} {3}                 (3) =  8 R ".-."  {1} {3} {1}         (3) = 10 I ".."   {1} {1}                 (3) =  3 S "..."  {1} {1} {1}         (7) = 12                                    Total = 50   One minute equals 60 seconds equals 60,000 milliseconds (ms). Thus, if we were to transmit at one word per minute -- and assuming 50 elements per word based on PARIS as discussed above -- then each element would take 1,200ms. This explains why you will see the following definition in my code:   #define elementDelay 1200   By trial and error, I've discovered that 15 words per minute is pleasing to my eye, which is why you will also see the following definition in my code:   #define WPM 15   Thus, the way in which I've implemented my code means that you only have to tweak the WPM definition in order to modify the transmission rate. And, speaking of my code, you can access the entire program by clicking here .   The interesting part of this program starts with our definition of the dots and dashes, which appears as follows:   char* dotsNdashes contains a pointer to a string containing ".-" (plus a hidden terminating '\0' null character); dotsNdashes contains a pointer to a string containing "-..."; and so forth.   Now, I want my final program to be able to display any ASCII string that comes its way. For the purpose of this example, however, my main loop is pretty simple as shown below: void loop() {   displayThis("Hello World"); } The idea is that I can pass any ASCII text string to my displayThis() function. As you'll see, the first thing we do in the displayThis() function is to convert any lowercase alpha characters 'a' to 'z' into their uppercase counterparts 'A' to 'Z'. We could just tell people that they can only use uppercase letters, but we all know that they aren’t going to listen (LOL).   You'll also see that we check for any unexpected characters and -- if we find any -- we call a simple eeekError() function that flashes the LED in a certain way (we vary the LED's on-off times to reflect different errors).   The body of the displayThis() function is used to determine which entry (string) we wish to use in our dotsNdashes array consumes a byte (char) for each dot and dash, which is a lot of memory (relatively speaking). Next, we'll consider packing all of the symbols forming a character into a single byte.     Continued at Comparing techniques for storing Morse code in C (Part 2)
  • 热度 19
    2015-7-31 18:56
    1826 次阅读|
    1 个评论
    OK, before we begin, let's remind ourselves that I'm a hardware designer by trade. When it comes to software in general -- and C in particular -- I know only enough to be dangerous.   I have a project on the go at the moment. This is a secret squirrel project whose purpose -- for the present -- must remain a riddle, wrapped in a mystery, inside an enigma. Suffice it to say that I wish to use a light-emitting diode LED to flash messages in Morse code.   My question pertains to the best way in which to represent, store, and access the Morse code dots and dashes used to represent the numbers and letters.     Ultimately, I want to be able to initialize a string with a message I wish to transmit using a statement like:      char my_string = "HELLO WORLD";   Later, I'll pass this string as a parameter into the function that will actually flash the LED. So, once again, my question is: "What's the best way to represent the dots and dashes associated with the various characters?" My very first pass thought was to store these as character strings themselves -- something like the following:      char string_A = ".-";    char string_B = "-...";    char string_C = "-.-.";       :   I know that this technique of using a separate character for each dot and dash doesn’t give the smallest memory footprint, but it does have the advantage of understandability, which is, of course, an important consideration. On the other hand, the scheme shown above -- having individual strings for each of the character -- would be difficult to use.   The bottom line is that every approach I can think of is less than optimal or incongruous, which is why I'm inviting your ideas. If you think of anything -- preferably pertaining to this perplexing poser -- please post it as a comment below.
  • 热度 24
    2015-7-31 18:54
    1472 次阅读|
    3 个评论
    OK, before we begin, let's remind ourselves that I'm a hardware designer by trade. When it comes to software in general -- and C in particular -- I know only enough to be dangerous.   I have a project on the go at the moment. This is a secret squirrel project whose purpose -- for the present -- must remain a riddle, wrapped in a mystery, inside an enigma. Suffice it to say that I wish to use a light-emitting diode LED to flash messages in Morse code.   My question pertains to the best way in which to represent, store, and access the Morse code dots and dashes used to represent the numbers and letters.     Ultimately, I want to be able to initialize a string with a message I wish to transmit using a statement like:      char my_string = "HELLO WORLD";   Later, I'll pass this string as a parameter into the function that will actually flash the LED. So, once again, my question is: "What's the best way to represent the dots and dashes associated with the various characters?" My very first pass thought was to store these as character strings themselves -- something like the following:      char string_A = ".-";    char string_B = "-...";    char string_C = "-.-.";       :   I know that this technique of using a separate character for each dot and dash doesn’t give the smallest memory footprint, but it does have the advantage of understandability, which is, of course, an important consideration. On the other hand, the scheme shown above -- having individual strings for each of the character -- would be difficult to use.   The bottom line is that every approach I can think of is less than optimal or incongruous, which is why I'm inviting your ideas. If you think of anything -- preferably pertaining to this perplexing poser -- please post it as a comment below