2015-1-5 21:20
This is a followup to my previous blog on this topic, the "crux of the biscuit" being that I need a sound effects shield for my Arduino/chipKIT-powered hobby projects. In the case of my Vetinari Clock, for example, I want to start with a "tick-tock" sound, and I want this sound to be overlaid with other effects, giving the impression of mechanical mechanisms and pneumatic processes working away in the background. Similarly, in the case of my Inamorata Prognostication Engine, when someone flicks a switch, I want to hear the sounds of balls rolling down chutes and cunning clockwork contrivances performing their magic.

One very important point is that I don't wish to synthesize these sounds; I want to use real-world sound snippets, such as the ones you can download for free from websites like Freesound.org. These typically -- and not surprisingly -- sound so much more realistic.

Another important point is that I have no desire to premix all these sounds. I want to be able to set multiple sounds running at the same time, or to have one sound running and then kick more sounds off at later times as required.

As an aside, the concept of mixing multiple audio streams is quite interesting. Suppose we start with two streams, A and B. For simplicity, we assume each stream can have a value between 0 and 1. Our knee-jerk reaction might be simply to add them together and divide the result by two; that is, Z = (A + B)/2. However, a moment's thought reveals that this is valid only in the case where both A and B are at the maximum 1 value. Suppose that A is at 1 while B is at its minimum 0. If we use our original equation, we end up halving the value of A, which is not what we want. A much better approach requiring minimal computational overhead is to use the formula Z = (A + B) - A*B, which allows the contributing signals to be heard clearly without distortion or perceived loss of volume.
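To make the two mixing formulas concrete, here is a minimal C++ sketch. It assumes, as in the discussion above, that each sample has already been normalised to the range 0 to 1; the function names mixAverage and mixSumMinusProduct are made up for illustration and are not part of any library.

```cpp
#include <cassert>

// Assumption: samples are normalised to the range 0.0 .. 1.0.

// Knee-jerk mix: Z = (A + B) / 2.
// A lone signal is halved whenever the other stream is silent.
float mixAverage(float a, float b) {
    return (a + b) / 2.0f;
}

// Alternative mix: Z = (A + B) - A*B.
// With both inputs in 0..1 the result also stays in 0..1, and a
// silent stream (0) leaves the other stream unchanged.
float mixSumMinusProduct(float a, float b) {
    assert(a >= 0.0f && a <= 1.0f);
    assert(b >= 0.0f && b <= 1.0f);
    return (a + b) - a * b;
}
```

With A = 1 and B = 0, mixAverage gives 0.5 (the halving problem described above), whereas mixSumMinusProduct gives 1. With both inputs at 0.5, the second formula gives 0.75.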
Viktor Toth explains all of this very nicely in his post "Mixing digital audio," but I digress.

My chum and EE Times blogger Duane Benson is also interested in having access to these sound effects shields for his own hobby projects. Having searched the Internet, however, we've come to the conclusion that nothing out there really seems to fit the bill, so we've decided to develop our own. The more we think about this, the more we think this sort of card would be of interest to many of our fellow hobbyists. We have toyed with the idea of launching another Kickstarter project, but -- to be honest -- that's probably more trouble than it's worth. As usual, of course, we will make this project open source and provide all the design and software files so anyone can join in the fun. Also, depending on the amount of interest we receive, it may be that we decide to offer these little beauties for sale, but we're not making any promises here. The main thing, as far as I'm concerned, is that I have these shields for use in my own projects.

The first thing to do in this sort of situation is to tie down the specification. Actually, this would be true in an ideal world, but Duane is already off and running experimenting with SD memory cards, so we really need to hammer out the specification as soon as possible. Otherwise, goodness only knows where we'll end up. For the past few weeks, Duane and I have been bouncing ideas back and forth on the phone. Our mission is to create something really, really useful while keeping the cost and complexity down as much as possible. What I'm going to do here is to summarize our thoughts and ask you to comment on our ideas and to make your own suggestions.

File format: We aren't interested in supporting every audio file format known to humankind. We both have great respect for Adafruit.
Since it decided that 22kHz uncompressed monophonic WAV files were good enough for its Wave Shield, we've decided to go with the flow and adopt this as our format. This means that, whatever format is used to store the original sound snippets, we will suck them into some sound editing program like Audacity and output them as 22kHz uncompressed mono WAV files.

File naming: For the sake of making our lives simple, we will store our sound files with a traditional 8.3 alphanumeric file naming convention using only uppercase alpha characters and numbers.

File storage: We considered creating our shield with a USB interface so that it would look like a USB memory stick when connected to a host computer, but we decided this would overcomplicate things. We've decided to go the SD card route. In this case, we will copy the WAV files from the host machine on to an SD card and then plug the SD card into our shield.

File lengths: Each sound file can be any length required -- limited only by the available storage.

Number of streams: This has yet to be established. The maximum number of audio streams will be determined largely by the SD card interface bandwidth, but processing considerations may play a part. As a working goal, a minimum of six concurrent audio streams should be supported.

Mono vs. stereo: Even though the audio/WAV files are mono, the shield will support two separate audio output channels that we'll refer to as "Left" and "Right" for simplicity. (The way this will work is discussed below.)

Amplification: The shield will contain its own amplification (amount to be decided). By default, both the Left and Right channels will be enabled, but it will be possible to disable either or both channel amplifiers via jumpers and to feed the corresponding line outputs to off-board amplification.

Interface to Arduino: We've decided that this will be via I2C, and that the two pins used for the I2C interface will be the only Arduino pins affected by our shield.
Also, our shield will have some default I2C address (to be decided), but it will be possible to modify this using soldered jumpers on the shield. We might also offer SPI support in addition to I2C, but this has yet to be decided.

Things may seem relatively simple thus far, but when we consider the actual usage model, things become a little more mind-boggling. We plan on providing an Arduino library and example sketches to accompany our sound effects card. Let's assume that we instantiate our sound card as an object called something like "sound_fx." Assuming that some sound files stored on the SD card are called "VCSFX001.WAV," "VCSFX002.WAV," and "VCSFX003.WAV," the simplest command scenario might look something like:

    sound_fx.setup("VCSFX001.WAV");
    sound_fx.play("VCSFX001.WAV");

There will be a number of parameters associated with the ".setup" command. Any parameters that aren't specified will automatically adopt their default values. For example, the sound file(s) associated with each ".setup" call will be sent to both output channels by default, but it will be possible for the user to explicitly specify "LEFT," "RIGHT," or "BOTH"; for example:

    sound_fx.setup("VCSFX001.WAV", channel=LEFT);
    sound_fx.setup("VCSFX002.WAV", channel=RIGHT);
    sound_fx.play("VCSFX001.WAV, VCSFX002.WAV");

Observe that the ".play" statement can accept a comma-separated list of file names, in which case all the files in the list will start playing at exactly the same time. (In addition to this, should we provide a delayed start capability?)

Now, what should happen if a ".play" command is issued to a file that is already playing? Off the top of our heads, it would appear that there are three possibilities:

Terminate the current instantiation and restart this file from the beginning.

Add this new instantiation into the queue and start playing it immediately after the current instantiation finishes running.
Treat this new call as an independent audio stream and start playing it concurrently with the current instantiation(s).

I think we will go with the third option, because it's easy to envision a case where you might want to have multiple instantiations of the same sound file playing offset in time. For example, imagine the whistling sound of a bomb falling -- you might want to start a second instantiation of this sound before the first has finished. Furthermore, we can implement the first two options by means of the ".remaining" and ".stop" commands discussed below.

It should be possible to assign a volume to each file in its ".setup" statement, where the assigned value is an integer between 0 and 100 that will represent a percentage. (The default setting will be 100%.) For example:

    sound_fx.setup("VCSFX001.WAV", volume=100);
    sound_fx.setup("VCSFX002.WAV", volume=75);
    sound_fx.play("VCSFX001.WAV");
    delay(1000);
    sound_fx.play("VCSFX002.WAV");

In the above example, observe that, in addition to being set to 75% of its full volume, the "VCSFX002.WAV" file will start playing 1,000 milliseconds (one second) after the "VCSFX001.WAV" file.

Additional parameters that can be used in a ".setup" statement will be "fade_in" and "fade_out," each of which can be assigned an integer value in milliseconds. In the case of "fade_in," that file will ramp up in volume (in a linear manner), starting at zero and increasing over the specified time until it reaches whatever level was assigned to the "volume" parameter. The way in which the "fade_out" parameter is used is left as an exercise for the reader.

It doesn't matter how many audio streams are currently playing (up to the maximum number supported, of course). At any time, it should be possible to add a new sound file/audio stream to the mix.

It should also be possible to determine how long a file has left to run in milliseconds.
Assuming that we've already declared a long integer called something like "time2Go," it should be possible to use a statement along the lines of:

    time2Go = sound_fx.remaining("VCSFX003.WAV");

If the file has already finished playing, a value of -1 will be returned.

Last but not least, it should be possible to terminate the playing of any file using a ".stop" command. For example:

    sound_fx.stop("VCSFX003.WAV");

As with the ".play" command, the ".stop" command should accept a comma-separated list of files. Also, no problems should result from issuing a ".stop" command on a file that has finished playing or was never playing in the first place.

I've mentioned this before, but I should probably state it again -- neither Duane nor I am an audio expert. We're pretty much making this up as we go along, and we would very much appreciate any input you might care to share with us. For example, do we need to work with 16-bit audio data, or would 12-bit data be sufficient? If we work with 16-bit data "internally," would it be OK to reduce this to 12-bit data when outputting the audio?

Which microcontroller (MCU) would best suit this application? Would a dual-core device boasting an ARM Cortex-M0 and a Cortex-M3/M4 be the way to go, or do you think we can do it all with a single Cortex-M0? Is there a device with suitable digital-to-analog converters (DACs) on-chip, or should we use external DACs?

Do you like the ".setup", ".play", ".remaining", and ".stop" functionality we're pondering? Would you use different terminology? Would you implement this in some other way? In addition to being able to associate "channel," "volume," "fade_in," and "fade_out" parameters with each file as part of the ".setup" command, are there any other features and functions we should consider implementing?
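To give commenters something concrete to poke at, here is a rough host-side mock of the ".setup", ".play", ".remaining", and ".stop" behaviour described so far. It is only a sketch under several assumptions: the names SoundFxMock, SetupOptions, and Channel are hypothetical; plain C++ has no "channel=LEFT" style keyword arguments, so an options struct with defaults stands in for them; the file length and the current time are passed in by hand rather than read from a WAV header or a timer; and only one instantiation per file is tracked.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class Channel { Left, Right, Both };

// Hypothetical per-file settings; untouched fields keep their defaults.
struct SetupOptions {
    Channel channel = Channel::Both;
    int volume = 100;    // percent, 0..100
    long fadeInMs = 0;   // linear ramp-up time in milliseconds
    long fadeOutMs = 0;  // ramp-down time in milliseconds
};

class SoundFxMock {
public:
    // lengthMs stands in for the duration a real shield would get from the file.
    void setup(const std::string& file, long lengthMs,
               SetupOptions opts = SetupOptions{}) {
        assert(lengthMs >= 0);
        Entry& e = files_[file];
        e.opts = opts;
        e.lengthMs = lengthMs;
        e.startMs = -1;  // not playing yet
    }

    // Start (or restart) the file at time nowMs; unknown files are ignored.
    void play(const std::string& file, long nowMs) {
        auto it = files_.find(file);
        if (it != files_.end()) it->second.startMs = nowMs;
    }

    // Milliseconds left to run, or -1 if the file has finished.
    // Assumption: -1 is also returned for stopped or never-started files.
    long remaining(const std::string& file, long nowMs) const {
        auto it = files_.find(file);
        if (it == files_.end() || it->second.startMs < 0) return -1;
        long left = it->second.lengthMs - (nowMs - it->second.startMs);
        return left > 0 ? left : -1;
    }

    // Harmless on files that have finished or were never playing.
    void stop(const std::string& file) {
        auto it = files_.find(file);
        if (it != files_.end()) it->second.startMs = -1;
    }

    // Volume assigned in setup(), or -1 if the file was never set up.
    int volume(const std::string& file) const {
        auto it = files_.find(file);
        return it == files_.end() ? -1 : it->second.opts.volume;
    }

private:
    struct Entry {
        SetupOptions opts;
        long lengthMs = 0;
        long startMs = -1;  // -1 means "not playing"
    };
    std::map<std::string, Entry> files_;
};
```

In this mock, a three-second "VCSFX003.WAV" that starts playing at the 1,000 ms mark reports 2,500 ms remaining at the 1,500 ms mark and -1 from the 4,000 ms mark onward. Calling stop on a name that was never set up does nothing.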
Actually, thinking of features and functions, in addition to playing a sound file a single time and then stopping, should we offer the ability to keep on playing a sound file over and over again? If so, how should we implement this? One option would be as a "run" parameter in the ".setup" command with two options -- ONCE (the default) and LOOP. For example:

    sound_fx.setup("VCSFX001.WAV", run=LOOP);
    sound_fx.play("VCSFX001.WAV");

Another possibility would be to complement the ".play" command with a ".loop" equivalent. For example:

    sound_fx.setup("VCSFX001.WAV");
    sound_fx.loop("VCSFX001.WAV");

However, we previously suggested that the ".play" command should support a comma-separated list of files, each of which may be of a different length. The fact that the files can be of different lengths isn't a problem if each file is played only a single time, but what happens if we are playing them multiple times? Should each file restart as soon as it finishes, or should the shield wait for the longest file to finish before restarting them all from the beginning?

Actually, there are multiple issues with the implementation plan discussed above. For example, suppose we want to be able to start playing one instantiation of "VCSFX001.WAV" at 100% volume out of the Left channel, and then sometime later -- while the first instantiation is still playing -- we want to start playing another instantiation of "VCSFX001.WAV" at 75% volume out of the Right channel. We can't do this with our current model -- or can we? Similarly, what happens if we have two or more instantiations of "VCSFX001.WAV" playing and we run a ".remaining" or a ".stop" command with this file name as a parameter? To which instantiation should the command apply?

One possibility would be to associate an ID number with each ".setup" command, and for the other commands to use these ID numbers as opposed to file names, but this is confusing to the user, and it opens up additional cans of worms.
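For anyone who wants to weigh up the ID-number idea, here is one way it might look, again as a host-side mock rather than real shield code. Everything here is an assumption made for illustration: the class name SoundFxById, the choice to have setup return the ID, and passing the file length and current time in by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Channel { Left, Right, Both };

class SoundFxById {
public:
    // Each setup() call creates a new instantiation and returns its ID.
    int setup(const std::string& file, long lengthMs,
              Channel channel = Channel::Both, int volume = 100) {
        assert(lengthMs >= 0);
        slots_.push_back(Slot{file, lengthMs, channel, volume, -1});
        return static_cast<int>(slots_.size()) - 1;
    }

    // Start the given instantiation at time nowMs; bad IDs are ignored.
    void play(int id, long nowMs) {
        if (valid(id)) slots_[id].startMs = nowMs;
    }

    // Milliseconds left for this instantiation, or -1 if it is not playing.
    long remaining(int id, long nowMs) const {
        if (!valid(id) || slots_[id].startMs < 0) return -1;
        long left = slots_[id].lengthMs - (nowMs - slots_[id].startMs);
        return left > 0 ? left : -1;
    }

    // Stops only the named instantiation; others keep playing.
    void stop(int id) {
        if (valid(id)) slots_[id].startMs = -1;
    }

private:
    struct Slot {
        std::string file;
        long lengthMs;
        Channel channel;
        int volume;    // percent, 0..100
        long startMs;  // -1 means "not playing"
    };

    bool valid(int id) const {
        return id >= 0 && static_cast<std::size_t>(id) < slots_.size();
    }

    std::vector<Slot> slots_;
};
```

With this arrangement, two setup calls for "VCSFX001.WAV" (one Left at 100%, one Right at 75%) yield two different IDs, so ".remaining" and ".stop" are never ambiguous. The price is that the user has to keep track of the numbers, which is exactly the confusion mentioned above.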
Of course, it may be that I'm overengineering everything. (It wouldn't be the first time.) There are ways around these things. For example, consider the following:

    sound_fx.setup("VCSFX001.WAV", channel=LEFT);
    sound_fx.play("VCSFX001.WAV");
    delay(1000);
    sound_fx.setup("VCSFX001.WAV", channel=RIGHT);
    sound_fx.play("VCSFX001.WAV");

This would allow us to play the same file out of the different channels at different times (and with different volumes if we specified the "volume" values). Similarly, we could simply say that, for multiple instantiations of a file running at the same time, the ".remaining" and ".stop" commands work with the earliest instantiation.

Hmmm, this is something to mull over, and no mistake. This is a typical engineering problem in that we have lots of tradeoffs that compete against one another. On the one hand, we want to make our sound effects shield as versatile as possible, so that it is applicable to many different usage scenarios. On the other hand, we want to make it as easy to understand and use as possible, because a lot of its potential end users aren't computer experts.

This is the point where we open the floor for suggestions. Do you like the model proposed above? Would you make additions, subtractions, or modifications? Or would you throw it all out and start again?