The Evolution of Mobile Phone Audio Design, Part 3 (English)

Date: 2009-02-17    Source: 52RD手机研发 (52RD Mobile R&D)
 
      

Music Synthesizers
Music synthesizers for cellular phones are derived from the same types of synthesizers used on Personal Computer sound cards. The most widely used synthesis methods are: (1) frequency modulation (FM), (2) wave table, or (3) a combination of FM and wave table.


FM synthesis was originally done using analog elements, but is now done using digital signal processing techniques. Each FM instrument sound Note (or "voice") requires a minimum of two signal generators/oscillators, one of which modulates the other. Thus, playing many Notes simultaneously requires many oscillator pairs, and some synthesizer sounds require 4 or 6 oscillator pairs to create a particular timbre. The number of oscillators used per sound greatly affects the total Note polyphony capability of the synthesizer. FM synthesis techniques cannot exactly duplicate the sound of a real physical instrument.
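As a minimal sketch of the two-oscillator arrangement just described, the C fragment below generates one second of a single FM voice: a modulator sine wave varies the phase of a carrier sine wave. The carrier and modulator frequencies and the modulation index are arbitrary illustrative values, not parameters of any particular synthesizer.

    /* Minimal two-operator FM voice: a modulator oscillator varies the
     * phase of a carrier oscillator. Frequencies and modulation index
     * are illustrative values only. Output is raw 16-bit PCM on stdout. */
    #include <math.h>
    #include <stdio.h>

    #define SAMPLE_RATE 8000                 /* Hz */
    #define TWO_PI      6.283185307179586

    int main(void)
    {
        double fc    = 440.0;                /* carrier frequency (Hz)   */
        double fm    = 220.0;                /* modulator frequency (Hz) */
        double index = 2.0;                  /* modulation index         */

        for (int n = 0; n < SAMPLE_RATE; n++) {          /* one second of audio */
            double t      = (double)n / SAMPLE_RATE;
            double mod    = sin(TWO_PI * fm * t);        /* modulator output    */
            double sample = sin(TWO_PI * fc * t + index * mod); /* carrier      */
            short  pcm    = (short)(sample * 32767.0);   /* 16-bit PCM sample   */
            fwrite(&pcm, sizeof pcm, 1, stdout);
        }
        return 0;
    }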


If the goal is to re-create the sound of some existing real physical instrument, then digital sample-based techniques are used. Wave table synthesis uses segments of sampled instrument sounds. The Note polyphony capability of a wave table synthesizer is a function of the processing power needed to playback multiple digital audio samples simultaneously. The synthesizers are controlled by MIDI (Musical Instrument Digital Interface) commands in accordance with specifications set up by the music industry supported MIDI organization.
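The sample-playback idea can be sketched just as simply. In the hypothetical C fragment below, a stored single-cycle waveform stands in for a sampled instrument segment and is read out with a fractional table step so it can be reproduced at any desired pitch; real wave table engines add looping, envelopes, and multi-sample instruments.

    /* Sketch of wave table playback: a stored single-cycle sample is read
     * with a fractional increment to reproduce it at an arbitrary pitch.
     * The table below is a synthetic sine cycle standing in for a real
     * sampled instrument segment. */
    #include <math.h>
    #include <stdio.h>

    #define TABLE_LEN   256
    #define SAMPLE_RATE 8000

    int main(void)
    {
        double table[TABLE_LEN];
        for (int i = 0; i < TABLE_LEN; i++)              /* fill the wave table */
            table[i] = sin(2.0 * 3.141592653589793 * i / TABLE_LEN);

        double pitch = 261.63;                           /* target note (Hz), e.g. middle C */
        double step  = pitch * TABLE_LEN / SAMPLE_RATE;  /* table samples per output sample */
        double phase = 0.0;

        for (int n = 0; n < SAMPLE_RATE; n++) {          /* one second of output */
            int    i0   = (int)phase;
            int    i1   = (i0 + 1) % TABLE_LEN;
            double frac = phase - i0;
            double s    = table[i0] + frac * (table[i1] - table[i0]); /* linear interpolation */
            short  pcm  = (short)(s * 32767.0);
            fwrite(&pcm, sizeof pcm, 1, stdout);
            phase += step;
            if (phase >= TABLE_LEN) phase -= TABLE_LEN;  /* wrap around the table */
        }
        return 0;
    }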


To ensure that musical content (MIDI command files) played back on one synthesizer sounds the same when played back on another, the makers of synthesizers standardized on (1) a set of 128 melodic, non-percussion instrument sounds and its numbering system, plus (2) a set of 47 percussion instrument sounds assigned to Notes and its associated numbering system, and (3) a set of performance commands that allow better 'expression' of musical passages. Taken together, this standard is called General MIDI, or GM.


In addition to the GM specifications, the music instrument industry and the MIDI Organization have additional specifications that are sometimes used in cell phones. They are:


SP-MIDI (Scalable Polyphony MIDI Specification & Profiles): SP-MIDI devices can automatically step down from maximum Note polyphony to a much lower Note polyphony (Note stealing) if more processing power is needed for some other application or to conserve battery power.

GM Lite (GM Lite Specification and Guidelines for Mobile Applications): GM Lite is a subset of GM, and requires a fixed 16-Note polyphony along with a reduced set of performance commands.

DLS-1 & DLS-2 (Downloadable Sounds Specifications): The DLS specifications provide for downloading new instrument digital sound samples (WAVE format) and performance parameters, effectively extending the GM Instrument Set or, in some cases, substituting an improved version of a GM Instrument sound for the original.

Mobile DLS (Downloadable Sounds 2.1 Specification): In GM, DLS-1 and DLS-2 there is a distinction between melodic sounds and percussion sounds; in Mobile DLS there is just one bank, or collection, of sounds. Wave table sounds can use other encodings and compression besides 8- or 16-bit linear PCM.

XMF (eXtensible Music File Specification, v1.01): XMF is an open standard file format "wrapper" for bundling MIDI commands, Downloadable Sounds (DLS), and WAVE PCM data. MIDI and digital audio data are encapsulated in the XMF file or can be linked via external URL reference.

Mobile XMF: Mobile XMF is an open standard audio format (.mxmf) that bundles, via XMF, a small SP-MIDI file, custom compressed digital audio samples based on the DLS-2 format (Mobile DLS), and copyright information in one deliverable package optimized for mobile applications. Playback quality is guaranteed since any standard Mobile XMF file will render consistent playback on all Mobile XMF-enabled devices.

Further information can be found at http://www.midi.org .


Most ring sounds today are MIDI command files for the synthesizer, not recordings of the actual music itself. The MIDI commands are typically in Standard MIDI Format 1, which stores MIDI data as a collection of tracks, instead of Standard MIDI Format 0, which stores all MIDI data on one track. Each MIDI track can be thought of as an instrument. Multiple MIDI tracks with Note events in each track play back multiple GM instrument sounds in a song arrangement. Thus, you have a "band in a box" suitable for playing many different musical genres.
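Whether a MIDI file is Format 0 or Format 1, along with its track count, can be read directly from the file's 14-byte MThd header. The C sketch below does only that; it is a minimal illustration and does not parse the track data itself.

    /* Read the MThd header of a Standard MIDI File and report whether it is
     * Format 0 (single track) or Format 1 (multiple tracks), plus the track
     * count. All SMF header fields are big-endian. */
    #include <stdio.h>
    #include <string.h>

    static unsigned be16(const unsigned char *p) { return (p[0] << 8) | p[1]; }

    int main(int argc, char **argv)
    {
        if (argc < 2) { fprintf(stderr, "usage: %s file.mid\n", argv[0]); return 1; }

        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror("fopen"); return 1; }

        unsigned char hdr[14];                      /* "MThd" + length + 3 fields */
        if (fread(hdr, 1, sizeof hdr, f) != sizeof hdr || memcmp(hdr, "MThd", 4)) {
            fprintf(stderr, "not a Standard MIDI File\n");
            fclose(f);
            return 1;
        }

        unsigned format   = be16(hdr + 8);          /* 0 = one track, 1 = multi-track */
        unsigned ntracks  = be16(hdr + 10);
        unsigned division = be16(hdr + 12);         /* timing resolution */

        printf("SMF format %u, %u track(s), division %u\n", format, ntracks, division);
        fclose(f);
        return 0;
    }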


The number of simultaneous Note events in a composition is directly related to the synthesizer polyphony, i.e. the number of simultaneous Note events the synthesizer can support. The larger the polyphony, the more complex musical compositions can be. This fact leads directly to the "Polyphony Wars", where polyphony values have steadily progressed from 4 to 72 (currently) for cell phone hardware synthesizers, with expectations that they will reach 128 very soon.
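What a polyphony limit means in practice, and what the "Note stealing" of SP-MIDI looks like, can be pictured with a schematic voice allocator such as the C sketch below: when every voice slot is busy, the oldest sounding Note is stolen for the new one. This is only an illustration of the idea, not any vendor's actual allocation algorithm; the 16-voice limit is borrowed from the GM Lite figure above.

    /* Schematic voice allocator for a fixed-polyphony synthesizer: when all
     * voice slots are in use, the oldest sounding Note is "stolen" for the
     * new Note-On event. */
    #include <stdio.h>

    #define POLYPHONY 16                      /* e.g. the GM Lite requirement */

    struct voice { int active; int note; unsigned long started; };

    static struct voice voices[POLYPHONY];
    static unsigned long clock_ticks;

    static void note_on(int note)
    {
        int slot = -1;
        unsigned long oldest = ~0UL;

        for (int i = 0; i < POLYPHONY; i++) {
            if (!voices[i].active) { slot = i; break; }   /* free slot found  */
            if (voices[i].started < oldest) {             /* remember oldest  */
                oldest = voices[i].started;
                slot = i;
            }
        }
        if (voices[slot].active)
            printf("stealing voice %d (note %d)\n", slot, voices[slot].note);

        voices[slot].active  = 1;
        voices[slot].note    = note;
        voices[slot].started = clock_ticks++;
    }

    int main(void)
    {
        for (int n = 60; n < 60 + POLYPHONY + 4; n++)     /* 4 more Notes than slots */
            note_on(n);
        return 0;
    }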


Most musical compositions using the synthesizer are for ring sounds and games. However, the best professional commercial music compositions rarely use such high polyphony (simultaneously sounding Notes), since too many simultaneous sounds tend to sound "noisy", while a smaller number of instruments in a composition is easier to hear and appreciate. So handset makers are paying for much more polyphony in the synthesizer block than is needed. The "cure" for polyphony issues is to use MP3 music clips; then the polyphony is unlimited, or more precisely, limited only by the number of instruments in the actual recording.


The key value in using a synthesizer for music is that the MIDI files holding the synthesizer commands are very small compared to MP3 or other compressed "real" music files. There are multiple vendors of hardware-based synthesizers conforming to General MIDI standards, offering from 16 to 72 Note polyphony. Some support DLS (downloadable samples) for customizing the instrument sounds. Some synthesizers support SMAF (Synthetic music Mobile Application Format), which combines low-resolution digital audio samples (8 kHz sample rate, 4-bit ADPCM) with MIDI data for creating ring tones and gaming sounds. Hardware synthesizers are almost always stereo analog output devices, even though internally the synthesis function is done using a DSP. Some hardware synthesizers also have a motor vibrator control line and an audio-synchronization LED control line. There are also multiple vendors of software-based synthesizers with essentially the same capabilities as hardware synthesizers. The software synthesizers run on either the Communications or Applications processor in the handset.
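A rough size comparison makes the point. In the sketch below, the 8 kHz, 4-bit ADPCM figures come from the SMAF description above, while the 30-second clip length, the 128 kbit/s MP3 bit rate, and the few-kilobyte MIDI file size are assumptions chosen for illustration.

    /* Back-of-the-envelope storage comparison for a 30-second ring tone.
     * The SMAF sample format (8 kHz, 4-bit ADPCM) is from the text; the
     * 128 kbit/s MP3 bit rate and the few-kilobyte MIDI file size are
     * illustrative assumptions. */
    #include <stdio.h>

    int main(void)
    {
        double seconds = 30.0;

        double adpcm_bytes = seconds * 8000.0 * 4.0 / 8.0;   /* 8 kHz, 4 bits/sample */
        double mp3_bytes   = seconds * 128000.0 / 8.0;       /* assumed 128 kbit/s   */
        double midi_bytes  = 5.0 * 1024.0;                   /* assumed few-kB file  */

        printf("SMAF ADPCM : %.0f kB\n", adpcm_bytes / 1024.0);   /* ~117 kB */
        printf("MP3        : %.0f kB\n", mp3_bytes / 1024.0);     /* ~469 kB */
        printf("MIDI       : %.0f kB\n", midi_bytes / 1024.0);    /* ~5 kB   */
        return 0;
    }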



The key advantage of software synthesizers is that they can be incorporated into the media player software, which can then play back and/or mix the outputs of various audio file decoders (MIDI, SMAF, WAV, MP3, etc.) simultaneously to generate interesting and compelling musical content. The output is 2-channel PCM data, usually in I2S format. The key disadvantage is that all these features consume processing power, and the synthesis method requires memory storage.
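Mixing the decoder outputs down to a single 2-channel PCM stream is conceptually simple; the hypothetical C fragment below adds two 16-bit stereo buffers with saturation so the sum cannot wrap around. A real media-player mixer would also handle sample-rate conversion and per-stream gain.

    /* Sketch of mixing two decoded 16-bit stereo PCM streams into one
     * output buffer, with saturation to avoid wrap-around distortion. */
    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    static int16_t sat16(int32_t x)
    {
        if (x >  32767) return  32767;
        if (x < -32768) return -32768;
        return (int16_t)x;
    }

    /* 'frames' stereo frames: each frame is a left and a right sample. */
    static void mix_stereo(const int16_t *a, const int16_t *b, int16_t *out, size_t frames)
    {
        for (size_t i = 0; i < frames * 2; i++)
            out[i] = sat16((int32_t)a[i] + (int32_t)b[i]);
    }

    int main(void)
    {
        int16_t a[4] = { 1000, -1000,  30000, -30000 };
        int16_t b[4] = {  500,  -500,  10000, -10000 };
        int16_t out[4];

        mix_stereo(a, b, out, 2);                /* 2 stereo frames; last pair saturates */
        for (int i = 0; i < 4; i++)
            printf("%d\n", out[i]);
        return 0;
    }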


Fortunately, software synthesizers can be more closely linked to all running processes, and can be scaled back in capability, much as with SP-MIDI, if other applications require more processing power. This is an important factor to consider, since the trend is towards more multi-channel, high quality digital audio on handsets, primarily for games.


Some hardware synthesizer components now include data converters, switching and mixing blocks, and audio power amplifiers. This approach may be attractive in some designs, but it is not very flexible when it comes to adding other audio-related services. A centrally located audio subsystem is usually the better choice.


Radio Modules
Commercial radio is already available on cell phones, generally in the form of add-on FM or AM/FM modules; however, single-chip solutions exist as well. The latest solutions also include support for RDS (Radio Data System) and RBDS (Radio Broadcast Data System, the United States variant of RDS) demodulation and decoding. These broadcast data services allow the display of the station name, program, and other information about the program. The chips and modules are generally controlled over the I2C serial bus interface.
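As an illustration of I2C control, the sketch below uses the Linux i2c-dev interface to address a tuner chip and write one register. The bus node, the 7-bit device address (0x10), and the "power up" register write are hypothetical; a real part's register map and address come from its data sheet.

    /* Illustration of controlling a tuner chip over I2C using the Linux
     * i2c-dev interface. The bus number, the 7-bit device address (0x10)
     * and the two-byte register write are hypothetical placeholders. */
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/i2c-dev.h>
    #include <stdio.h>

    int main(void)
    {
        int fd = open("/dev/i2c-1", O_RDWR);            /* assumed I2C bus node */
        if (fd < 0) { perror("open"); return 1; }

        if (ioctl(fd, I2C_SLAVE, 0x10) < 0) {           /* hypothetical tuner address */
            perror("ioctl");
            close(fd);
            return 1;
        }

        unsigned char cmd[2] = { 0x02, 0x01 };          /* hypothetical "power up" register write */
        if (write(fd, cmd, sizeof cmd) != sizeof cmd)
            perror("write");

        close(fd);
        return 0;
    }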


FM radio provides an analog stereo signal, while AM radio is analog mono. The audio output levels vary quite a bit between the different solutions available, and can range from 100 mVrms to 220 mVrms. Depending upon the final system design, an additional amplifier stage may be required to properly drive other system components.
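The extra gain needed is easy to estimate. The sketch below computes the gain in dB required to bring the 100 mVrms and 220 mVrms outputs quoted above up to a 1 Vrms level; the 1 Vrms target is an assumed downstream input level, not a figure from any specific design.

    /* Rough gain calculation for matching a tuner's output level to a
     * following stage. The 1 Vrms target is an assumed line-level input;
     * the 100-220 mVrms range is from the text. */
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double target_vrms = 1.0;                    /* assumed downstream input level */
        double levels[]    = { 0.100, 0.220 };       /* tuner output range, Vrms       */

        for (int i = 0; i < 2; i++) {
            double gain_db = 20.0 * log10(target_vrms / levels[i]);
            printf("%.0f mVrms source -> %.1f dB of gain needed\n",
                   levels[i] * 1000.0, gain_db);     /* ~20 dB and ~13 dB */
        }
        return 0;
    }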


Eureka 147 is a protocol for digital radio broadcasting and is commonly known as Digital Audio Broadcasting (DAB). DAB is transmitted on terrestrial networks and is designed for multi-path reception conditions, optimizing receiver sensitivity by automatically selecting the strongest regional transmitter. DAB can carry audio, text, pictures, data and even video.


S-DMB (Satellite Digital Multimedia Broadcasting) is transmitted from satellites to the ground and suffers from "shadow areas" that interrupt reception; these are filled in by ground-based repeaters. T-DMB is the terrestrial broadcast version of DMB.


Complete DAB/DMB system-on-a-chip solutions support decoding of various digital audio standards (MPEG-2, MP3, AAC and AC3, as a partial list) and provide output data on either I2S or S/PDIF. An external DAC and amplifier are required to get analog output to speakers or headphones. Newer chips incorporate the D/As and provide analog audio output.


Solutions are also available in module form, consisting of a DMB/DAB tuner and a media processor, and provide audio output in both digital and analog form. The media processor could be a separate DSP-type chip, or the decoding functions could be handled in software by the Applications Processor.


The trend is toward system-on-a-chip or modules so that desired functions can be added relatively easily.


Television
Various methods exist to make television content available. DVB-H (Digital Video Broadcasting – Handheld) is a terrestrial digital TV standard that is a slimmed-down version of DVB-T (Digital Video Broadcasting – Terrestrial), and allows receiving real-time digital TV broadcasts from the digital TV network without using the mobile phone networks. Chips and modules supporting DVB-H are similar to the DMB/DAB solutions for radio services. The audio content is decoded and presented to the system in either digital (I2S) or analog form (modules with a media processor and stereo DAC), or both.


S-DMB and T-DMB also allow the transmission of video, e.g. video clips. Users are unlikely to watch full-length TV programs on a cell phone, but might be interested in "mobisodes" – 1 to 3 minute mini-episodes or key-scene clips of programs. Train and bus commuters are the most likely regular users.


It is expected that "portals" or stations will emerge specializing in such limited-duration TV content. Many such portals were already launched in 2005, specializing in various types of interactive content for cellular phones. TV "snippets" are a natural extension of those services.


Haptic devices
A haptic device is a mechanical device which provides some mechanical feedback to the user. A typical example would be the vibrator motor which is used for the silent ring function on a cell phone.


Manufacturers are attempting to synchronize vibrator motors to sound, so that the motor vibrates along with the ring tone as well as in synchronization with lighting patterns.


Suppliers of games for cellular handsets are also interested in haptic devices in order to provide force-feedback information to the game player. These devices are typically triggered by audio sounds in the games, such as explosions, gunshots, loud voices, and other sound effects.
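One simple way to derive such a trigger is to run an envelope follower over the game audio and switch the motor when a loud transient exceeds a threshold, as in the C sketch below. The threshold, decay constant, and the vibrator_set() hook are illustrative placeholders rather than any real driver interface.

    /* Sketch of triggering a vibrator from game audio: a simple envelope
     * follower fires the motor when a loud transient (explosion, gunshot)
     * exceeds a threshold. Threshold, decay, and the motor-control hook
     * are illustrative placeholders. */
    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <math.h>

    #define THRESHOLD 20000.0     /* assumed trigger level for 16-bit PCM */
    #define DECAY     0.999       /* envelope decay per sample            */

    static void vibrator_set(int on)      /* placeholder for the real motor driver */
    {
        printf("vibrator %s\n", on ? "ON" : "OFF");
    }

    static void process_block(const int16_t *pcm, size_t n)
    {
        static double envelope;
        static int motor_on;

        for (size_t i = 0; i < n; i++) {
            double level = fabs((double)pcm[i]);
            envelope = (level > envelope) ? level : envelope * DECAY;
        }
        int want_on = envelope > THRESHOLD;
        if (want_on != motor_on) {            /* only switch on a state change */
            vibrator_set(want_on);
            motor_on = want_on;
        }
    }

    int main(void)
    {
        int16_t loud[256], quiet[256] = { 0 };
        for (int i = 0; i < 256; i++)
            loud[i] = (int16_t)((i & 1) ? 28000 : -28000);   /* simulated explosion */

        process_block(loud, 256);                 /* prints: vibrator ON            */
        for (int i = 0; i < 4; i++)
            process_block(quiet, 256);            /* envelope decays: vibrator OFF  */
        return 0;
    }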


Logically, amplifiers in the audio signal path would be used to drive these haptic devices.


Accessories
The most common accessory is the car kit, which generally plugs into a special connector on the bottom of the phone or into the phone's three-wire headset connector. The car kit accessory typically operates from the automobile battery and requires an amplifier output power of about 2 Watts.
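A quick calculation shows what that 2-Watt requirement implies for the amplifier. In the sketch below, the 4-ohm speaker load is an assumed typical value; the output works out to roughly 2.8 Vrms (about 4 V peak) and 0.7 Arms.

    /* Back-of-the-envelope check of what "about 2 Watts" means for the car
     * kit amplifier. The 4-ohm speaker load is an assumed typical value. */
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double power = 2.0;                          /* W, from the text       */
        double load  = 4.0;                          /* ohms, assumed speaker  */

        double vrms  = sqrt(power * load);           /* ~2.83 Vrms             */
        double vpeak = vrms * sqrt(2.0);             /* ~4.0 V peak            */
        double irms  = vrms / load;                  /* ~0.71 A rms            */

        printf("%.2f Vrms, %.2f Vpeak, %.2f Arms into %.0f ohms\n",
               vrms, vpeak, irms, load);
        return 0;
    }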


Also included is an amplifier for the car kit microphone. The future trend is to replace the car kit microphone with a microphone array consisting of two or more microphones which will focus on the talker, reduce far field noise (road noise, radio, other passenger speech, etc.), and improve intelligibility.
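The simplest form of such an array is a delay-and-sum beamformer: one microphone's signal is delayed so that speech arriving from the talker's direction adds coherently while off-axis noise does not. The C sketch below shows the idea for two microphones with a fixed integer delay; the delay value in the demo is an illustrative placeholder, and real car-kit arrays estimate it from geometry or adaptively, and add further noise suppression.

    /* Minimal two-microphone delay-and-sum beamformer: the microphone that
     * hears the talker first (mic1) is delayed by 'delay' samples so that
     * it lines up with mic2, then the two channels are averaged. */
    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    static void delay_and_sum(const int16_t *mic1, const int16_t *mic2,
                              int16_t *out, size_t n, int delay)
    {
        for (size_t i = 0; i < n; i++) {
            int32_t s1 = (i >= (size_t)delay) ? mic1[i - delay] : 0;
            int32_t s2 = mic2[i];
            out[i] = (int16_t)((s1 + s2) / 2);       /* average the aligned channels */
        }
    }

    int main(void)
    {
        /* Toy signals: mic2 hears the same "speech" pulse as mic1, two samples later. */
        int16_t mic1[8] = { 0, 0, 100, 200, 100,   0,   0, 0 };
        int16_t mic2[8] = { 0, 0,   0,   0, 100, 200, 100, 0 };
        int16_t out[8];

        delay_and_sum(mic1, mic2, out, 8, 2);        /* align mic1 to mic2 */
        for (int i = 0; i < 8; i++)
            printf("%d ", out[i]);                   /* pulse adds coherently */
        printf("\n");
        return 0;
    }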


As cellular phones become entertainment devices, one accessory class emerging in the market is the docking cradle. A cradle generally contains stereo audio amplifiers delivering up to 2 or 3 Watts per channel. It is primarily for playback of MP3 music while the cell phone is not being used for communications or being carried.


 


SUMMARY
No longer are cell phone voice band audio specifications sufficient. Support must be available for high quality, consumer-electronics-oriented services and specifications. The types of audio devices in cell phones have expanded from the simple microphone, voice band codec, and speaker used in speech-only cell phones to sophisticated collections of audio sources, various decoder playback devices, multiple audio output devices, and the audio subsystem structures necessary to tie them all together in a sensible fashion. That means more kinds of audio sources and standards, data formats, and signal levels to deal with.


Audio quality and performance specifications on cell phones are a major differentiating feature of the services and of the cell phone itself. Since the complexity of audio-related features is increasing rapidly, there will be a strong need to simplify architectural support so that handset makers can mix and match features with ease according to ever-changing market demand.


Kenneth Boyce is the Staff Strategic Technologist for the Audio Products Group. Prior to joining National, Boyce served as director of the Audio and Communications Division at Oak Technology. He holds a bachelor of science degree in electronics from West Virginia University.


 
