Practical MIDI FAQ:

Basics:
Intermediate:
- What is a "SoundFont" and how do I use it?
Advanced:
Reference:
- Links to other websites

Basics:

What is MIDI?

MIDI stands for Musical Instrument Digital Interface. Many people mistakenly think of MIDI as a tangible object. In fact, MIDI is a method of communication between electronic instruments, comprising a standard connector and cable plus a standardized protocol. It was first demonstrated at the NAMM (National Association of Music Merchants) show in 1983, where two synthesizers from two different manufacturers were connected so that playing one drove both. MIDI has paid off most in its use with computers, enabling people to record, edit and play back music via a synthesizer.

A standardized file format was designed to allow saved data to be exchanged between applications, primarily on computers. This standard, SMF (Standard MIDI File), has come to be known simply as a MIDI file, and usually has a ".mid" file extension. The format allows anything that can be sent over MIDI to be saved and loaded. These files have become quite popular for web-based music due to their small size. SMF comes in several variants that store the data in different ways to suit different uses of MIDI; these are often denoted SMF0, SMF1, etc. The differences are too detailed to cover here, though I may go into them later.

MIDI is not a description of a sound or of an audio stream, as many falsely believe. It is a description of music: of individual notes and effects. This is why the files can be so small. When a MIDI file is played through a synthesizer, the process is much like rendering a 3D model into a picture. The individual notes are turned into sound and modified by various effects like volume, reverb, chorus, etc. The unfortunate side effect is that the quality of the rendering varies between synthesizers. Many modern sound cards contain an onboard synthesizer, and this is what web pages, games, etc. use to play MIDI.

What are the specifics of MIDI?

MIDI connections are unidirectional, so a synthesizer will usually have at the very least a "MIDI In" and a "MIDI Out." "MIDI In" is the port through which a synthesizer or interface accepts MIDI data, and "MIDI Out" is the port through which it sends MIDI data. Interfaces, which give a computer a MIDI connection, come in many forms; many sound cards have one that plugs into the joystick port. These usually have one MIDI In and one MIDI Out. Generally one connects a synthesizer to an interface by joining the oppositely named ports: Out to In, and In to Out. Some devices also have a connection called "MIDI Thru." Data arriving at the "MIDI In" port is passed along out the "MIDI Thru," so you should always view "MIDI Thru" as an output. Plugging multiple instruments together using "MIDI Thru" ports is often called chaining. This doesn't yield any more channels, but it allows you to have several instruments on a single output.

A single MIDI connection can carry up to 16 channels. Each channel represents a single stream of notes to which effects are applied. A channel can contain any number of notes and nearly any number of effects (limited only by the synthesizer). Each channel is assigned a patch, which tells the synthesizer what sort of instrument should be played on that channel. The patch can be changed, but only one is assigned to a given channel at any given time. A channel can be manipulated using controllers, RPNs (Registered Parameter Numbers), and NRPNs (Non-Registered Parameter Numbers), which are all methods for accessing parameters in the synth. Controllers are the most common method and are fairly standard across synths. For example, reverb is commonly changed by controller 91, chorus by controller 93, volume by controller 7, etc. RPNs are defined by the MIDI standard, while NRPNs are at the discretion of the manufacturer and are often outlined in detail in the synthesizer's manual.
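
To make the channel, patch and controller ideas a little more concrete, here is a minimal Python sketch that builds the raw bytes for a patch change and two controller changes. The function names are my own invention for illustration. The status values 0xC0 (Program Change) and 0xB0 (Control Change) come from the MIDI 1.0 specification rather than from anything above, and actually sending the bytes to a port is left out.

```python
# Sketch: building raw MIDI channel messages as bytes.
# Channels are numbered 1-16 for people but 0-15 on the wire.

def status_byte(kind: int, channel: int) -> int:
    """Combine a message type (upper nibble) with a 1-based channel number."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1-16")
    return kind | (channel - 1)

def program_change(channel: int, patch: int) -> bytes:
    """Assign a patch (0-127) to a channel."""
    if not 0 <= patch <= 127:
        raise ValueError("patch must be 0-127")
    return bytes([status_byte(0xC0, channel), patch])

def control_change(channel: int, controller: int, value: int) -> bytes:
    """Set a controller (0-127) to a value (0-127) on a channel."""
    if not (0 <= controller <= 127 and 0 <= value <= 127):
        raise ValueError("controller and value must be 0-127")
    return bytes([status_byte(0xB0, channel), controller, value])

# Put patch 0 on channel 1, then set its volume (7) and reverb (91).
messages = [
    program_change(1, 0),
    control_change(1, 7, 100),
    control_change(1, 91, 40),
]
for message in messages:
    print(message.hex(" "))
```

Note that the channel number only ever occupies the low four bits of the first byte, which is where the limit of 16 channels per connection comes from.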

Boring techno-weenie details

Connection speed:	31250 bits/sec
Connector:	5-pin DIN
Max. Cable Length:	50ft
Suggested Max.:	20ft
Channels:	16
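
As a rough worked example of what the connection speed means in practice, the sketch below estimates how long a message takes to cross the cable. It assumes each byte is framed as 10 bits on the wire (one start bit, eight data bits, one stop bit), which is how MIDI 1.0 serial links work but is not stated in the table above; the function name is made up for illustration.

```python
BAUD = 31250          # bits per second, from the table above
BITS_PER_BYTE = 10    # assumption: 1 start bit + 8 data bits + 1 stop bit

def transmit_time_us(num_bytes: int) -> float:
    """Time in microseconds to send num_bytes over one MIDI cable."""
    return num_bytes * BITS_PER_BYTE * 1_000_000 / BAUD

print(transmit_time_us(1))   # a single byte
print(transmit_time_us(3))   # a typical three-byte message such as a note
```

Under those assumptions a three-byte message takes a little under a millisecond, so a big chord sent down one cable does not arrive all at the same instant.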

If MIDI contains no sound data, how is it usable between computers?

Not every MIDI file is portable, but many conform to a standard called GM (General MIDI). This specifies a standard set of patches so that the music sounds approximately the same on any conforming synthesizer. There are two other, more extensive standards used for MIDI files, GS by Roland and XG by Yamaha, but these are a little more rare. More detail is given in the next section.

What is GM, GS, or XG and how does it relate to MIDI?

In September of 1991 the MIDI Manufacturers Association (MMA) and the Japan MIDI Standards Committee (JMSC) created the General MIDI standard, later referred to as GM. This standard specified:

Voices: A minimum of either 24 fully dynamically allocated voices are available simultaneously for both melodic and percussive sounds, or 16 dynamically allocated voices are available for melody plus 8 for percussion. All voices respond to velocity.
Channels: All 16 MIDI Channels are supported. Each Channel can play a variable number of voices (polyphony). Each Channel can play a different instrument (sound/patch/timbre). Key-based percussion is always on MIDI Channel 10.
Instruments: A minimum of 16 simultaneous and different timbres playing various instruments. A minimum of 128 preset instruments (MIDI program numbers) conforming to the GM Instrument Patch Map and 47 percussion sounds which conform to the GM Percussion Key Map.
Channel Messages: Support for continuous controllers 1, 7, 10, 11, 64, 121 and 123; RPN #s 0, 1, 2; Channel Aftertouch, Pitch Bend.
Other Messages: Respond to the data entry controller and the RPNs for fine and coarse tuning and pitch bend range, as well as all General MIDI Level 1 System Messages.

In plain English this means that you could have 16 separate instruments playing at least one note apiece. These could be chosen from a standard set of 128 patches. The patches do not always sound exactly the same, but each number is always assigned to the same type of instrument. For example, patch 0 is always an acoustic piano and 1 is a bright acoustic piano. GS and XG are extensions of GM that specify additional banks of patches along with extra controllers, etc. They were designed by Roland and Yamaha respectively.

What does velocity mean and what is it good for?

Velocity is the force with which a note is struck. This makes a lot of sense when you think of a piano, but in MIDI it is used for all instruments. At its simplest, velocity is how loud a note is played, but modern synthesizers generally go far beyond this. Think of a piano: the harder you play a note, the louder it gets, but it also gets slightly discordant and has more bite. Velocity data is sent along with each note as it is played and cannot be altered in the middle of a note. Most modern keyboards are velocity sensitive. Velocity is generally represented as a number from 1 (soft) to 127 (hard); a Note On with a velocity of 0 is conventionally treated as a Note Off. The exact effect velocity has is undefined, but sound programmers generally try to make it make sense for the instrument.
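
Here is a small Python sketch of how velocity travels with a note. The helper names are hypothetical; the status values 0x90 (Note On) and 0x80 (Note Off) and the use of note number 60 for middle C come from the MIDI 1.0 specification, not from this FAQ.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a Note On message; velocity is fixed for the life of the note."""
    if not (1 <= channel <= 16 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 1-16, note and velocity 0-127")
    return bytes([0x90 | (channel - 1), note, velocity])

def note_off(channel: int, note: int, velocity: int = 64) -> bytes:
    """Build a Note Off message; 64 is used here as a neutral release velocity."""
    if not (1 <= channel <= 16 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 1-16, note and velocity 0-127")
    return bytes([0x80 | (channel - 1), note, velocity])

MIDDLE_C = 60

soft = note_on(1, MIDDLE_C, 20)    # a gentle touch
hard = note_on(1, MIDDLE_C, 120)   # a firm strike
print(soft.hex(" "), "|", hard.hex(" "))
print(note_off(1, MIDDLE_C).hex(" "))
```

The only difference between the soft and hard note is the third byte, and there is no message that changes it once the note has started.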

What is aftertouch?

Aftertouch is the name given to pressure sensitivity on a keyboard. It differs from velocity in that it measures additional pressure applied after the key is already down. Many of the more expensive keyboards have aftertouch. Generally it is tied to some parameter that makes the sound harsher (though sometimes the opposite). Say you have a cello patch: by leaning harder on a key that is already down, the note might get louder and harsher, thereby simulating bowing. Aftertouch comes in two varieties, channel aftertouch and polyphonic aftertouch. These are sometimes also known as "channel pressure" and "key pressure" respectively. Channel aftertouch is by far the most common in keyboards due to its simpler (and therefore cheaper) design. With it, the aftertouch is global to the MIDI channel, so leaning on any note causes an acoustic change in all notes. Polyphonic aftertouch means that each note can have its own separate aftertouch. Keyboards with polyphonic aftertouch are extremely rare and generally valued.
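
The difference between the two varieties shows up directly in the messages. The Python sketch below uses made-up helper names; the status values 0xD0 (Channel Pressure) and 0xA0 (Polyphonic Key Pressure) are taken from the MIDI 1.0 specification rather than from the text above.

```python
def channel_aftertouch(channel: int, pressure: int) -> bytes:
    """One pressure value for the whole channel: no note number is sent."""
    if not (1 <= channel <= 16 and 0 <= pressure <= 127):
        raise ValueError("channel must be 1-16, pressure 0-127")
    return bytes([0xD0 | (channel - 1), pressure])

def polyphonic_aftertouch(channel: int, note: int, pressure: int) -> bytes:
    """A separate pressure value for one specific note on the channel."""
    if not (1 <= channel <= 16 and 0 <= note <= 127 and 0 <= pressure <= 127):
        raise ValueError("channel must be 1-16, note and pressure 0-127")
    return bytes([0xA0 | (channel - 1), note, pressure])

# Leaning on the keyboard while holding a C major chord (notes 60, 64, 67):
print(channel_aftertouch(1, 90).hex(" "))           # affects all three notes
print(polyphonic_aftertouch(1, 64, 90).hex(" "))    # affects only the E
```

Because the channel version carries no note number, the synthesizer has no way to tell which key was leaned on, which is why every sounding note on that channel responds.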

What are polyphony and voices and how do they relate to what I play?

These terms are thrown around often and vary somewhat in meaning. Polyphony in its strictest sense is the number of sounds that can be played at once. The same number is sometimes given as a count of voices. The confusion arises because many synthesizers use more than one voice per key pressed. For instance, a really thick pad sound might use 4 voices, so every key you press uses up 4 voices of polyphony. Therefore, on a synthesizer described as 64-voice polyphonic, you would in fact be able to play only 16 notes of such an instrument at once.
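
The arithmetic is simple enough to sketch in a couple of lines of Python. The function name is just for illustration, and real synthesizers may steal or share voices in ways this ignores.

```python
def max_simultaneous_notes(total_voices: int, voices_per_note: int) -> int:
    """How many notes fit if every note uses the same number of voices."""
    if voices_per_note < 1:
        raise ValueError("a note needs at least one voice")
    return total_voices // voices_per_note

print(max_simultaneous_notes(64, 1))   # simple one-voice patch
print(max_simultaneous_notes(64, 4))   # thick four-voice pad
```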

What does multitimbral mean?

Multitimbral, simply put, describes the ability to play multiple sounds (timbres) at once. Synthesizers are often described as "16-part multitimbral." This means that they can produce up to 16 different sounds at once. The GM specification requires this, though most modern synthesizers are 16-part multitimbral regardless of whether they conform to GM. Multitimbrality is now often not mentioned explicitly, but is implied when the number of MIDI channels is specified.

What are banks and what does MSB and LSB mean?

Banks, most simply put, are collections of 128 sound presets. Banks are selected using MSB (Most Significant Byte) and LSB (Least Significant Byte) messages. The names are a bit of a misnomer if you know anything about computers. In computers a byte is 8 bits, but in MIDI each of these is a 7-bit number, so its value can range from 0 to 127. Many synthesizers give bank numbers as a single number. The two forms relate in the following way:

Bank Number = MSB * 128 + LSB

So if you divide the bank number by 128, the answer you get is the MSB, the remainder is the LSB. These messages are sent using controller messages 0 and 32, but almost all MIDI software handles that aspect for you.
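
For the curious, here is a minimal Python sketch of that division, along with the two controller messages it leads to. The function names are hypothetical, and the 0xB0 Control Change status value comes from the MIDI 1.0 specification rather than from the text above.

```python
from typing import List, Tuple

def split_bank(bank: int) -> Tuple[int, int]:
    """Split a single bank number into (MSB, LSB), each in the range 0-127."""
    if not 0 <= bank <= 16383:
        raise ValueError("bank must be 0-16383")
    return divmod(bank, 128)

def bank_select_messages(channel: int, bank: int) -> List[bytes]:
    """Build the controller 0 (MSB) and controller 32 (LSB) messages."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1-16")
    msb, lsb = split_bank(bank)
    status = 0xB0 | (channel - 1)
    return [bytes([status, 0, msb]), bytes([status, 32, lsb])]

print(split_bank(4120))
for message in bank_select_messages(1, 4120):
    print(message.hex(" "))
```

Multiplying back, MSB * 128 + LSB returns the original bank number, so nothing is lost in the split.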

NOTE: If you're using MED, be sure to add one to the LSB. If you had to select bank 4120 (MSB: 32, LSB: 24), you would enter "32/25" in the bank field.

What are controllers?

Controllers are a form of MIDI message that sends a control value in the range of 0 to 127 to the synthesizer. Some controllers have a range of 0 to 16383, with the value split into an MSB and an LSB that are sent as two separate controller messages. Controllers are commonly used as a standardized way to drive effects. Some common controllers are:

Number	Effect/Action

0	Bank Select MSB
1	Modulation Wheel MSB
2	Breath Controller
7	Channel Volume
10	Pan ( Left (0) to Center (64) to Right (127) )
11	Expression (Relative Volume in relation to Channel Volume)
32	Bank Select LSB
33	Modulation Wheel LSB
64	Sustain Pedal
66	Sostenuto Pedal
67	Soft Pedal
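
To illustrate how a 0 to 16383 value is broken up, the Python sketch below splits a modulation wheel position across controllers 1 and 33 from the table. The function names are my own, and the 0xB0 Control Change status value is taken from the MIDI 1.0 specification.

```python
from typing import List, Tuple

def split_14bit(value: int) -> Tuple[int, int]:
    """Split a 14-bit value into two 7-bit halves: (MSB, LSB)."""
    if not 0 <= value <= 16383:
        raise ValueError("value must be 0-16383")
    return value >> 7, value & 0x7F

def join_14bit(msb: int, lsb: int) -> int:
    """Recombine two 7-bit halves into the original 14-bit value."""
    return (msb << 7) | lsb

def modulation_messages(channel: int, value: int) -> List[bytes]:
    """Send the wheel position as controller 1 (MSB) then controller 33 (LSB)."""
    if not 1 <= channel <= 16:
        raise ValueError("channel must be 1-16")
    msb, lsb = split_14bit(value)
    status = 0xB0 | (channel - 1)
    return [bytes([status, 1, msb]), bytes([status, 33, lsb])]

print(split_14bit(8192))     # wheel at the halfway point
print(split_14bit(16383))    # wheel fully up
```

A receiver that only understands the coarse value can simply ignore controller 33 and still get a usable 0 to 127 reading from controller 1.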

Intermediate:

What is a "SoundFont" and how do I use it?

Soundfonts were developed by E-mu between 1992 and 1994 for Creative to be used in their SoundBlaster AWE32 card. As the name implies, the general idea behind soundfonts is to make changing instrument data as simple as changing the font in a document. This is done by providing waveform data and information about which ranges of notes and velocities the waveforms are mapped to, as well as some simple effects. Originally the soundfont data was stored in RAM on the sound card itself, but newer cards like the Soundblaster Live and Audigy use the computer's own RAM. This enables much larger and more complicated soundfonts to be used. The advantage of these cards is that the data is handled by the sound card rather than the computer, so very little CPU power is used. This also makes the use of soundfonts independent of application; all you need is something that can send MIDI commands.

The rest of the information I'll provide is based on the Soundblaster Live cards (Audigy should be the same), as they're the most common. When you have one of these sound cards installed there is an extra control panel called "AudioHQ," which allows you to configure various settings. If you open that control panel, you'll see a sub-control panel called "SoundFont"; open this one also. Here you'll see several tabs: "Configure Bank," "Configure Instrument," and "Options." First and foremost, under "Options" you'll see a section called "SoundFont Cache." This section allows you to set the maximum memory that will be used for soundfonts. This amount won't actually be used unless you load enough soundfonts to fill it. It should allow you to set this up to half of your RAM, but just pick a sane number for your purposes. If you generally use 1-2M soundfonts, pick something like 16M. If you use a lot of 32M-64M ones you might want to set it near 200M. Once you have this set, go back to the "Configure Bank" tab.

NOTE: Most SoundBlaster cards will cause a system lockup if you try to use a single soundfont that's over 150M.

Under "Configure Bank" you can load soundfonts into various banks. To do this, select a bank (preferably an empty one) and press the load button, then select a soundfont from your hard drive (generally they have a ".sf2" extension). You can load multiple soundfonts into a bank; the topmost one will always take precedence, but any unfilled patches in it will be filled by those lower on the list. If a patch isn't defined in a given bank but is defined in a lower bank, the lower bank will be used for that patch. For complex merging of multiple banks I'd recommend using Vienna (a free utility) to create a new soundfont file with all the desired patches in it. The remaining tab, "Configure Instrument," can be used to place single instruments in the patch list, but once again I'd recommend using Vienna, as that way you have something easy to reload.
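
The precedence rule for soundfonts stacked in a single bank can be modelled with a short Python sketch. This is only my reading of the behaviour described above, not how the driver is actually implemented, and the soundfont contents and instrument names are invented for illustration.

```python
from typing import Dict, List, Optional

def resolve_patch(stack: List[Dict[int, str]], patch: int) -> Optional[str]:
    """Return the instrument for a patch from a stack of soundfonts.

    The stack is ordered topmost first; each soundfont maps patch
    numbers to instrument names and may leave patches undefined.
    """
    for soundfont in stack:
        if patch in soundfont:
            return soundfont[patch]   # topmost definition wins
    return None                       # no soundfont in the stack defines it

# Hypothetical contents: a small piano set loaded on top of a fuller set.
piano_set = {0: "Sampled Grand"}
full_set = {0: "Basic Piano", 1: "Bright Piano", 40: "Violin"}
stack = [piano_set, full_set]

print(resolve_patch(stack, 0))    # taken from the topmost soundfont
print(resolve_patch(stack, 40))   # filled in from the one below
print(resolve_patch(stack, 99))   # defined nowhere
```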

Using soundfonts is fairly simple. If you have a Soundblaster Live, just select the MIDI device "A: SB Live! MIDI Synth." There is also a B device that gives you another 16 channels, for 32 in total. Once you have set the device, select the bank that you loaded the soundfont into. Here there is a minor bit of confusion, as the card uses only the MSB (Most Significant Byte, though that is a bit of a misnomer) to set the bank. Depending on the application, the MSB may or may not be a separate setting. If there is only a single setting for "Bank," you should likely use n*128, where n is the bank number (e.g. bank 10 would be "1280"). If you're using MED then, due to some peculiarities, it should be entered as "n/1" where n is the bank number (e.g. bank 10 would be "10/1", but bank 0 would simply be "1"). A few programs take this oddness into account (like Cakewalk/Sonar if set up right), in which case it would simply be "10". In the worst case you can also send this number using controller 0 if there is no other means. After this, all you need to do is select a patch and play. As a side note, MED requires you to add 1 to the patch number that you see anywhere else to get the appropriate sound.
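
As a rough Python sketch of the two approaches, the first function below computes the value for a single "Bank" field and the second builds the worst-case messages by hand: controller 0 followed by a patch change. The function names are hypothetical, the 0xB0 and 0xC0 status values come from the MIDI 1.0 specification, and the MED-specific entry format is deliberately left out.

```python
from typing import List

def single_field_bank(bank: int) -> int:
    """Value for a single Bank field when only the MSB is used (n * 128)."""
    if not 0 <= bank <= 127:
        raise ValueError("bank must be 0-127")
    return bank * 128

def select_sound(channel: int, bank: int, patch: int) -> List[bytes]:
    """Controller 0 with the bank number, then a patch change (0-based patch)."""
    if not (1 <= channel <= 16 and 0 <= bank <= 127 and 0 <= patch <= 127):
        raise ValueError("channel must be 1-16, bank and patch 0-127")
    return [
        bytes([0xB0 | (channel - 1), 0, bank]),   # bank select MSB
        bytes([0xC0 | (channel - 1), patch]),     # patch change
    ]

print(single_field_bank(10))
for message in select_sound(1, 10, 0):
    print(message.hex(" "))
```

The bank message is built first here because a synthesizer generally applies a new bank only when the next patch change arrives.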

Advanced:

What are the specifics of the GM standard?

(Answer soon to come.)

What are the specifics of the GS standard?

(Answer soon to come.)

What are the specifics of the XG standard?

(Answer soon to come.)

Reference:

Links to other web sites:

Introduction to MIDI - A brief introduction to MIDI taken from a magazine article.
MIDI Manufacturers Association - The website for the MIDI Manufacturers Association. Lots of very in depth information about various MIDI standards.