Describing sound can be difficult, to say the least, especially as one musician’s “warm” is another’s “muddy”. Having a handy list of words to describe music will give your mixing engineer a clearer picture of what you’d like to change when you write your mix feedback notes.
Subjectivity aside, here’s my list of common descriptive audio words and their meanings, which will help you write more comprehensive mix feedback notes when you next sit down to go through your mix revisions with your band members.
Using the right descriptive terminology will save you time, revision rounds and ultimately money, every time you work with an online mixing service.
Any good mix engineer should be able to understand your notes if you use these words to describe music and sound, and can also refer back to the reference tracks you gave at the start of the project.
I find it really helpful when my clients provide audio examples to illustrate what they mean in their revision notes.
They usually provide a YouTube link and give me a timestamp so I can actually hear what they want.
My last client did this successfully with both a particular distorted vocal sound and a reverse reverb effect they wanted in the mix. This saves loads of time going backwards and forwards with emails, so I highly recommend it.
Audio Glossary With Words That Describe Music & Sound For Musicians:
Aggressive – Edgy and pushy sounding, with a bright frequency response, “in your face”.
Airy – Spacious. Open. Instruments sound like they are in a large reflective space.
Ambience – Impression of an acoustic space such as a large hall or studio room where a recording was made.
Attack – The leading edge of a note or percussive transient (snare drum hit). The immediate starting point of the note.
Balance – The mix itself. The relative volume balance of each aspect or element of the mix. Also, stereo balance. The relative level of the left and right stereo channels.
Bass – Are you referring to the bass guitar or the bass frequencies of the mix? Be specific for your mix engineer. The bass audio frequencies sit around 60Hz to 250Hz.
Blanketed – Muffled, not much high frequency, as if a blanket were put over the speakers.
Bloated – Excessive mid bass around 250 Hz. Too much low end, low frequency resonances, thick sounding. See Tubby and Muddy.
Blurred – Smeared, bad transient response. Vague stereo imaging, not focused. Soft sounding.
Body – Fullness, warmth, with particular emphasis on upper bass. Opposite of thin.
Boomy – Excessive bass around 125 Hz. Poorly controlled low frequencies or low frequency resonances. Too much low end.
Boxy – Sounds as if the music or instrument were recorded in a box. Why your kick sounds like a beach ball. Sometimes an emphasis around 300 to 500 Hz.
Breathy – Audible breath sounds in instruments like flute or sax. Good response in the upper mids or highs. Audible breaths in the vocals, either for mood (good) or from over-compression (bad).
Bright – High end over exaggerated. A sound that emphasizes the upper midrange/lower treble.
Brilliance – The 6kHz to 16kHz range controls clarity. Too much emphasis in this range can produce sibilance (“ess” sound) on the vocals.
Cheap – Too much emphasis in the 800Hz range sounding like a budget boom box or hifi system.
Chesty – A bump in the low frequency response around 125 to 250 Hz. Proximity effect. Close, intimate sounding.
Clear – See Transparent.
Congested – Smeared, confused, muddy and flat; lacking focus. See Blurred.
Coloured – Having a sound that is not true to life.
Cool – Lacking body and warmth, thin.
Crisp – Extended high frequency response, especially with cymbals or snare.
Dark – An overall tonal balance that lacks high end above about 5kHz. Opposite of bright. Fewer high frequencies.
Decay – The fadeout of a note.
Definition – The degree of distinctness of a sound. Focus, clarity.
Depth – A sense of distance (front to back) of different instruments.
Detail – Finer aspects of a sound, often the most delicate.
Detailed – Easy to hear tiny details in the music. Good high frequency response, sharp transients.
Dry – Lack of reverb or delay as if in a non-reflective space. Opposite of Wet.
Dull – See Dark.
Dynamic – Changes in volume that generate energy and excitement.
Edgy – Too much high frequency response. Trebly. Distorted. See Aggressive.
Fat – See Full and Warm.
Flabby – Loose, out of control low end.
Focus – A strong, precise sense of image and placement.
Forward(ness) – Similar to an aggressive sound, a sense of the music being projected in front of the speakers and of being forced upon the listener. Opposite to “Laid-back”.
Full – Good low frequency response, with adequate levels around 100 to 300 Hz. Male voices are full around 125 Hz; female voices are full around 250 Hz.
Gentle – Opposite of edgy. The highs are not exaggerated, or may even be weak.
Grungy – Lots of saturation or distortion.
Hard – Too much upper midrange, usually around 3 kHz. Or, good transient response, as if the sound is hitting you hard. Unpleasant, forward, aggressive sound.
Harsh – Grating, abrasive. Hurts to listen to. Too much upper midrange. Peaks in the frequency response between 2 and 6 kHz.
Highs – The audio frequencies above about 6000 Hz.
High Midrange (High Mids, Upper Mids) – The audio frequencies between about 2kHz and 6kHz.
Hollow – Missing mid frequencies, lacks warmth.
Honky – Like cupping your hands around your mouth.
Imaging – The sense that a voice or instrument is in a particular place in the room.
Laid-back – Relaxed, distant-sounding. Opposite of Forward.
Low Level Detail – The quietest sounds in a recording.
Low Midrange (Low Mids) – The audio frequencies between about 250Hz and 2000Hz.
Lush – Very Rich/Full, inviting.
Mellow – Reduced high frequencies, not Edgy.
Midrange (Mids) – The audio frequencies between about 250 Hz and 6000 Hz.
Muddy – Not clear. Too much in the 200Hz area, uncontrolled low end.
Muffled – Dull. Sounds like it is covered with a blanket.
Musical (or musicality) – A sense of cohesion in the sound.
Nasal – Honky, like singing through your nose. A bump in the response around 1 kHz to 1.5 kHz.
Naturalness – Realism.
Open – Sound which has height and “air”, relates to clean upper midrange and treble.
Pace – A strong sense of timing and beat.
Piercing – Strident, hard on the ears, screechy.
Presence Range – The range between 4kHz and 6kHz, responsible for the clarity and definition of voices and instruments. Without it, things sound dull, but with too much the mix can sound thin.
Punchy – Good song dynamics. Good transient response, with a strong impact.
Rich – See Full.
Round – High frequency dip. Not edgy. Smooth.
Saturation – Gentle distortion.
Shrill – Strident, steely. High frequency edge.
Sibilant (or Sibilance) – “Essy”, exaggerated “s” or “sh” sounds in vocals.
Sizzly – See Sibilant. Also, too many high frequencies on cymbals.
Smooth – Easy on the ears, not harsh. Flat frequency response, especially in the midrange.
Soundstage – The area between two speakers that appears to the listener to be occupied by sonic images. Like a real stage, a soundstage should have width, depth, and height.
Spacious – Conveying a sense of space, ambience, or room around the instruments. Stereo reverb.
Steely – Emphasized upper mids around 3 to 6 kHz. Peaky, non flat high frequency response. See Harsh, Edgy.
Sub–Bass – The very low audio frequencies between about 20Hz and 60Hz.
Sweet – Not piercing. Delicate. Flat high frequency response, low distortion. Lack of peaks in the response. Highs are extended to 15 or 20 kHz, but they are not bumped up. Often used when referring to cymbals, percussion, strings, and sibilant sounds.
Telephone Like – See Tinny.
Texture – A perceptible pattern or structure in reproduced sound.
Thick – A lack of articulation and clarity in the bass.
Thin – Bass light. Lacks low end, often in the 250Hz range.
Tight – Good low frequency transient response and detail.
Timbre – The tonal character of an instrument
Timing – A sense of precision in tempo.
Tinny – Narrowband, weak lows, peaky mids. The music sounds like it is coming through a telephone or tin can.
Transient – The leading edge of a percussive sound. Good transient response makes the sound more live and realistic.
Transparent – Easy to hear into the music, detailed, clear, not muddy. Wide flat frequency response, sharp time response, very low distortion and noise.
Tubby – Having low frequency resonances as if you’re singing in a bathtub. See Bloated.
Upper Midrange (Upper Mids, High Mids) – The audio frequencies between 2 kHz and 6 kHz.
Veiled – Like a silk scarf is over the speakers. Slight noise or distortion or slightly weak high frequencies. Loss of detail.
Warm – Good bass, adequate low frequencies. Not thin. Also too much bass or mid bass. Also, pleasantly spacious. Also see Rich, Round.
Wet – A reverberant sound, something with decay. Opposite of Dry.
Weighty – Good low frequency response below about 50 Hz. A sense of substance and foundation produced by deep, controlled bass.
Woolly – Loose, flabby, bloated bass.
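If you like to jot down frequencies in your notes, the frequency ranges named in the glossary above (Sub-Bass, Bass, Low Midrange, High Midrange, Highs) can be sketched as a simple lookup. This is only a rough illustration: the function names `band_for` and `describe` are made up for this example, the ranges are the approximate ones from the glossary, and treating each range's lower edge as belonging to that band is my own assumption.

```python
# Approximate band edges in Hz, taken from the glossary entries above.
# Assumption for illustration: the lower edge of each range is inclusive.
BANDS = [
    ("Sub-Bass", 20.0, 60.0),
    ("Bass", 60.0, 250.0),
    ("Low Midrange", 250.0, 2000.0),
    ("High Midrange", 2000.0, 6000.0),
    ("Highs", 6000.0, float("inf")),
]


def band_for(freq_hz: float) -> str:
    """Return the name of the glossary band that contains freq_hz."""
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    # Anything under 20 Hz sits below the ranges the glossary names.
    return "Below Sub-Bass"


def describe(term: str, freq_hz: float) -> str:
    """Build a short note line pairing a descriptive word with its band."""
    return f"{term}: around {freq_hz:g} Hz ({band_for(freq_hz)})"


print(describe("Boomy", 125))   # Boomy: around 125 Hz (Bass)
print(describe("Harsh", 3000))  # Harsh: around 3000 Hz (High Midrange)
```

A line like “Boomy: around 125 Hz (Bass)” in your revision notes tells your mix engineer both what you hear and roughly where to look.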
I’ve put the glossary together in a downloadable, printable reference sheet HERE
Have I missed anything?
Is there a term that you use regularly that I can add to the list? I’d love to know!
Coming next will be an article called “How to get the best from your revision rounds when using an online mixing service”.
Some music mixing services offer only a limited number of revision rounds, so I’ll be sharing some tips and “prompting” questions to ask yourself as you go through a mix with your band members. They’ll help you make more effective notes that get the results you want, quickly.
So if you’re interested in getting notified when that’s published, consider signing up to my email list and you’ll be the first to know.