Unit 4: Assessment of Speech

4.1 Respiration and Phonation: Pre-requisites, process, types and need for assessment

4.2 Basics of Articulation and phonology (active and passive articulators; classification of vowels and consonants; assessment of articulation)

4.3 Suprasegmental aspects of speech and its assessment

4.4 Milestones of speech development in typically developing children

4.5 Speech Intelligibility: concept, factors & assessment














4.1 Respiration and Phonation: Pre-requisites, process, types and need for assessment


Speech production has two basic components: initiation and modulation. Without these components there is simply no speech.

Initiation is a movement of a speech organ to create an outgoing or ingoing flow of air through the vocal tract. The main initiation organs are the lungs, which work like a pair of bellows. When muscles are used to decrease the lung volume, the air pressure inside the lungs becomes higher than the air pressure outside the body, causing an outgoing airstream through the vocal tract. When muscles are used to increase the lung volume, the air pressure inside the lungs becomes lower than outside the body, causing an ingoing airstream through the vocal tract.

Airstreams can also be initiated by a piston-like movement of the larynx or the tongue: (1) Close the opening between the vocal cords and move the larynx up or down like a piston, or (2) close the opening between the back of the tongue and the back of the roof of the mouth and move the tongue backwards or forwards. The former type of initiation creates implosives and ejectives, while the latter type creates clicks.

Modulation is an organic posture or movement of a speech organ that changes the airstream in such a way that a sound is created. There are three types of modulation: phonation, nasality, and articulation.

Speech is produced by bringing air from the lungs to the larynx (respiration), where the vocal folds may be held open to allow the air to pass through or may vibrate to make a sound (phonation). The airflow from the lungs is then shaped by the articulators in the mouth and nose (articulation).

The field of phonetics studies the sounds of human speech.  When we study speech sounds we can consider them from two angles.  Acoustic phonetics,  in addition to being part of linguistics, is also a branch of physics.  It’s concerned with the physical, acoustic properties of the sound waves that we produce.  We’ll talk some about the acoustics of speech sounds, but we’re primarily interested in articulatory phonetics, that is, how we humans use our bodies to produce speech sounds. Producing speech needs three mechanisms.

The first is a source of energy.  Anything that makes a sound needs a source of energy.  For human speech sounds, the air flowing from our lungs provides energy.

The second is a source of the sound:  air flowing from the lungs arrives at the larynx. Put your hand on the front of your throat and gently feel the bony part under your skin.  That’s the front of your larynx. It’s not actually made of bone; it’s cartilage and muscle. 

[Figure: the voice box (larynx) and vocal folds. Image sources: MedlinePlus Medical Encyclopedia, "Voice box"; Dysphonia International, "How Your Voice Works".]

What you see here is that the opening of the larynx can be covered by two triangle-shaped folds of tissue.  These are often called “vocal cords” but they’re not really like cords or strings.  A better name for them is vocal folds.

The opening between the vocal folds is called the glottis.

We can control our vocal folds to make a sound.  I want you to try this out so take a moment and close your door or make sure there’s no one around that you might disturb.

First I want you to say the word uh-oh. Now say it again, but stop half-way through, Uh-. When you do that, you’ve closed your vocal folds by bringing them together. This stops the air flowing through your vocal tract.  That little silence in the middle of uh-oh is called a glottal stop because the air is stopped completely when the vocal folds close off the glottis.

Now I want you to open your mouth and breathe out quietly, haaaaaaah. When you do this, your vocal folds are open and the air is passing freely through the glottis.

Now breathe out again and say aaah, as if the doctor is looking down your throat.  To make that aaaah sound, you’re holding your vocal folds close together and vibrating them rapidly.

When we speak, we make some sounds with vocal folds open, and some with vocal folds vibrating.  Put your hand on the front of your larynx again and make a long SSSSS sound.  Now switch and make a ZZZZZ sound. You can feel your larynx vibrate on ZZZZZ but not on SSSSS.  That’s because [s] is a voiceless sound, made with the vocal folds held open, and [z] is a voiced sound, where we vibrate the vocal folds.  Do it again and feel the difference between voiced and voiceless.

Now take your hand off your larynx and plug your ears and make the two sounds again with your ears plugged. You can hear the difference between voiceless and voiced sounds inside your head.

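I said at the beginning that there are three crucial mechanisms involved in producing speech, and so far we’ve looked at only two: the source of energy (air from the lungs) and the source of sound (the vocal folds in the larynx). The third is the filter: the sound from the larynx is shaped by the articulators in the oral and nasal cavities.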

The oral cavity is the space in your mouth. The nasal cavity, obviously, is the space inside and behind your nose. And of course, we use our tongues, lips, teeth and jaws to articulate speech as well.  In the next unit, we’ll look in more detail at how we use our articulators.

So to sum up, the three mechanisms that we use to produce speech are respiration at the lungs, phonation at the larynx, and articulation in the mouth and nose.

Phonation Types

·      In whisper phonation, the folds are tensed and rigid, but held slightly apart. The rigidity prevents the folds from vibrating, while the partially opened glottis forms a narrow opening which causes turbulence.

·      In breathy phonation, the folds are tensed appropriately for vibration but not fully approximated so that complete closures do not occur. This has a number of consequences: firstly that air flow continues throughout the cycle which can lead to turbulence at the glottis, secondly that the closures are less sharp, and thirdly that the vocal folds remain open for a longer portion of the cycle.

·      In creaky phonation, in contrast, the vocal folds are lax but tightly approximated, which can lead to cycles in which the folds are closed for a longer proportion of the time and which are irregular in duration. Creaky voice is commonly found at the bottom of a speaker’s pitch range, when the folds are slack anyway. A common form of creaky voice is called diplophonia, where long and short cycles alternate.

·      In falsetto phonation, the vocal folds are extremely tense and are held in such a way that only the internal edges of the vocal folds are able to vibrate. This means that the amplitude of vibration is small and the fundamental frequency is high.


Assessing Resonance

Assess resonance with nasal-loaded sentences, nose pinch while reading non-nasal sentences, and the modified tongue anchor test. Complete an aerodynamic assessment, including maximum phonation time, subglottal pressure, and airflow.

Include an oral mechanism evaluation and measure diadochokinetic rate.

How to Assess Resonance

·       First, complete a case history

·       Oral Motor Evaluation. Observe strength, range of motion, and symmetry

·       Mandible (CN V): at rest, open (pressure), close (pressure), lateralization, protrusion, retraction

·       Lips (CN VII): at rest, protrusion, retraction, repetitive protrude/retract, puff cheeks, strength, sensitivity (CN IX)

·       Tongue (CN XII): at rest, protrusion (strength), stick up (strength), stick down (strength), lateralization (strength), retraction (CN V, XII), strength, lick teeth, lick lips

·       Velum: at rest (CN IX), prolonged “ah” (CN X), repetitive “ah” (CN X)

·       Reflexes (CN IX, X): gag, faucial arches

·       Diadochokinetic Rate (Complete three trials of each of the following): “puh puh puh” (CN VII), “kuh kuh kuh” (CN XII), “tuh tuh tuh” (CN XII), “puh tuh kuh”

·       Additional observations: dentition, oral mucosa, raise eyebrows (CN VII), dysarthria, breath support, vocal quality, resonance, volitional cough, volitional throat clearing, phonation time

·       Hypernasality. If the patient has hypernasality, you will hear nasal sounds in their repetitions. “Repeat after me…”

·       ooo

·       eee

·       sss

·       shh

·       Papa Bear blows bubbles.

·       Give Gary a gift of cookies and coffee.

·       Please pay in part for Bob’s burger.

·       Nose pinch. If the patient has hypernasality, you will hear a change in their productions whenever you pinch their nose while phonating.

“Repeat after me again, but this time I am going to pinch your nose while you speak.”

·       ooo

·       eee

·       sss

·       shh

·       Papa Bear blows bubbles.

·       Give Gary a gift of cookies and coffee.

·       Please pay in part for Bob’s burger.

·       Hyponasality. If the patient is hyponasal, then the nasal consonants will sound non-nasal. “Repeat after me…”

·       May my mom come in?

·       Nina’s new Nikes never came.

·       The mooing in the morning.

·       Nasal air emission. Place a tissue or dental mirror under the patient’s nose. If the patient has emissions, the tissue will move or the dental mirror will fog up. “Repeat after me.”

·       Peppy puppies play ball.

·       People go buy groceries.

·       Curiosity killed the cat.

·       Modified tongue anchor test. Press cheeks lightly. Cheeks will remain puffed given normal velopharyngeal functioning. “Stick your tongue out, then puff out your cheeks around your tongue.”

* Refer to an otolaryngologist as needed to assess and treat any underlying structural issues.


A voice disorder occurs when voice quality, pitch, and loudness differ or are inappropriate for an individual’s age, gender, cultural background, or geographic location (Aronson & Bless, 2009; Boone et al., 2010; Lee et al., 2004). A voice disorder is also present when an individual expresses concern about having an abnormal voice that does not meet daily needs, even if others do not perceive it as different or deviant (American Speech-Language-Hearing Association [ASHA], 1993; Colton & Casper, 2011; Stemple et al., 2010; Verdolini & Ramig, 2001).

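For the purposes of this document, voice disorders are categorized as follows:

·      Organic: voice disorders that are physiological in nature, either structural (physical changes in the voice mechanism) or neurogenic (problems with the nervous system innervation of the larynx)

·      Functional: voice disorders that result from inefficient use of the vocal mechanism when the physical structure is normal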

Voice quality can also be affected when psychological stressors lead to habitual, maladaptive aphonia or dysphonia. The resulting voice disorders are referred to as psychogenic voice disorders or psychogenic conversion aphonia/dysphonia (Stemple et al., 2010). These voice disorders are rare. Speech-language pathologists (SLPs) may refer individuals suspected of having a psychogenic voice disorder to other appropriate professionals (e.g., psychologist and/or psychiatrist) for diagnosis and may collaborate in subsequent behavioral treatment.

The complementary relationships among these organic, functional, and psychogenic influences ensure that many voice disorders will have contributions from more than one etiologic factor (Stemple et al., 2014; Verdolini et al., 2006). For example, vocal fold nodules may result from behavioral voice misuse (functional etiology). However, the voice misuse results in repeated trauma to the vocal folds, which may then lead to structural (organic) changes to the vocal fold tissue.

SLPs may also be involved in the assessment and treatment of disorders that affect the laryngeal mechanism (i.e., the aerodigestive tract) and that are not classified as voice disorders, such as the following:




     The segmental aspects of speech include

·        Respiration

·        Phonation

·        Articulation

·        Resonation


·        Respiration:

Evaluation of respiration involves assessment of the patient’s ability to control the respiratory mechanism for speaking as well as for vegetative purposes. The general diagnostic question that needs to be addressed is “Does this patient have sufficient air supply and neuromuscular control of the respiratory mechanism to communicate?”



Observation of the patient’s use of respiratory system


An in-depth respiratory diagnosis is not required for the majority of patients with voice disorders. Often, the question of whether the patient is using the respiratory system efficiently enough for functional communication can be answered through careful observation while he or she performs the following four tasks:


1.     Read aloud the standard paragraph.

               For example, the Rainbow Passage.

a)     In persons with normal voice, the ends of phrases or sentences coincide with the end of exhalation.

b)    Some patients with voice disorders attempt to continue speaking past the point where there is sufficient airflow for efficient phonation, which results in an increase in laryngeal tension.


2.     Perform a task of sustained vowel production.

           The patient should be able to produce an adequate amount of air and sustain phonation long enough to communicate efficiently.


3.     Perform a task of sustained /s/ and /z/ productions.

4.     Perform a task of endurance of sustained speech production.
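Tasks 2 and 3 above are usually reduced to two numbers: the maximum phonation time and the s/z ratio. The following is a minimal Python sketch, assuming each task is timed over several trials and the longest trial is used. The function names, the sample timings, and the 1.4 cutoff (a commonly cited screening value that is not stated in these notes) are illustrative assumptions, not a clinical standard.

```python
def longest_trial(durations_s):
    """Return the longest of several timed trials, in seconds."""
    if not durations_s:
        raise ValueError("at least one trial is required")
    return max(durations_s)


def s_z_ratio(s_trials, z_trials):
    """Longest sustained /s/ divided by longest sustained /z/."""
    z = longest_trial(z_trials)
    if z <= 0:
        raise ValueError("/z/ duration must be positive")
    return longest_trial(s_trials) / z


# Hypothetical timings in seconds, for illustration only.
s_trials = [18.0, 20.0, 19.0]
z_trials = [12.0, 10.0, 11.5]

ratio = s_z_ratio(s_trials, z_trials)
ASSUMED_CUTOFF = 1.4  # assumption: commonly cited screening value
flag = ratio > ASSUMED_CUTOFF
print(f"s/z ratio = {ratio:.2f}, above assumed cutoff: {flag}")
```

A ratio near 1.0 means /s/ and /z/ are sustained for about the same time; a markedly longer /s/ suggests that the problem lies at the vocal folds rather than in breath supply.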

Respiration should be evaluated during quiet breathing and a few non-speech activities.

Breathing posture should be observed during quiet breathing. If the posture is not normal, note whether the patient is slouched in the chair or bent forward or to the side, whether the patient tends to drift toward an abnormal posture in a chair or wheelchair, and whether additional effort or assistance is required to return to a normal posture. Note also whether the head drops forward or rests on the chest, and whether the patient is braced in a wheelchair to maintain a normal posture. All of these abnormal postures reduce respiratory support for speech.

Information should be obtained about insufficient breath at rest, during physical exertion, or during speech (i.e., whether the breathing is rapid, shallow, or labored).

The pattern of breathing should be taken into consideration, and adequate thoracic and abdominal movement should be checked.

Note whether secondary movements such as shoulder movements or neck extension are present.

Irregularity in breathing rate should also be taken into consideration.


·        Phonation:


Overview of assessment


History of client


The client’s history should be taken through a written case history, an interview, and information from other professionals.


Contributing factors:


Environmental and behavioral factors; medical and neurological factors; motivation and concern.



Assessment of voice:


Pitch: The person’s pitch should be checked to see whether it is high, low, or normal. Pitch breaks and inflections should be evaluated.

Loudness: It is important to assess whether the voice is too loud, too soft, or normal. Aphonia (intermittent or continuous) and inflection in the voice should also be evaluated.

Quality: Voice quality should be assessed, that is, whether the voice is normal, harsh, breathy, or hoarse.





Serial tasks

Oral reading

Speech sampling

Maximum phonation duration

S/Z ratio

Velopharyngeal function

Stimulability of improved voice

Use of instrumentation

Oro-facial examination

Hearing assessment

Determination of diagnosis

Providing information [Written report, interview etc.]




A screen for voice disorders can be accomplished with a few quick and easy tasks. For example, have the client imitate words or phrases, count, recite the alphabet, read a short passage or talk conversationally. Wilson (1987) recommended these four steps to screen children for voice disorders:

1.     Count from 1 to 10.

2.     Read orally for one minute.

3.     Produce continuous speech for one minute.

4.     Prolong the following vowels for five seconds each:  /a/, /i/, /u/ and /ae/.

This sample is then evaluated according to the screening guidelines of the Buffalo voice profile system, developed by D.K. Wilson (1987).

The Buffalo voice profile system:

Author: D.K Wilson

This system provides criteria and a profile for rating the parameters of voice. The Buffalo voice profile evaluates 12 parameters of voice, of which four are the main focus of assessment: laryngeal tone, pitch, loudness, and nasal resonance. Variation from normal is classified as slight, moderate, or severe.

The screening procedure from the Boone Voice Program for Children (Boone, 1986) is also useful. Boone uses three-point rating scales to evaluate the basic parameters of voice. If any response is not scored as normal, the client fails the screen and is referred for a complete voice evaluation.

On the three-point rating scale, pitch, loudness, quality, nasal resonance, and oral resonance are each rated as –ve, N, or +ve, where –ve is too low, N is normal, and +ve is too high.
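The pass/fail rule of the three-point screen can be sketched in a few lines of Python. This is only an illustration of the rule as described above (any rating other than normal fails the screen); the category keys, the "-" / "N" / "+" symbols, and the function name are assumptions and do not come from the published screening form.

```python
# Categories rated on the three-point scale described above.
CATEGORIES = ("pitch", "loudness", "quality", "nasal resonance", "oral resonance")

# Assumed symbols: "-" = too low, "N" = normal, "+" = too high.
VALID_RATINGS = {"-", "N", "+"}


def passes_screen(ratings):
    """Return True only if every category is rated normal ("N")."""
    for category in CATEGORIES:
        if ratings.get(category) not in VALID_RATINGS:
            raise ValueError(f"missing or invalid rating for {category!r}")
    return all(ratings[category] == "N" for category in CATEGORIES)


# Hypothetical child: everything normal except loudness, which is too low.
example = {
    "pitch": "N",
    "loudness": "-",
    "quality": "N",
    "nasal resonance": "N",
    "oral resonance": "N",
}

if passes_screen(example):
    print("Passed the screen")
else:
    print("Failed the screen: refer for a complete voice evaluation")
```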


Buffalo voice profile:

1.     Laryngeal tone





2.     Laryngeal tension


Hyper tension

Hypo tension

3.     Vocal abuse


4.     Loudness

Normal/too loud/too soft

5.     Pitch


6.     Vocal inflection


7.     Pitch breaks

None/amount of pitch breaks

8.     Diplophonia


9.     Resonance

Normal/hyper nasal/hypo nasal

10.  Nasal emission


11.   Rate of speech


12. Overall voice efficiency



A seven-point rating scale is used for evaluation:

1-4 slight variation

4-7 moderate variation

7 severe variations


GRBAS rating scale:

Author: Hirano, 1981

This scale is used to rate the various parameters of pathological voices.

1.     It is a 4-point rating scale.

2.     It has 5 parameters: Grade, Roughness, Breathiness, Asthenia, and Strain.

For each dimension a score of 0 to 3 is used, where ‘0’ represents non-hoarse or normal and ‘3’ represents extreme severity. Together these individual ratings combine to form a profile of the patient’s voice quality.
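To make the idea of a profile concrete, here is a small Python sketch that checks that each of the five dimensions has a score from 0 to 3 and joins them into one string. The single-letter keys, the function name, and the output format are illustrative assumptions; the notes do not prescribe a notation.

```python
GRBAS_DIMENSIONS = ("G", "R", "B", "A", "S")  # Grade, Roughness, Breathiness, Asthenia, Strain


def grbas_profile(scores):
    """Validate 0-3 ratings and return a compact profile string."""
    parts = []
    for dimension in GRBAS_DIMENSIONS:
        value = scores[dimension]
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{dimension} must be 0, 1, 2 or 3, got {value!r}")
        parts.append(f"{dimension}{value}")
    return " ".join(parts)


# Hypothetical ratings for one patient, for illustration only.
ratings = {"G": 2, "R": 1, "B": 2, "A": 0, "S": 1}
print(grbas_profile(ratings))
```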




4.2 Basics of Articulation and phonology (active and passive articulators; classification of vowels and consonants; assessment of articulation)


Speech sound disorders is an umbrella term referring to any difficulty or combination of difficulties with perception, motor production, or phonological representation of speech sounds and speech segments—including phonotactic rules governing permissible speech sound sequences in a language.


Speech Sound Disorders Umbrella

An articulation disorder refers to problems making sounds. Young children often make speech errors. For example, some children have difficulty making the “r” sound and substitute a “w” for it (e.g., saying “wabbit” for “rabbit”). Speech sounds can be substituted, changed, added, or deleted.


Age (years) | 50% of children have sound mastered | 90% of children have sound mastered

- | p, m, h, n, w, b | -

- | k, g, d, t, ng, f, y | -

- | r, l, s, ch, sh, z | p, m, h, n, w, b

- | j, v, voiceless th | k, g, d, f, y

- | voiced th | -

- | - | t, ng, r, l

- | - | ch, sh, j, voiceless th

- | - | s, z, v, voiced th, zh


Place of articulation

The initial consonants of the English words pan /ˈpæn/, tan /ˈtæn/, and can /ˈkæn/ are all voiceless stops, but they differ in place of articulation:

The place of articulation of a speech sound is the place in the oro-pharyngeal tract with the strongest constriction during the production of the sound.

To produce /p/, we make a complete closure between the lower lip and the upper lip; to produce /t/, we make a complete closure between the tip or the blade of the tongue and the alveolar ridge; to produce /k/, we make a complete closure between the back of the tongue and the velum.

At each place of articulation, there is a constriction between an active articulator and a passive articulator. The active articulators are the lower lip and the tongue, while the passive articulators are the upper lip, the upper teeth, the roof of the mouth, and the rear wall of the pharynx. Different languages use different numbers of articulators, and the most important ones are presented in the table below.

The active articulator labium (‘lip’ in Latin) is the lower lip; the apex (‘tip’ in Latin) is the tip of the tongue; the lamina (‘blade’ in Latin) is the blade of the tongue; while the dorsum (‘back’ in Latin) is the back of the tongue. Sometimes it is practical to use corona as a common denominator for the apex plus the lamina.

The passive articulator labium is the upper lip; the dentes (‘teeth’ in Latin) are the upper teeth; the alveoli (‘basins’ in Latin) are the alveolar ridge or the gums, minus the back half, which is called the postalveoli; the palatum and the velum are the hard palate and the soft palate, respectively (see FIG. 16). The lower part of the velum is the uvula.

There are adjectives corresponding to the nouns in Table 21: labial, apical, laminal, coronal, dorsal; labial, dental, alveolar, postalveolar, palatal, velar, uvular.

Terms for active and passive articulators are combined to define places of articulation. For example, when the active articulator is apical and the passive articulator is alveolar, the place of articulation is apical-alveolar. Two terms are irregular: we use bilabial instead of «labial-labial» and labiodental instead of «labial-dental». Quite often it is practical to mention only the active or only the passive articulator. For example, alveolar is a common denominator for apical-alveolar and laminal-alveolar. For palatal, velar, and uvular sounds, it is superfluous to mention the active articulator, which in these cases is always the dorsum.
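The naming rule in the paragraph above is mechanical enough to write down as a short Python sketch: join the active and passive terms with a hyphen, apply the two irregular forms, and drop "dorsal" where it is redundant. The function and variable names are illustrative assumptions.

```python
# The two irregular combinations named in the text.
IRREGULAR = {
    ("labial", "labial"): "bilabial",
    ("labial", "dental"): "labiodental",
}

# Passive articulators whose active articulator is always the dorsum.
DORSAL_ONLY = {"palatal", "velar", "uvular"}


def place_of_articulation(active, passive):
    """Combine active and passive articulator adjectives into one term."""
    if (active, passive) in IRREGULAR:
        return IRREGULAR[(active, passive)]
    if active == "dorsal" and passive in DORSAL_ONLY:
        return passive  # mentioning "dorsal" would be superfluous
    return f"{active}-{passive}"


for pair in [("labial", "labial"), ("apical", "alveolar"), ("dorsal", "velar")]:
    print(pair, "->", place_of_articulation(*pair))
```

With the three consonants from the earlier example, /p/ comes out as bilabial, a tongue-tip /t/ as apical-alveolar, and /k/ as velar.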


A phonological disorder involves patterns of sound errors.  For example, a sound produced in the front of the mouth like “t” and “d” is substituted for a sound produced in the back of the mouth like “k” and “g” (e.g. saying “tat” for “cat” or “do” for “go”).  Many phonological process errors are common and appropriate but at a certain age are no longer expected.


Age (years) | Phonological process eliminated | Example

- | velar assimilation | "kack" vs "tack"
- | nasal assimilation | "nunny" vs "bunny"
- | stopping (f and s) | "pun" vs "fun"
- | final consonant deletion | "no" vs "nose"

- | fronting | "tar" vs "car"
- | stopping (z and v) | "doo" vs "zoo"
- | deaffrication | "ship" vs "chip"
- | cluster reduction (without s) | "keen" vs "clean"
- | weak syllable deletion | "nana" vs "banana"

- | stopping (sh, ch, j, th) | "dump" vs "jump"
- | cluster reduction (with s) | "top" vs "stop"

- | gliding | "weg" vs "leg"



The cause of some articulation and phonological disorders is known; however, in most children the cause is unknown.  A number of studies have identified the following risk factors (Campbell et al., 2003; Fox, Dodd, & Howard, 2002; Harrison & McLeod, 2010):

·      Male

·      Pre- and perinatal problems

·      Oral sucking habits (e.g. excessive sucking of pacifiers or thumb)

·      Ear, nose, and throat problems

·      A more reactive temperament

·      Family history of speech and language problems

·      Lack of support for learning in the home

The American Speech Language Hearing Association (ASHA) lists some indicators that a child may need intervention including:

By 3 years cannot:

·      Be understood by family and/or caregivers

·      Correctly produce vowels and such sounds as p, b, m, w in words

·      Repeat when not understood without becoming frustrated

By 4 years cannot:

·      Be understood by individuals with whom they do not associate regularly

·      Be understood by family and/or caregivers

·      Correctly produce t, d, k, g, f

·      Be asked to repeat without becoming sensitive

By 5 years cannot:

·      Be understood in all situations by most listeners

·      Correctly produce most speech sounds

·      Be asked to repeat without exhibiting frustration

Vowels and Consonants

While stops, nasals, fricatives, and liquids are consonants, we are now going to take a look at the vowels. The prototypical consonant has a relatively strong constriction in the oro-pharyngeal tract, while such a constriction is absent in vowels. Furthermore, the prototypical consonant belongs to the margin of a syllable (cf. the next paragraph), while the prototypical vowel belongs to the nucleus of a syllable.

Practically all languages distinguish at least three vowels, while the most common vowel inventory in the world is /i e a o u/, which is found for example in Swahili and Fula. We shall come back to a more detailed description of differences in vowel quality later.

Vowels are traditionally analyzed with a terminology that differs from that used for consonants. Three main dimensions are used in this classification:


  Horizontal tongue position:     front / central / back

  Vertical tongue position:          close / close-mid / open-mid / open

  Lip position:                                  unrounded / rounded


Before explaining how this terminology is used, we present TABLE 23, where some vowel symbols are placed in a chart in accordance with their classification. All cells could have been filled, but we have left out some less commonly used symbols.
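As a rough illustration of the three-dimensional classification, the Python sketch below stores each vowel as a record with a horizontal position, a vertical position, and a lip position. The feature values are the standard IPA chart values for these five symbols; the class and variable names are assumptions, and in many languages /a/ is phonetically closer to central than front.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vowel:
    symbol: str
    horizontal: str  # front / central / back
    vertical: str    # close / close-mid / open-mid / open
    rounded: bool    # lip position

    def describe(self) -> str:
        lips = "rounded" if self.rounded else "unrounded"
        return f"{self.vertical} {self.horizontal} {lips}"


# IPA chart values for the common five-vowel inventory.
INVENTORY = [
    Vowel("i", "front", "close", False),
    Vowel("e", "front", "close-mid", False),
    Vowel("a", "front", "open", False),
    Vowel("o", "back", "close-mid", True),
    Vowel("u", "back", "close", True),
]

for vowel in INVENTORY:
    print(f"/{vowel.symbol}/: {vowel.describe()}")
```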


'Vowel' and 'Consonant' are defined in phonetic and linguistic terms. When we define them in phonetic terms, we do so with reference to their production; when we define them in linguistic terms, we refer to their function in a given language. In phonetic terms, a vowel is a sound for whose production the oral passage is unobstructed, so that the air can flow from the lungs to the lips and beyond without being stopped and without having to squeeze through a narrow constriction, which would cause audible friction.

A consonant, on the other hand, is a sound for whose production the air current is completely stopped, or is forced through a narrow constriction which causes audible friction. Let us look at a few examples. Take the English word 'bar'. The vowel, represented by the letters 'ar', is produced with the mouth wide open and the tongue low in the mouth. The air passage is unobstructed and the air passes into the atmosphere outside without any friction. The sound is therefore called a vowel. The sound represented by the letter 'b' is produced by a complete closure of the oral passage of air at the two lips, followed by the sudden release of the air held behind the closure. Thus, there is an obstruction, for a while, to the flow of air from the lungs. Because of this obstruction the resultant sound is regarded as a consonant.

Take, for instance, the English word 'see'. The sound represented by the letters 'ee' is produced by raising the front of the tongue fairly high, so that the passage between the tongue and the hard palate is fairly narrow but not narrow enough to cause any audible friction. The sound is therefore regarded as a vowel. In the production of the s-sound in the same word, the tip and blade of the tongue rise so high towards the teeth ridge that the passage of air is very narrow and the air passes out with audible friction. Thus the sound is a consonant.


Practical Test of Articulation and Phonology (PTAP)

The PTAP is an easy-to-administer articulation and phonology assessment or screening tool that takes only 2 to 8 minutes to administer. It helps a clinician make a differential diagnosis among articulation delay, phonological disorder, or a combination of the two. The PTAP assesses 23 phonemes and 7 phonological processes with 36 color photos (4 panels of 9 pictures each) that show real-life objects of basic vocabulary words. The protocol has all of the transcription and phonological analysis provided for each word, making the scoring quick and easy. An oral motor exam is also included.

Phonological processes assessed include:

·  Fronting

·  Stopping

·  Cluster reduction

·  Initial consonant deletion

·  Weak consonant deletion

·  Final consonant deletion

·  Syllable reduction

Administration and Scoring
The PTAP is an individually administered, empirically based assessment that can be administered and scored in approximately 5 to 15 minutes. The child’s articulation errors are marked on the transcribed target words on the form. Any errors of phonological processes are recorded in the column of that phonological process on the protocol.

Normal articulation is a series of complex actions. Accurate articulation requires exact placement, sequencing, timing, direction, and force of the articulators. These occur simultaneously with precise airstream alteration, initiation or halting of phonation, and velopharyngeal action.

Articulation problems result from organic or functional etiologies.

The primary purposes of an assessment of articulation and phonological processes include:

·      Describing the articulatory or phonological development and status of the client.

·      Determining whether the individual’s speech deviates sufficiently from normal expectations to warrant concern or intervention.

·      Identifying factors that relate to the presence or maintenance of the speech disorder.

·      Making prognostic judgments about change with and without intervention.

·      Monitoring changes in articulatory and phonological abilities and performance across time. (Adapted from Bernthal and Bankson 1988).


History of the client:


The client’s history should be taken through a written case history, an interview, and information from other professionals.

Information about the client’s hearing ability should be available. The presence of any medical or neurological factors should be identified. The client’s dentition should be checked, along with maturation and motor development.

Assessment procedure for articulation and phonological process include:



Articulation tests

Speech sampling

Stimulability of errors


In the analysis,

Number of errors should be counted. Error types such as substitutions, omissions, distortions, additions should be analyzed directly from the client or from the speech sample. Consistency of errors should be noted. Intelligibility rate of speech should be assessed. Oro-facial examination should be done.

In children with cleft palate, articulation, resonance, and intelligibility should be checked.

Perceptual speech assessment is central to the evaluation of speech outcomes associated with cleft palate and velopharyngeal dysfunction.

In the analysis of articulation disorders the articulatory system should be assessed in the state of rest, during movement and speech.

Face at rest:

-         Symmetry/ Asymmetry.

-         Expressionless/ Mask like/ Unblinking.

-         Abnormal spontaneous involuntary movements.

-         Deviation of lips.


Face at sustained posture (applied to lip rounding, retraction, puffing, and puckering):

-         Strength

-         Speed

-         Range

-         Steadiness

-         Tone

-         Accuracy    


Assessment of articulatory structures:




Lips should be checked for symmetry at rest.

Tongue should be checked for its size, surface and frenulum, tongue thrust.

Jaw should be checked for occlusion, size, and symmetry and for hanging down.

Teeth should be checked for alignment, symmetry, missing teeth and bite.

Hard palate: should be checked for a high arch, symmetry, and any organic pathology.

Soft palate:   symmetry, bifid uvula, absence of uvula and any other organic pathology.




Lips: Check protrusion, retraction, rounding, puffing of the cheeks, and biting of the lower lip.

Tongue: Check upward and downward movements, lateral movements, tongue-to-cheek strength, and lateral movements within the mouth.

Jaw: Check clenching, chewing, lateral movements, and upward and downward movements.

Soft palate: Check upward and downward movements during phonation, and the gag reflex.


Alternate motion rate (AMR):

AMRs, or diadochokinetic rates, are very useful for determining the speed and regularity of movements of the jaw, lips, and anterior and posterior tongue. They also permit assessment of articulatory precision, the adequacy of velopharyngeal closure, and respiratory and phonatory support for sustaining the task. These latter observations are usually secondary. “The primary value of AMRs is for assessing speed and regularity of rapid, repetitive articulatory movements”.

The patient should be instructed to take a breath and repeat “puh-puh-puh” for as long and as steadily as he or she can. This should be followed by a 2 to 3 second example by the clinician. The patient can be told to stop when the sample is sufficient for clinical judgments.

When repetitions of “puh” are completed, the patient should be asked to repeat the task for “tuh” and “kuh”. AMRs for other consonant-vowel (CV) syllables can be pursued if other places and manners of articulation are of interest.
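As an illustration of how the speed and regularity of an AMR trial might be quantified, the following is a minimal Python sketch. It assumes that syllable onset times (in seconds) have already been marked on a recording. The function name `amr_summary`, the onset values, and the use of the coefficient of variation as a regularity index are illustrative assumptions and not part of any standardized procedure.

```python
from statistics import mean, pstdev


def amr_summary(onsets_s):
    """Summarise one AMR trial from syllable onset times given in seconds."""
    if len(onsets_s) < 3:
        raise ValueError("need at least three syllable onsets")
    # Time between consecutive syllable onsets.
    intervals = [b - a for a, b in zip(onsets_s, onsets_s[1:])]
    mean_interval = mean(intervals)
    return {
        # Speed: syllables produced per second.
        "rate_syll_per_s": 1.0 / mean_interval,
        # Regularity: coefficient of variation of the intervals
        # (values near 0 mean very even repetition).
        "interval_cv": pstdev(intervals) / mean_interval,
    }


# Hypothetical onsets for "puh-puh-puh...", marked by hand on a recording.
puh_onsets = [0.0, 0.25, 0.5, 0.75, 1.0]
print(amr_summary(puh_onsets))
```

The same function could be applied to the “tuh” and “kuh” trials so that the three syllables can be compared.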


Sequential motion rate (SMR):

SMR is a measure of ability to move quickly and in proper sequence from one articulatory position to another. Relative to AMRs, sequencing demands for SMRs are heavy; for this reason, “SMRs are particularly useful when Apraxia of speech is suspected”.

The patient should be asked to take a deep breath and repeat “puh-tuh-kuh” over and over again until the clinician tells him or her to stop. This should be followed by a 2 to 3 second example by the clinician. Some people need reinstruction in the sequence, and slow or unison practice is sometimes necessary for the task to be grasped. When the sequence cannot be learned, repetition of “buttercup, buttercup, buttercup….” is acceptable, but the meaningfulness of the word makes it a simpler task than puh-tuh-kuh.

Articulation during speech:

Misarticulation of the phonemes in a word is assessed using standardized test materials such as picture articulation tests and Kannada articulation tests.

Four main types of misarticulation are identified from the tests:


a)     Substitution

b)    Omission

c)     Distortion        

d)    Addition


The misarticulated sound is marked in initial, medial and final position.

Speech sampling is especially important for accurately diagnosing disorders of speech sound production. After obtaining a speech sample, the analysis should focus on the following:


·      Number of errors

·      Error types

·      Consistency of errors between the speech sample and the articulation tests, within the same speech sample, and between different speech samples

·      Correctly produced sounds

·      Intelligibility

·      Speech rate

·      Prosody
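The first few items in this list (number of errors, error types, word position, and correctly produced sounds) can be tallied mechanically once the sample has been scored. The following is a minimal Python sketch under stated assumptions: the record format `(target sound, position, error type or None)` and the function name `summarise_errors` are hypothetical and are not taken from any published test.

```python
from collections import Counter

ERROR_TYPES = {"substitution", "omission", "distortion", "addition"}
POSITIONS = {"initial", "medial", "final"}


def summarise_errors(records):
    """Tally scored productions.

    records: iterable of (target_sound, position, error_type), where
    error_type is None for a correct production.
    """
    by_type, by_position, correct = Counter(), Counter(), set()
    for sound, position, error in records:
        if position not in POSITIONS:
            raise ValueError(f"unknown position: {position}")
        if error is None:
            correct.add(sound)  # produced correctly at least once
            continue
        if error not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {error}")
        by_type[error] += 1
        by_position[position] += 1
    return {
        "n_errors": sum(by_type.values()),
        "by_type": dict(by_type),
        "by_position": dict(by_position),
        "correct_sounds": sorted(correct),
    }


# Hypothetical scored sample.
sample = [
    ("s", "initial", "substitution"),
    ("s", "final", "omission"),
    ("r", "medial", "distortion"),
    ("p", "initial", None),
]
print(summarise_errors(sample))
```

Running the same tally on the articulation test and on different speech samples gives a simple basis for judging the consistency of errors.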


Articulation screening tests:


Frequently, a clinician will do a screening to determine whether a more comprehensive assessment is warranted. Screening procedures are not designed to determine the need for or the direction of therapy, but rather to differentiate the individuals who merit further evaluation from those for whom further evaluation is not indicated. Typical screening situations might include

1.     Screening children at a pre-school or “Kindergarten round up” to determine whether they have age appropriate phonological skills.

2.     Screening children in grade 3 (by which time maturation should have resolved most developmental errors).

3.     Screening college students preparing for occupations, such as teaching or broadcast journalism, which require certain speech performance standards.

4.     Screening the phonological status of referred children and adults for a suspected communication impairment.

In screening, individuals are not identified as candidates for therapy but rather are simply identified as needing further assessment. Instruments used for screening consist of a limited sampling of speech sound productions, which can usually be administered in five minutes or less. Screening measures can be categorized as informal or formal.


Informal screening measures:

Informal screening measures are usually devised by the examiner and are tailored to the population being screened. For example, with a group of kindergarten children, the examiner might ask each child to

1.     State his or her name and address.

2.     Count to ten; name the days of week.

3.     Tell about a television show.


If the subjects are adults, the examiner might ask them to do one or both of the following:

1.     Read sentences designed to elicit frequently misarticulated sounds, such as /s/, /r/, /l/, and /θ/, e.g., “I saw Sally at her seaside house; Rob ran around the orange car.”

2.     Read a passage with a representative sample of English speech sounds such as the “Grandfather passage” or the “Rainbow passage”.

The criterion for failure on an informal screening is usually determined by the examiner. An often-used rule of thumb is “If in doubt, refer”. In other words, if one suspects that the client’s speech sound system is not appropriate for his or her age and/or language community, one should refer for a more complete assessment.

Formal screening measures include published elicitation procedures for which normative data and/or cut-off scores are often available. These formal measures are of three types: (1) tests that are part of a more comprehensive single-word articulation test, (2) tests designed solely for screening phonology, and (3) tests that screen phonology as well as other aspects of language. Tests designed explicitly for screening phonology are most frequently used when screening phonology is the primary goal.

The following are the formal phonology screening tests:-


Templin-Darley Screening Tests:- (Templin and Darley 1969).

This test consists of 50 items from the Templin-Darley Test of Articulation. The intention of this test is to elicit 22 single consonants, 26 consonant clusters, one vowel, and one consonant-vowel combination. Norms and suggested cut-off scores for children aged 3 through 8 years are provided.


Quick Screen of Phonology (QSP) (Bankson and Bernthal, 1990).

This test consists of 28 picture-naming items, with each word assessing sounds in more than one context (usually initial and final). Twenty-three phonemes are screened plus three consonant clusters. These items were selected because of their correlation with the overall norms of the Bankson-Bernthal Test of Phonology. Percentile ranks and standard scores are provided for children ages 3;0 through 7;11 years on the QSP.




Denver Articulation Screening Test (Drumwright 1971).

This instrument was designed specifically for screening phonological status in Anglo, Black, and Mexican-American children. Responses are elicited imitatively. The examiner is asked to judge intelligibility on a 4-point scale, with 1 being “easy to understand” and 4 being “can’t evaluate”. Children are ranked from “normal” to “abnormal”, depending on composite articulation and intelligibility scores.


Screening Deep Test of Articulation (McDonald, 1968).

This test consists of 90 items and is similar in form to the more complete Deep Test of Articulation (McDonald, 1964a). It uses pairs of pictures to elicit ten productions of each of nine commonly misarticulated consonants. This allows for production of specific consonants in a variety of contexts. The normative data provided for children from kindergarten through grade 3 are designed to identify those children unlikely to develop mature articulation without intervention.


Predictive Screening Test of Articulation (Van Riper and Erickson, 1969).

This test was designed not only for screening but also for predicting whether or not first-grade children are likely to correct their speech sound errors without intervention. In other words, it was designed both to indicate the need for additional testing for those who fail to obtain a cut-off score and to allow the examiner to make prognostic statements about the likelihood of self-correction of speech sound errors. Stockman and McDonald (1980) reported that this test may have greater predictive value for those first graders who misarticulate specific consonant sounds since those sounds occur frequently in the test.


Fluharty Speech and Language Screening Test for Preschool Children (Fluharty, 1978).

This test was designed for children, ages 2 through 6 years. The phonology portion of the test uses 15 objects to elicit 19 target sounds. Some stimulus items are designed to assess a single segment; other items assess two sounds. Cut-off scores to indicate the need of further testing are included.


Preschool Language Scale (Zimmerman, Steiner, and Pond, 1979).

This test was designed for children, ages 1 through 7. The phonology portion of the test consists of 20 imitated words that test 18 speech sounds in initial, medial, and final word positions. Expected performance levels for children are provided.


Iowa Pressure Articulation Test

The Iowa Pressure Articulation Test is a subtest of the Templin-Darley Test of Articulation (Templin and Darley, 1969). It consists of 43 words containing pressure consonants. It is a useful assessment tool when velopharyngeal inadequacy is suspected, as these pressure consonants require the build-up of intraoral pressure and, therefore, adequate velopharyngeal function.

Phonological Processing Assessment

Phonological processing is the use of the sounds of one's language (i.e., phonemes) to process spoken and written language (Wagner & Torgesen, 1987). The broad category of phonological processing includes phonological awareness, phonological working memory, and phonological retrieval.

All three components of phonological processing are important for speech production and for the development of spoken and written language skills. Therefore, it is important to assess phonological processing skills and to monitor the spoken and written language development of children with phonological processing difficulties.


4.3 Suprasegmental aspects of speech and its assessment


Speech depends mainly on two aspects: one is segmental and the other is suprasegmental. Both aspects are very important for the correct production of speech.




Segmental aspects of speech include

·      Respiration

·      Phonation

·      Resonation

·      Articulation


1.     Respiration

Normal respiration is important for producing normal speech. While speaking, inhalation is much quicker and somewhat deeper. After each inhalation, the exhalation of air should be controlled very precisely in order to maintain just the correct rate of airflow and the amount of pressure needed to “drive” the speech mechanism.


2.     Phonation:


The larynx, a delicate and very important part of the speech mechanism, is suspended in the neck beneath the hyoid bone (the only bone in the body which does not articulate with any other bone) and above the trachea (or windpipe). The sound of the human voice (phonation) is produced by the paired vocal folds, one on the left side and one on the right side, which lie within the major cartilage of the larynx, the thyroid cartilage.


3.     Resonation:

All voices have sound energy present over a very wide range of frequencies. Normally there are greater amounts of energy present in the fundamental and lower frequency harmonics than in the higher frequencies.

The process of resonation enables us to produce recognizably different vowels and some consonant sounds. By altering configurations of our throat and mouth cavities through movement of tongue, lips, and jaw, we create resonators that will emphasize energy at some frequencies and suppress energy at others.

Above the pharynx and oral cavity is the nasal cavity, a resonating chamber that can also add to (or subtract from) the original character of the glottal tone. In producing the nasal consonants, of course, nasal resonance is an essential and distinguishing feature.


4.     Articulation:


The articulation of speech sounds, the consonants, vowels, and diphthongs that are the basic phonemic elements of our language, demands incredibly intricate coordination of the tongue, lips, mandible, and velum. The movements of these structures occur in synchrony with those of the respiratory and phonatory systems.




Suprasegmentals are characteristics of speech that involve larger units, such as syllables, words, phrases, or sentences. Among the suprasegmentals are stress, intonation, loudness, pitch level, juncture, and speaking rate. Briefly defined, the suprasegmentals, also called prosodies or prosodic features, are properties of speech that have a domain larger than a single segment.


STRESS:  Stress refers to the degree of effort, prominence, or importance given to some part of an utterance.

                             Eg: Be sure to take the RED car


INTONATION:  Intonation is the vocal pitch contour of an utterance, that is, the way in which the fundamental frequency changes from syllable to syllable and even from segment to segment.


LOUDNESS:  Loudness is related to sound intensity or to the amount of vocal effort that a speaker uses.


PITCH LEVEL: Pitch level is the average pitch of a speaker’s voice and relates to the mean of the fundamental frequency of an utterance.


JUNCTURE:  Juncture sometimes called “Vocal punctuation”, is a combination of intonation, pausing and other suprasegmentals to mark special distinctions in speech or to express certain grammatical divisions.

                           Eg: “Let’s eat, Grandma.”


SPEAKING RATE:  The rate of speaking is usually measured in words per second, syllables per second, or phonemes per second. As speaking rate increases, segment durations generally become shorter, with some segments affected more than others.
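The rate measures named above are simple ratios of a unit count to the duration of the sample. A minimal Python sketch follows; the function name `speaking_rates` and the example counts are hypothetical, and the word and syllable counts are assumed to have been obtained from a transcript of the sample.

```python
def speaking_rates(n_words, n_syllables, duration_s):
    """Return speaking rate in words per second and syllables per second."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return {
        "words_per_s": n_words / duration_s,
        "syllables_per_s": n_syllables / duration_s,
    }


# Hypothetical sample: 30 words (45 syllables) spoken in 15 seconds.
print(speaking_rates(30, 45, 15.0))
```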



A written case history is the starting point for understanding clients and their communicative problems. A case history enables the clinician to anticipate those areas that will require assessment.



So far all the sounds we’ve been considering are segments: the individual speech sounds that we represent with IPA symbols. But when we speak, we also include sounds that are above or beyond the level of the segments. This sound information is called prosody, or suprasegmental information, and it makes up the rhythm, timing, meter, and stress of the words and sentences that we speak. The primary pieces of suprasegmental information are the pitch of sounds, the loudness, and the length.

The pitch of a sound is how high or low it is.  We produce high pitched sounds when our vocal folds have a high-frequency vibration, and when our vocal folds vibrate more slowly, the resulting sound is lower in pitch.

Some languages use pitch information to signal changes in word meaning. If a language uses pitch this way, the pitch information is called tone.

Languages also use pitch in another way, not to change word meaning, but to signal information at the level of the discourse, or to signal a speaker’s emotion or attitude. When pitch is used this way, it’s called intonation rather than tone. English uses pitch for intonation — let’s look at some examples.

Sam got an A in Calculus.
Sam got an A in Calculus!
Sam got an A in Calculus?
Sam? got an A? in Calculus?

All of these sentences contain the same words (and the same segments) but if we vary the intonation, we convey something different about the speaker’s attitude towards the sentence’s meaning. Notice that we sometimes use punctuation in our writing to give some clues about a sentence’s prosody.

Another component of suprasegmental information is the length of sounds. Some sounds are longer than others. Compare these two English words: beat, bead. The vowel sound in both words is the high front tense vowel [i]. But in bead, the vowel is a little longer. This is a predictable process in English: vowels get longer when there’s a voiced sound in the coda of the syllable. The diacritic to indicate that a segment is long looks a bit like a colon.

So a sound can change in length as the result of a predictable articulatory process, or, like intonation, length can signal discourse-level information about an utterance. Consider the difference between, That test was easy, and, That test was eeeeeeeeeeeeeaaasyyyyyyyy. Some languages use length contrastively, that is, to change the meaning of a word. In Yapese, a language of the Western Pacific region, making a vowel long leads to a completely different word with a new meaning. In Italian, consonant length can change the meaning of a word, so fato means fate, but fatto means fact.

To sum up, suprasegmental information, also known as prosody, is that sound information that’s above the level of the segment. It consists of pitch, loudness, and length.  Many languages use prosody to provide discourse-level information, and some languages also use prosody to change word meanings.


The supra segmentals include:

·        Stress

·        Rhythm

·        Intonation


·        Stress:



Jones (1956) defines stress as the degree of force with which the sound or a syllable is uttered.

Stress can also be defined from the listener’s point of view and from the speaker’s point of view. From the listener’s point of view, stress can be defined as the perceived loudness of a syllable or word. From the speaker’s point of view, it can be defined in terms of greater muscular effort and comparatively greater force.


o   Types of stress:

In traditional phonetics, stress has frequently been divided into dynamic or expiratory stress and musical or melodic stress.

Jones (1950, 1972) listed four types of stress: level stress, crescendo stress, diminuendo stress, and crescendo-diminuendo stress. All four types have been claimed to exist in the Serbo-Croatian language.

·      Phonemic stress or word level stress: This kind of stress presupposes that the domain of stress is a word.

·      Morphological stress: The position of stress is fixed with regard to a given morpheme.

·      Sentence level stress: When stress functions at the sentence level, it does not change the meaning of any lexical item but increases the relative prominence of one of the lexical items. There are three types of sentence level stress.

1.     Primary stress (non-emphatic): Each sentence automatically has a primary stress. Here, the important syllable or word in the sentence is stressed.

2.     Contrastive stress: This occurs in a sequence of sentences with parallel constituents that are filled with different phonemes.

3.     Emphatic stress: This is used to distinguish a sentence from its negation.


Measurement of stress:


Perceptual:  Perceptually one can do four kinds of stress analysis.

·        Stress is used in place of an unstressed syllable/word,

·        Unstressed for a stressed syllable/word,

·        Equal stress on all syllables, and

·        No stress on any syllable.
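The four perceptual categories listed above can be illustrated by comparing the expected stress pattern of an utterance with the pattern the listener judges was actually produced. The following is a minimal Python sketch under stated assumptions: each pattern is a list with one True/False value per syllable (True meaning stressed), and the function name `classify_stress` is hypothetical.

```python
def classify_stress(target, produced):
    """Compare target and produced stress patterns (one bool per syllable)."""
    if len(target) != len(produced):
        raise ValueError("patterns must have the same number of syllables")
    if list(produced) == list(target):
        return []  # produced pattern matches the target
    if not any(produced):
        return ["no stress on any syllable"]
    if all(produced) and len(produced) > 1:
        return ["equal stress on all syllables"]
    findings = []
    if any(p and not t for t, p in zip(target, produced)):
        findings.append("stress on an unstressed syllable")
    if any(t and not p for t, p in zip(target, produced)):
        findings.append("unstressed for a stressed syllable")
    return findings


# Hypothetical two-syllable word with target stress on the first syllable,
# produced with stress on the second syllable instead.
print(classify_stress([True, False], [False, True]))
```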


Acoustic: Acoustically, the Fo, amplitude, phoneme duration, and 2nd formant frequency of a stressed phoneme can be measured and compared with those of its unstressed counterpart.


Analyzing stress production:  A simple way to analyze stress is to record a spontaneous speech sample. By listening to the speech sample, one can analyze it for +/-, -/+, no stress, or equal stress on all syllables.

A second way is to use prerecorded material. The subject can be audio presented with the phrases/sentences and asked to repeat. The repeated sample can be analyzed in comparison with the original. An acoustic analysis is also possible, as reference is available.

Analyzing stress perception:  An AB design can be used, where A is a phrase with a stressed word and B is a phrase with an unstressed word. Subjects have to say whether A and B are the same or different. An ABA design can be used, where the subject has to say which is the odd phrase. Synthetic materials can be used in which individual parameters like Fo, intensity, or duration are altered. The original and the synthetic phrase can be paired and given for a discrimination task.


Stress can be analyzed by examining the following:

·      Increase in pitch

·      More force

·      Increase in loudness

·      Sound quality occurring in syllables

·      Increase in duration

·      Pause before and after stress

  (It maybe strong or weak stress)

Research and methodological issues: The parameters acknowledged as co-signals of stress also apparently share in signaling another speech attribute, namely intonation. A further problem in investigating stress is that there are several types of stress. These are the reasons for the paucity of studies in the area of stress. However, the analysis of stress is very important, as stress is deviant in individuals with hearing impairment, aphasia, dysarthria, learning disability, intellectual disability, and brain damage.



There are two prosodic features which describe the temporal characteristics of a spoken utterance: tempo and rhythm. Tempo is the rate at which an utterance is spoken, and the rhythm of an utterance is the pattern of time intervals which elapse between the occurrences of stressed syllables.

The term rhythm is derived from the Greek word “Rhuthmos”, where “rhu” means flow. It is a pattern of movement which occurs with more or less temporal regularity. “It is a swing or a balance bodily movement, music or verb or phrase” (Encyclopedia Britannica, 1965).

The sense of rhythm is not properly developed, or is disrupted, in individuals with hearing impairment, stuttering, cluttering, dysarthria, aphasia, and verbal apraxia (Starkweather, 1987).


Tests of rhythm:

T-TRIP [Tennessee Test of Rhythm and Intonation Patterns (Koike & Asp, 1981)]

It is a three-part suprasegmental test with 25 items. The test items consist of the nonsense syllable /ma/ spoken and recorded with different rhythm and intonation patterns. Items 1 to 17 test rhythm. In part 1, items 1-14 have 2-6 syllables that vary in stress and tempo. In part 2, items 15, 16, and 17 have a faster tempo, with 3 syllables per beat, and contain 3-9 syllables.


·       Inability to achieve an adequate control over frequency.

·       Perception of rhythm only. Intensity and durational cues are utilized and not F0.

·       Difficult to test.


Synthetic test of rhythm [Jayanthi Ray, 1993]

This test had 17 synthetic stimuli varying in Fo, intensity, and duration. There are three parts in the test. In part 1, six stimuli with a change in Fo (increasing in 10 Hz steps) were prepared. Intensity and duration were kept constant. In part 2, six stimuli with a change in intensity (increasing in 10 dB steps) were prepared. Fo and duration were kept constant. In part 3, five stimuli with a change in duration (increasing in 10 ms steps) and constant Fo and intensity were generated. These stimuli were presented for imitation to 20 Kannada-speaking adults and 40 children.
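The three-part design described above, in which one parameter is varied in fixed steps while the other two are held constant, can be laid out as a parameter table. The following is a minimal Python sketch of that layout only. The baseline values in `BASE` are hypothetical (the source does not give them), as is the assumption that the first stimulus in each part sits at the baseline; `build_stimuli` is an illustrative name.

```python
# Hypothetical baseline values; the source does not report the actual ones.
BASE = {"f0_hz": 120, "intensity_db": 40, "duration_ms": 200}

# (parameter varied, number of stimuli, step size) for parts 1 to 3.
PLAN = [("f0_hz", 6, 10), ("intensity_db", 6, 10), ("duration_ms", 5, 10)]


def build_stimuli(base=BASE, plan=PLAN):
    """List the parameters of every stimulus, one dict per stimulus."""
    stimuli = []
    for part, (param, count, step) in enumerate(plan, start=1):
        for i in range(count):
            stim = dict(base)                    # other parameters stay constant
            stim[param] = base[param] + i * step  # varied parameter rises in steps
            stim["part"] = part
            stimuli.append(stim)
    return stimuli


for stim in build_stimuli():
    print(stim)
```

The table contains 6 + 6 + 5 = 17 rows, matching the number of stimuli reported for the test.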


This test could be used

·      As a clinical diagnostic tool to explore suprasegmental functioning in patients with dysprosodia.

·      As a therapeutic tool for facilitating rhythmic speech, and hence enhancing speech intelligibility, in those who have arhythmia.

Measurement of rhythm:


Speech production:

One can record a speech sample, tap for rhythm, and indicate the taps on the transcribed material. The speech samples of normal speakers can be compared with those of the clinical population for production of rhythm.

Acoustically, one can measure the Fo, intensity, and duration of the tapped syllable. These acoustic measures in the clinical population can be compared with those of normal speakers.



Intonation is an important feature of prosody. Intonation is defined as the variation of fundamental frequency (pitch) across phrases, clauses, or sentences over time.

Intonation is the movement of pitch in an utterance, and it is meaningful: a rising pitch indicates a question and a falling pitch a statement.

It is the variation of speech pitch as a function of time (Collier, 1991). Intonation plays an important role in intelligibility and naturalness of synthetic speech (Olive et al, 1974).


Tests of intonation:

T-TRIP [Tennessee Test of Rhythm and Intonation Patterns (Koike and Asp, 1981)]

It is a 25-item test in which items 18-25 test intonation. A nonsense syllable spoken in various intonation patterns by an adult male trained in phonetics is used.



4.4 Milestones of speech development in typically developing children


The ability to hear is essential for proper speech and language development. Hearing problems may be suspected in children who are not responding to sounds or who are not developing their language skills appropriately. The following are some age-related guidelines that may help to decide if your child is experiencing hearing problems.

It's important to remember that not every child is the same. Children reach milestones at different ages. Talk to your child's healthcare provider if you suspect that your child is not developing speech and language skills correctly. The National Institute on Deafness and Other Communication Disorders and other experts list the following age-appropriate speech and language milestones for babies and young children.

Milestones related to speech and language


Birth to 5 months

·       Coos

·       Vocalizes pleasure and displeasure sounds differently (laughs, giggles, cries, or fusses)

·       Makes noise when talked to

6 to 11 months

·       Understands "no-no"

·       Babbles (says "ba-ba-ba")

·       Says "ma-ma" or "da-da" without meaning

·       Tries to communicate by actions or gestures

·       Tries to repeat your sounds

·       Says first word

12 to 17 months

·       Answers simple questions nonverbally

·       Says 2 to 3 words to label a person or object (pronunciation may not be clear)

·       Tries to imitate simple words

·       Vocabulary of 4 to 6 words

18 to 23 months

·       Vocabulary of 50 words, pronunciation is often unclear

·       Asks for common foods by name

·       Makes animal sounds, such as "moo"

·       Starting to combine words, such as "more milk"

·       Begins to use pronouns, such as "mine"

·       Uses 2-word phrases

2 to 3 years

·       Knows some spatial concepts, such as "in" or "on"

·       Knows pronouns, such as "you," "me" or "her"

·       Knows descriptive words, such as "big" or "happy"

·       Uses 3-word sentences

·       Speech is becoming more accurate, but may still leave off ending sounds. Strangers may not be able to understand much of what is said.

·       Answers simple questions

·       Begins to use more pronouns, such as "you" or "I"

·       Uses question inflection to ask for something, such as "my ball?"

·       Begins to use plurals, such as "shoes" or "socks" and regular past tense verbs, such as "jumped"

3 to 4 years

·       Groups objects, such as foods or clothes

·       Identifies colors

·       Uses most speech sounds, but may distort some of the more difficult sounds, such as l, r, s, sh, ch, y, v, z, th. These sounds may not be fully mastered until age 7 or 8.

·       Uses consonants in the beginning, middle, and ends of words. Some of the more difficult consonants may be distorted, but the child attempts to say them.

·       Strangers are able to understand much of what is said

·       Able to describe the use of objects, such as "fork" or "car"

·       Has fun with language; enjoys poems and recognizes language absurdities, such as, "Is that an elephant on your head?"

·       Expresses ideas and feelings rather than just talking about the world around him or her

·       Uses verbs that end in "ing," such as "walking" or "talking"

·       Answers simple questions, such as "What do you do when you are hungry?"

·       Repeats sentences

4 to 5 years

·       Understands spatial concepts, such as "behind" or "next to"

·       Understands complex questions

·       Speech is understandable, but makes mistakes pronouncing long, difficult, or complex words, such as "hippopotamus"

·       Uses some irregular past tense verbs, such as "ran" or "fell"

·       Describes how to do things, such as painting a picture

·       Lists items that belong in a category, such as animals or vehicles

·       Answers "why" questions

5 years

·       Understands time sequences (for example, what happened first, second, or third)

·       Carries out a series of 3 directions

·       Understands rhyming

·       Engages in conversation

·       Sentences can be 8 or more words in length

·       Uses compound and complex sentences

·       Describes objects

·       Uses imagination to create stories



4.5 Speech Intelligibility: concept, factors & assessment


Communication is an important aspect of quality of life, and speech is the primary means of human communication. In adults with disabilities, speech communication is often troubled by disordered speech production and/or impaired hearing, resulting in miscommunication and consequently in impaired social interactions and possibly in behavioural problems and isolation.

Speech intelligibility can be defined as how clearly a person speaks so that his or her speech is comprehensible to a listener. Reduced speech intelligibility leads to misunderstanding, frustration, and loss of interest by communication partners. As a result, communication decreases or remains at a low level.

Intelligibility of speech is the percentage of speech that a listener can understand. If you can only understand half of what a child is saying then their speech intelligibility rating would be 50%.
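The percentage described above is the number of words the listener understood divided by the total number of words spoken, multiplied by 100. A minimal Python sketch follows; the function name `percent_intelligible` and the example counts are hypothetical.

```python
def percent_intelligible(words_understood, total_words):
    """Percentage of the words in a sample that the listener understood."""
    if total_words <= 0:
        raise ValueError("sample must contain at least one word")
    if not 0 <= words_understood <= total_words:
        raise ValueError("words_understood must be between 0 and total_words")
    return 100.0 * words_understood / total_words


# Hypothetical sample: the listener understood 50 of 100 words.
print(percent_intelligible(50, 100))
```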

Speech intelligibility changes with a child’s age. Speech development begins with babbling and then speech matures until older children can say all the sounds in their primary language/s and everyone can understand them.

There are 3 main areas to consider when looking at speech intelligibility:
1 Articulation
Toddlers do not have the fine motor control of their tongues and lips to be able to say all the sounds correctly, and so their speech is characterised by developmental speech sound errors.

These are called articulation errors. For example, sounds like the “r” sound can be difficult for toddlers. The “r” sound requires a fine curling of the tongue, and young children will often replace the “r” sound with an easier sound (e.g., w). You will often hear a 2 year old say “wed” for “red”.

2 Motor speech co-ordination

Young children also cannot move their tongue and lips fast enough to keep their speech clear as they try to say longer words and sentences. This is called motor speech co-ordination. It is a bit like doing buttons up. Young children may be able to do easy buttons but might take a long time. As their fine motor co-ordination increases, dressing becomes easier and faster.

3 Phonological Processes
Toddlers use more phonological processes that reduce speech intelligibility than older children do. As toddlers do not have the oral motor skills to say words 100% correctly, they use speech sound patterns that simplify words.

One example of a phonological process is called “cluster reduction”. Speech sound clusters like “sp, sk, dr, bl” are very difficult for young children to say. Most young children reduce the cluster to one sound (e.g., “dep” for “step”, “back” for “black”). Young children use lots of phonological processes, which reduces speech intelligibility. As children get older they use fewer and fewer, and so speech intelligibility improves.

As speech develops, children make fewer articulation errors, their rate and co-ordination of speech improve, and they use fewer phonological processes.

Children with speech delays and speech disorders will often have lower speech intelligibility percentages than same-age children.

If a 3 year old child has a speech delay, they may still be making the speech sound errors that a 2 year old would make. This can reduce their speech intelligibility for their age.

Speech disorders such as dyspraxia of speech (childhood apraxia of speech, CAS) are characterised by significant reductions in speech intelligibility. Speech intelligibility may be one of the criteria used to determine how functional a child’s speech is in their community. If most people they interact with cannot understand them, then it would be considered a significant speech impairment.


Intelligibility Assessment

Intelligibility is a perceptual judgment that is based on how much of the child's spontaneous speech the listener understands. Intelligibility can vary along a continuum ranging from intelligible (message is completely understood) to unintelligible (message is not understood; Bernthal et al., 2017). Intelligibility is frequently used when judging the severity of the child's speech problem (Kent, Miolo, & Bloedel, 1994; Shriberg & Kwiatkowski, 1982b) and can be used to determine the need for intervention.

Intelligibility can vary depending on a number of factors; these are listed later in this section, under the factors that can negatively influence intelligibility.

Rating scales and other estimates that are based on perceptual judgments are commonly used to assess intelligibility. For example, rating scales sometimes use numerical ratings, such as 1 for totally intelligible and 10 for unintelligible, or they use descriptors such as not at all, seldom, sometimes, most of the time, or always to indicate how well speech is understood (Ertmer, 2010).

A number of quantitative measures also have been proposed, including calculating the percentage of words understood in conversational speech (e.g., Flipsen, 2006; Shriberg & Kwiatkowski, 1980). See also Kent et al. (1994) for a comprehensive review of procedures for assessing intelligibility.
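The percentage-of-words-understood idea can be illustrated with a short calculation. The following is only a minimal sketch: the function name `percent_words_understood` and the input format (one True/False flag per word in the sample) are assumptions made for illustration, and the sketch does not reproduce the specific counting rules of any of the cited procedures.

```python
def percent_words_understood(word_flags):
    """Return the percentage of words in a sample that the listener understood.

    word_flags: one boolean per word in the transcribed sample
    (True = understood, False = not understood). This input format
    is a hypothetical choice for this sketch.
    """
    if not word_flags:
        raise ValueError("the sample must contain at least one word")
    understood = sum(1 for flag in word_flags if flag)
    return 100.0 * understood / len(word_flags)


# Hypothetical 10-word sample in which the listener understood 7 words.
sample = [True, True, False, True, True, False, True, True, False, True]
print(percent_words_understood(sample))  # 70.0
```

In practice the result depends heavily on how the sample is collected and on who the listener is, so the same child can obtain different percentages in different conditions.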

Coplan and Gleason (1988) developed a standardized intelligibility screener using parent estimates of how intelligible their child sounded to others. On the basis of the data, expected intelligibility cutoff values for typically developing children were as follows:

22 months—50%

37 months—75%

47 months—100%
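As an illustration of how these cutoff values might be looked up, the sketch below compares a parent's estimate with the cutoff for the child's age. The names `expected_cutoff` and `meets_cutoff` are hypothetical, and treating the cutoffs as a step function between the listed ages (with no cutoff below 22 months) is an assumption of this sketch, not part of the published screener.

```python
# Cutoffs as listed in the text: (age in months, expected intelligibility in %).
COPLAN_GLEASON_CUTOFFS = [(22, 50), (37, 75), (47, 100)]


def expected_cutoff(age_months):
    """Return the cutoff (%) for the oldest listed age not exceeding age_months.

    Returns None below 22 months, where the text lists no cutoff.
    The step-function reading between listed ages is an assumption.
    """
    cutoff = None
    for age, percent in COPLAN_GLEASON_CUTOFFS:
        if age_months >= age:
            cutoff = percent
    return cutoff


def meets_cutoff(age_months, parent_estimate_percent):
    """Return True/False for whether the estimate reaches the cutoff, or None if no cutoff applies."""
    cutoff = expected_cutoff(age_months)
    if cutoff is None:
        return None
    return parent_estimate_percent >= cutoff


print(expected_cutoff(40))   # 75
print(meets_cutoff(40, 60))  # False
```

A result of False would only flag that the parent's estimate falls below the listed value for that age; it is a screening comparison, not a diagnosis.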

See the Resources section for resources related to assessing intelligibility and life participation in monolingual children who speak English and in monolingual children who speak languages other than English.


Calculating overall intelligibility is necessary when considering the need for treatment, identifying factors that contribute to poor intelligibility, selecting treatment goals, recording baseline information and monitoring the effect of treatment over time.

As the assessment of the client’s speech and language sample progresses, equal importance should be given to recognising the factors that can negatively influence intelligibility.


They include:

·        The number of sound errors:

A major factor influencing speech intelligibility is the number and nature of the speech sound errors a speaker makes. The more of the speaker’s productions that differ from the adult standard, the more intelligibility is reduced.

·        The type of sound errors

·        Inconsistency of errors:

Intelligibility is also affected by how consistently sounds are misarticulated and by how frequently the affected sound occurs in the language. The more consistently the target sound is produced in error and the more frequently the target sound occurs in the language, the more likely the listener is to perceive the speaker’s speech as defective.

·      Vowel errors

·      The rate of speech

·      Atypical prosodic characteristics of speech, such as abnormal intonation or stress

·      The length and linguistic complexity of the words and utterances used

·      Insufficient vocal intensity

·      Dysfluencies, particularly severe dysfluencies that disrupt the context

·      The lack of gestures or other paralinguistic cues that assist understanding

·      The testing environment

·      The client’s anxiety about the testing situation

·      The client’s lack of familiarity with stimulus materials

·      The client’s level of fatigue

·      The clinician’s ability to understand less intelligible speech

·      The clinician’s familiarity with the client and the client’s speaking context

Intelligibility rating scale:

Three rating scales for speech intelligibility are described here: a 7 point rating scale, a 3 point rating scale and a 5 point rating scale.


7 point rating scale (Fudala, 1970):

The 7 point rating scale consists of the following levels:

·      Speech not intelligible

·      Speech usually not intelligible

·      Speech difficult to understand

·      Speech intelligible with careful listening

·      Speech intelligible although noticeable in error

·      Speech intelligible with occasional error

·      Speech totally intelligible

3 point rating scale (Bleile, 1995):

The 3 point rating scale consists of the following levels:

·      Readily intelligible

·      Intelligible if topic is known

·      Unintelligible even with careful listening

5 point rating scale:

·      Completely intelligible

·      Mostly intelligible

·      Somewhat intelligible

·      Mostly unintelligible

·      Completely unintelligible