THE HUMAN
1.1 INTRODUCTION
This system has three components: input–output, memory and
processing.
In the human, we are dealing with an
intelligent information-processing system, and processing therefore includes
problem solving, learning and, consequently, making mistakes.
1.2 INPUT–OUTPUT CHANNELS
A
person’s interaction with the outside world occurs through information being
received and sent: input and output. In an interaction with a computer
the user receives information that is output by the computer, and responds by
providing input to the computer – the user’s output becomes the computer’s
input and vice versa.
Consequently, the use of the terms input and output may lead to confusion, so we shall blur the distinction somewhat and concentrate on the channels involved. This blurring is appropriate since, although a particular channel may have a primary role as input or output in the interaction, it is more than likely that it is also used in the other role.
For example, sight may be used
primarily in receiving information from the computer, but it can also be used
to provide information to the computer, for example by fixating on a particular
screen point when using an eyegaze system.
Input in the human occurs mainly through the senses, and output through the motor control of the effectors. There are five major senses: sight, hearing, touch, taste and smell.
Of these, the first three are the most important to HCI.
Taste and smell do not currently play a significant role in HCI, and it is not
clear whether they could be exploited at all in general computer systems,
although they could have a role to play in more specialized systems (smells to
give warning of malfunction, for example) or in augmented reality systems.
However, vision, hearing and touch are central.
Similarly there are a number of
effectors, including the limbs, fingers, eyes, head and vocal system. In the
interaction with the computer, the fingers play the primary role, through
typing or mouse control, with some use of voice, and eye, head and body
position.
Imagine using a personal computer
(PC) with a mouse and a keyboard. The application you are using has a graphical
interface, with menus, icons and windows. In your interaction with this system
you receive information primarily by sight, from what appears on the screen.
However, you may also receive information by ear: for example, the computer may
‘beep’ at you if you make a mistake or to draw attention to something, or there
may be a voice commentary in a multimedia presentation. Touch plays a part too
in that you will feel the keys moving (also hearing the ‘click’) or the
orientation of the mouse, which provides vital feedback about what you have
done. You yourself send information to the computer using your hands, either by
hitting keys or moving the mouse. Sight and hearing do not play a direct role
in sending information in this example, although they may be used to receive information from a third source (for example, a book, or the
words of another person) which is then transmitted to the computer.
1.2.1 Vision
Human vision is a highly complex activity with a range of
physical and perceptual limitations, yet it is the primary source of
information for the average person.
We can roughly divide visual perception into two stages:
- The physical reception of the stimulus from the outside world, and
- The processing and interpretation of that stimulus.
On the one hand the physical
properties of the eye and the visual system mean that there are certain things
that cannot be seen by the human; on the other the interpretative capabilities
of visual processing allow images to be constructed from incomplete
information.
The human eye
Vision begins with light. The eye is
a mechanism for receiving light and transforming it into electrical energy.
Light is reflected from objects in the world and their image is focused upside
down on the back of the eye. The receptors in the eye transform it into
electrical signals which are passed to the brain.
The eye has a number of important
components (see Figure 1.1) which we will look at in more detail. The cornea
and lens at the front of the eye
focus the light into a sharp image on the back of the eye, the retina.
The retina is light sensitive and contains two types of photoreceptor: rods and cones.
Rods
are highly sensitive to light and therefore allow us to see under a low level
of illumination. However, they are unable to resolve fine detail and are
subject to light saturation. This is the reason for the temporary blindness we
get when moving from a darkened room into sunlight: the rods have been active
and are saturated by the sudden light. The cones do not operate either as they
are suppressed by the rods. We are therefore temporarily unable to see at all.
There are approximately 120 million rods per eye which are mainly situated
towards the edges of the retina. Rods therefore dominate peripheral vision.
Cones
are the second type of receptor in the eye. They are less sensitive to light
than the rods and can therefore tolerate more light. There are three types of
cone, each sensitive to a different wavelength of light. This allows color
vision. The eye has approximately 6 million cones, mainly concentrated on the fovea, a small area of the retina on
which images are fixated.
Although the retina is mainly
covered with photoreceptors there is one blind spot where the optic nerve
enters the eye. The blind spot has no rods or cones, yet our visual system
compensates for this so that in normal circumstances we are unaware of it.
The retina also has specialized
nerve cells called ganglion cells. There are two types: X-cells, which are
concentrated in the fovea and are responsible for the early detection of
pattern; and Y-cells, which are more widely distributed in the retina and are
responsible for the early detection of movement. The distribution of these cells
means that, while we may not be able to detect changes in pattern in peripheral
vision, we can perceive movement.
Visual perception
The information
received by the visual apparatus must be filtered and passed to processing
elements which allow us to recognize coherent scenes, disambiguate relative
distances and differentiate color.
Perceiving size and depth
Imagine you are standing on a hilltop. Beside you on the summit
you can see rocks, sheep and a small tree. On the hillside is a farmhouse with
outbuildings and farm vehicles. Someone is on the track, walking toward the
summit. Below in the valley is a small market town.
Even in describing such a scene the
notions of size and distance predominate. Our visual system is easily able to
interpret the images which it receives to take account of these things. We can
identify similar objects regardless of the fact that they appear to us to be of
vastly different sizes. In fact, we can use this information to judge
distances.
So how does the eye perceive size, depth and relative
distances? To understand this we must consider how the image appears on the
retina. As we noted in the previous section, reflected light from the object
forms an upside-down image on the retina. The size of that image is specified
as a visual angle. Figure 1.2
illustrates how the visual angle is calculated.
If we were to draw a line from the
top of the object to a central point on the front of the eye and a second line
from the bottom of the object to the same point, the visual angle of the object
is the angle between these two lines. Visual angle is affected by both the size
of the object and its distance from the eye. Therefore if two objects are at
the same distance, the larger one will have the larger visual angle. Similarly,
if two objects of the same size are placed at different distances from the eye,
the furthest one will have the smaller visual angle.
The visual angle indicates how much
of the field of view is taken by the object. The visual angle measurement is
given in either degrees or minutes of arc, where 1 degree is
equivalent to 60 minutes of arc, and 1 minute of arc to 60 seconds of arc.
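The calculation described above can be sketched in a few lines of code. This is a generic illustration rather than anything from the text: the function name, object size and viewing distance are invented for the example.

```python
import math

def visual_angle_arcmin(size, distance):
    """Visual angle subtended by an object, in minutes of arc.

    size and distance must be in the same units (e.g. metres): the angle
    is that between lines drawn from the top and bottom of the object
    to a central point on the front of the eye.
    """
    angle_radians = 2 * math.atan(size / (2 * distance))
    return math.degrees(angle_radians) * 60  # 1 degree = 60 minutes of arc

# A 5 cm object viewed from 2 m subtends roughly 86 minutes of arc;
# doubling the viewing distance roughly halves the visual angle.
print(round(visual_angle_arcmin(0.05, 2.0), 1))
print(round(visual_angle_arcmin(0.05, 4.0), 1))
```

Note that the angle depends on both size and distance, which is exactly why visual angle alone cannot determine perceived size.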
So how does an object’s visual angle
affect our perception of its size? First, if the visual angle of an object is
too small we will be unable to perceive it at all. Visual acuity is the ability of a person to perceive fine detail. A
number of measurements have been
established to test visual acuity, most of which are included in standard eye
tests. For example, a person with normal vision can detect a single line if it
has a visual angle of 0.5 seconds of arc. Spaces between lines can be detected
at 30 seconds to 1 minute of visual arc. These represent the limits of human
visual acuity.
Given that the visual angle of an
object is reduced as it gets further away, we might expect that we would
perceive the object as smaller. In fact, our perception of an object’s size
remains constant even if its visual angle changes. So a person’s height is
perceived as constant even if they move further from you. This is the law of size constancy, and it indicates
that our perception of size relies on factors other than the visual angle.
One of these factors is our
perception of depth. If we return to the hilltop scene there are a number of cues which we can use to determine the
relative positions and distances of the objects which we see. If objects
overlap, the object which is partially covered is perceived to be in the
background, and therefore further away. Similarly, the size and height of the
object in our field of view provides a cue to its distance.
A third cue is familiarity: if we
expect an object to be of a certain size then we can judge its distance
accordingly. This has been exploited for humour in advertising: one
advertisement for beer shows a man walking away from a bottle in the
foreground. As he walks, he bumps into the bottle, which is in fact a giant
one in the background!
Perceiving brightness
A second aspect of visual perception is the perception of
brightness. Brightness is in fact a subjective reaction to levels of
light. It is affected by luminance, which is the amount of
light emitted by an object. The luminance of an object is dependent on the amount of light falling on the object’s
surface and its reflective properties. Luminance is a physical characteristic
and can be measured using a photometer. Contrast is related to
luminance: it is a function of the luminance of an object and the luminance of
its background.
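The text defines contrast only as a function of the luminances of an object and its background. One common formulation, chosen here purely for illustration, is Weber contrast:

```python
def weber_contrast(object_luminance, background_luminance):
    """Weber contrast: (L_object - L_background) / L_background.

    Positive values mean the object is brighter than its background;
    negative values mean it is darker. Luminances are in cd/m^2.
    """
    return (object_luminance - background_luminance) / background_luminance

# Dark text (20 cd/m^2) on a light screen (200 cd/m^2) gives a
# 'negative contrast' in the sense used later in this section.
print(weber_contrast(20.0, 200.0))  # -0.9
```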
Although brightness is a subjective
response, it can be described in terms of the amount of luminance that gives a just
noticeable difference in brightness. However, the visual system itself
also compensates for changes in brightness. In dim lighting, the rods
predominate. Since there are fewer rods on the fovea, objects in low
lighting can be seen less easily when fixated upon, and are more visible in
peripheral vision. In normal lighting, the cones take over.
Visual acuity increases with increased luminance. This may be an argument for using high display luminance.
However, as luminance increases, flicker also increases. The eye will
perceive a light switched on and off rapidly as constantly on. But if the speed
of switching is less than 50 Hz then the light is perceived to flicker. In high
luminance flicker can be perceived at over 50 Hz. Flicker is also more noticeable
in peripheral vision. This means that the larger the display (and consequently
the more peripheral vision that it occupies), the more it will appear to
flicker.
Perceiving color
A third factor that we need to consider is perception of
color. Color is usually regarded as being made up of three
components: hue, intensity and saturation. Hue is determined by the
spectral wavelength of the light. Blues have short wavelengths, greens medium and reds long. Approximately 150
different hues can be discriminated by the average person. Intensity is the
brightness of the color, and saturation is the amount of whiteness in the
color. By varying these two, we can perceive in the region of 7 million
different colors. However, the number of colors that can be identified by an
individual without training is far fewer (in the region of 10).
The eye perceives color because the
cones are sensitive to light of different wavelengths. There are three
different types of cone, each sensitive to a different color (blue, green and
red). Color vision is best in the fovea, and worst at the periphery where rods
predominate. It should also be noted that only 3–4% of the fovea is occupied
by cones which are sensitive to blue light, making blue acuity lower.
Finally, we should remember that
around 8% of males and 1% of females suffer from color blindness, most commonly
being unable to discriminate between red and green.
The capabilities and limitations of visual processing
In considering the way in which we perceive images we have
already encountered some of the capabilities and limitations of the human
visual processing system. However, we have concentrated largely on low-level
perception. Visual processing involves the transformation and interpretation of
a complete image, from the light that is thrown onto the retina. As we have
already noted, our expectations affect the way an image is perceived. For
example, if we know that an object is a particular size, we will perceive it as
that size no matter how far it is from us.
Visual processing compensates for
the movement of the image on the retina which occurs as we move around and as
the object which we see moves. Although the retinal image is moving, the image
that we perceive is stable. Similarly, color and brightness of objects are
perceived as constant, in spite of changes in luminance.
This ability to interpret and
exploit our expectations can be used to resolve ambiguity. For example,
consider the image shown in Figure 1.3. What do you perceive? Now consider
Figure 1.4 and Figure 1.5. The context in which the object appears
allows our expectations to clearly disambiguate the
interpretation of the object, as either a B or a 13.
However, it can also create optical
illusions. For example, consider Figure 1.6. Which line is longer? Most people
when presented with this will say that the top line is longer than the bottom.
In fact, the two lines are the same length. This may be due to a false
application of the law of size constancy: the top line appears like a concave
edge, the bottom like a convex edge. The former therefore seems further away
than the latter and is therefore scaled to appear larger. A similar illusion is
the Ponzo illusion (Figure 1.7). Here the top line appears longer, owing to the
distance effect, although both lines are the same length. These illusions
demonstrate that our perception of size is not completely reliable.
Reading
There are several stages in the reading process. First, the visual pattern of the word on the page
is perceived. It is then decoded with
reference to an internal representation of language. The final stages of
language processing include syntactic
and semantic analysis and operate on phrases or sentences.
We are most concerned with the first
two stages of this process and how they influence interface design. During
reading, the eye makes jerky movements called saccades followed by
fixations. Perception occurs during the fixation periods, which account for approximately 94% of the
time elapsed. The eye moves backwards over the text as well as forwards, in
what are known as regressions. If the text is complex there will be more
regressions.
Adults read approximately 250 words
a minute. It is unlikely that words are scanned serially, character by
character, since experiments have shown that words can be recognized as quickly
as single characters. Instead, familiar words are recognized using word shape.
This means that removing the word shape clues (for example, by capitalizing
words) is detrimental to reading speed and accuracy.
The speed at which text can be read
is a measure of its legibility. Experiments have shown that standard font sizes
of 9 to 12 points are equally legible, given proportional spacing between
lines. Similarly line lengths of between 2.3 and 5.2 inches (58 and 132 mm) are
equally legible. However, there is evidence that reading from a computer screen
is slower than from a book. This is thought to be due to a number of factors
including a longer line length, fewer words to a page, orientation
and the familiarity of the medium of the page. These factors can of course be
reduced by careful design of textual interfaces.
A final word about the use of
contrast in visual display: a negative contrast (dark characters on a light
screen) provides higher luminance and, therefore, increased acuity, than a
positive contrast. This will in turn increase legibility. However, it will also
be more prone to flicker. Experimental evidence suggests that in practice
negative contrast displays are preferred and result in more accurate
performance.
1.2.2 Hearing
The sense of hearing is often considered secondary to sight,
but we tend to underestimate the amount of information that we receive through
our ears. Close your eyes for a moment and listen. What sounds can you hear?
Where are they coming from? What is making them? As I sit at my desk I can hear
cars passing on the road outside, machinery working on a site nearby, the drone
of a plane overhead and bird song. But I can also tell where the sounds are coming from, and estimate how far away they
are. So from the sounds I hear I can tell that a car is passing on a particular
road near my house, and which direction it is traveling in. I know that
building work is in progress in a particular location, and that a certain type
of bird is perched in the tree in my garden.
The auditory system can convey a lot
of information about our environment. But how does it work?
The human ear
Just as vision begins with light, hearing begins with
vibrations in the air or sound waves. The ear receives these vibrations
and transmits them, through various stages,
to the auditory nerves. The ear comprises three sections, commonly known as
the outer ear, middle ear and inner ear.
The outer ear is the visible part of
the ear. It has two parts: the pinna,
which is the structure that is attached to the sides of the head, and the auditory canal, along which sound waves
are passed to the middle ear. The outer ear serves two purposes. First, it
protects the sensitive middle ear from damage. The auditory canal contains wax
which prevents dust, dirt and over-inquisitive insects reaching the middle ear.
It also maintains the middle ear at a constant temperature. Secondly, the pinna
and auditory canal serve to amplify some sounds.
The middle ear is a small cavity
connected to the outer ear by the tympanic
membrane, or ear drum, and to the
inner ear by the cochlea. Within the
cavity are the ossicles, the smallest
bones in the body. Sound waves pass along the auditory canal and vibrate the ear drum which in turn
vibrates the ossicles, which transmit the vibrations to the cochlea, and so
into the inner ear. This ‘relay’ is required because, unlike the air-filled
outer and middle ears, the inner ear is filled with a denser cochlean liquid.
If passed directly from the air to the liquid, the transmission of the sound
waves would be poor. By transmitting them via the ossicles the sound waves are
concentrated and amplified.
The waves are passed into the
liquid-filled cochlea in the inner ear. Within the cochlea are delicate hair
cells or cilia that bend because of
the vibrations in the cochlean liquid and release a chemical transmitter which
causes impulses in the auditory nerve.
Processing sound
As we have seen, sound is changes or vibrations in air
pressure. It has a number of characteristics which we can differentiate. Pitch is the frequency of the sound. A
low frequency produces a low pitch, a high frequency, a high pitch. Loudness is proportional to the
amplitude of the sound; the frequency remains constant. Timbre relates to the type of the sound: sounds may have the same
pitch and loudness but be made by different instruments and so vary in timbre.
We can also identify a sound’s location, since the two ears receive slightly
different sounds, owing to the time difference between the sound reaching the
two ears and the reduction in intensity caused by the sound waves reflecting
from the head.
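The time difference mentioned above can be approximated with a simple geometric model. Everything here is an illustrative simplification rather than material from the text: the ear separation, the speed of sound, and the straight-path assumption that ignores how sound diffracts around the head.

```python
import math

def interaural_time_difference(azimuth_deg, ear_separation=0.21,
                               speed_of_sound=343.0):
    """Approximate extra travel time (in seconds) to the far ear.

    Simplified model: the path difference is ear_separation * sin(azimuth),
    where azimuth 0 degrees is straight ahead. Real heads diffract sound,
    so measured values differ somewhat from this estimate.
    """
    path_difference = ear_separation * math.sin(math.radians(azimuth_deg))
    return path_difference / speed_of_sound

# A sound directly to one side (90 degrees) arrives about 0.6 ms later
# at the far ear; a sound straight ahead arrives simultaneously.
print(round(interaural_time_difference(90.0) * 1000, 2))
print(interaural_time_difference(0.0))
```

Even sub-millisecond differences of this size are enough for the auditory system to localize a sound source.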
The human ear can hear frequencies
from about 20 Hz to 15 kHz. It can distinguish frequency changes of less than
1.5 Hz at low frequencies but is less accurate at high frequencies. Different
frequencies trigger activity in neurons in different parts of the auditory
system, and cause different rates of firing of nerve impulses.
The auditory system performs some
filtering of the sounds received, allowing us to ignore background noise and
concentrate on important information. We are selective in our hearing, as
illustrated by the cocktail party effect,
where we can pick out our name spoken across a crowded noisy room. However, if
sounds are too loud, or frequencies too similar, we are unable to differentiate
sound.
As we have seen, sound can convey a
remarkable amount of information. It is rarely used to its potential in
interface design, usually being confined to warning sounds and notifications.
The exception is multimedia, which may include music, voice commentary and
sound effects. However, the ear can differentiate quite subtle sound changes
and can recognize familiar sounds without concentrating attention on the sound
source. This suggests that sound could be used more extensively in interface
design, to convey information about the system state, for example. This is
discussed in more detail in Chapter 10.
1.2.3 Touch
The third and last of the senses that we will consider is
touch or haptic perception. Although this sense is often viewed as less
important than sight or hearing, imagine life without it. Touch provides us
with vital information about our environment. It tells us when we touch
something hot or cold, and can therefore act as a warning. It also provides us
with feedback when we attempt to lift an object, for example. Consider the act
of picking up a glass of water. If we could only see the glass and not feel
when our hand made contact with it or feel its shape, the speed and accuracy of
the action would be reduced. This is the experience of users of certain virtual
reality games: they can see the computer-generated objects which they
need to manipulate but they have no physical sensation of touching them.
Watching such users can be an informative and amusing experience! Touch is
therefore an important means of feedback, and this is no less so in using
computer systems. Feeling buttons depress is an important part of the task of
pressing the button. Also, we should be aware that, although for the average
person, haptic perception is a secondary source of information, for those
whose other senses are impaired, it may be vitally important. For such users,
interfaces such as braille may be the primary source of information in the
interaction. We should not therefore underestimate the importance of touch.
The apparatus of touch differs from
that of sight and hearing in that it is not localized. We receive stimuli
through the skin.
The skin contains three types of
sensory receptor:
1. thermoreceptors respond to heat and cold,
2. nociceptors respond to intense pressure, heat and pain, and
3. mechanoreceptors respond to pressure.
It is the last of these that we are
concerned with in relation to human–computer interaction.
There are two kinds of
mechanoreceptor, which respond to different types of pressure.
1. Rapidly adapting mechanoreceptors respond to immediate pressure as the skin is indented. These receptors also react more quickly with increased pressure. However, they stop responding if continuous pressure is applied.
2. Slowly adapting mechanoreceptors respond to continuously applied pressure.
Although the whole of the body
contains such receptors, some areas have greater sensitivity or acuity than
others. It is possible to measure the acuity of different areas of the body
using the two-point threshold test.
Take two pencils, held so their tips are about 12 mm apart. Touch the points to
your thumb and see if you can feel two points. If you cannot, move the points a
little further apart. When you can feel two points, measure the distance
between them. The greater the distance, the lower the sensitivity. You can
repeat this test on different parts of your body. You should find that the measure on the forearm is around 10 times that of the
finger or thumb. The fingers and thumbs have the highest acuity.
A second aspect of haptic perception
is kinesthesis:
awareness of the position of the body and limbs. This is due to receptors in
the joints. Again there are three types: rapidly adapting, which respond when a
limb is moved in a particular direction; slowly adapting, which respond to
both movement and static position; and positional receptors, which only respond
when a limb is in a static position. This perception affects both comfort and
performance. For example, for a touch typist, awareness of the relative
positions of the fingers and feedback from the keyboard are very important.
1.2.4 Movement
Before leaving this section on the
human’s input–output channels, we need to consider motor control and how the
way we move affects our interaction with computers. A simple action such as
hitting a button in response to a question involves a number of processing
stages. The stimulus (of the question) is received through the sensory
receptors and transmitted to the brain. The question is processed and a valid
response generated. The brain then tells the appropriate muscles to respond.
Each of these stages takes time, which can be roughly divided into reaction
time and movement time.
Movement time is dependent largely
on the physical characteristics of the subjects: their age and fitness, for
example. Reaction time varies according to the sensory channel through which
the stimulus is received. A person can react to an auditory signal
in approximately 150 ms, to a visual signal in 200 ms and to pain in 700 ms.
However, a combined signal will result in the quickest response. Factors such
as skill or practice can reduce reaction time, and fatigue can increase it.
A second measure of motor skill is
accuracy. One question that we should ask is whether speed of reaction results
in reduced accuracy. This is dependent on the task and the user. In some cases,
requiring increased reaction time reduces accuracy. This is the premise behind
many arcade and video games where less skilled users fail at levels of play
that require faster responses. However, for skilled operators this is not
necessarily the case. Studies of keyboard operators have shown that, although
the faster operators were up to twice as fast as the others, the slower ones
made 10 times the errors.
Speed and accuracy of movement are
important considerations in the design of interactive systems, primarily in
terms of the time taken to move to a particular target on a screen. The target
may be a button, a menu item or an icon, for example. The time taken to hit a
target is a function of the size of the target and the distance that has to be
moved. This is formalized in Fitts’ law. There are many variations of this formula, which have varying constants, but they are all very similar. One common form is
Movement time = a + b log2(distance/size + 1)

where a and b are empirically determined constants.
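The formula can be turned into a small calculator. The constants a and b below are placeholders chosen for illustration (in practice they are fitted from experimental data for a particular device and user population), so only the relative comparisons are meaningful:

```python
import math

def fitts_movement_time(distance, size, a=0.1, b=0.1):
    """Predicted movement time in seconds: a + b * log2(distance/size + 1).

    a and b are placeholder constants for illustration; real values
    come from fitting measured movement times.
    """
    index_of_difficulty = math.log2(distance / size + 1)
    return a + b * index_of_difficulty

# At a fixed distance, a larger target is predicted to be faster to hit:
small = fitts_movement_time(distance=200, size=10)   # ID = log2(21)
large = fitts_movement_time(distance=200, size=20)   # ID = log2(11)
print(small > large)  # True
```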
This affects the type of target we
design. Since users will find it more difficult to manipulate small objects,
targets should generally be as large as possible and the distance to be moved
as small as possible. This has led to suggestions that pie-chart-shaped menus
are preferable to lists since all options are equidistant. However, the
trade-off is increased use of screen estate, so the choice may not be so
simple.
