CHAPTER III. HCI - HUMAN-COMPUTER INTERACTION AND USER INTERFACE

In this chapter we look at the interaction between the human user and the computer or other information system, mostly considering the user interface (UI). We will look at each aspect in turn.

1. OUR APPROACH TO UNDERSTANDING HCI

The danger in understanding our interaction with the computer (HCI and UI) is that we focus on one or two aspects to the detriment of others. For example, many websites look great (aesthetic aspect) but are so badly structured (formative aspect) that you just cannot get the information you want. Have you ever got to a place on a shopping website where it gives you a message and you don't know what to do? To overcome this danger, we look at each and every aspect of the interaction between human and computer, the HCI. Doing so helps us recall all the various types of things that are important in successful human interaction with computers and other IT such as mobile phones, whether this be in user interfaces or multimedia.

1.1 Overview of Aspects of HCI

In the Human Experience chapter we gave an overview of aspects of the UI or HCI:

» Quantitative aspect: Amount and number of interactions and devices.
» Spatial aspect: Spatial arrangements, location and size.
» Kinematic aspect: Movement.
» Physical aspect: How both the UI and our bodies engage physically: forces, friction, light, vibration, etc.
» Organic (biotic) aspect: How the user interface matches our organs such as eyes, ears and hands, and whether it affects our health.
» Psychic aspect: Seeing colours, shapes, movement etc. on screen, hearing sounds, feeling vibration etc., controlling mouse, keyboard, etc.
» Analytic aspect: Identifying that the shapes and sounds are expressing concepts, and what type they are.
» Formative aspect: The structure of this information.
» Lingual aspect: What the information means, its content.
» Social aspect: The cultural connotations and acceptability of the information.
» Economic aspect: The limited resources of the UI and HCI.
» Aesthetic aspect: The design style of the UI and HCI: visual, aural and haptic, and how they harmonise; 'nice' touches.
» Juridical aspect: How well the UI does justice to the users or to the meaning of the information.
» Ethical aspect: The 'generosity' (or otherwise) of the UI.
» Faith aspect: What is the deep motivation behind the UI?

We will look at each aspect of HCI in detail in turn in this chapter.

1.2 Input and Output: Model, View, Controller

From several aspects, our interaction with IT may be seen as input and output. Input is where we give information to the computer or do something that it responds to, and output is where it gives information to us or does something to which we respond. Examples of input are where we double-click on an application, type text, operate a slider with the mouse to increase the volume of music, or thumb across our mobile phone screen to get to the next photo. Examples of output that the computer might give in response include: the window of the application appears, the words we type come up on screen and spelling errors are indicated, the volume of music increases, and the next photo slides into view. It is traditional to call the devices of the computer and its software that accept the input the 'Controller', and those that provide the output the 'View'. Behind these, 'inside' the computer, is the 'Model', which contains all the information received from the Controller and from elsewhere, and which is expressed in the View. Input and output, Controller and View, differ, and the characteristics of each suit the capabilities of both computer and human.

» Input from human to computer (via the Controller) tends to be slow and simple - an information rate of a few tens of pieces of information per second.
This suits the human because our ability to send messages to the computer is limited, and it suits the computer since its ability to recognise what the user wants is limited.
» Output from computer to human (via the View) is fast and complex: a screenful of information (hundreds of bits of information) can be given several times a second (such as in a fast-moving game). This suits the human since we can recognise and collate information, especially via our eyes, very fast, and it suits the computer, which can display or emit information very fast.

The View need not be a single window, but several. In fact, the View need not be just a screen, but can also involve sound output (loudspeakers) and other channels. The Model is the store of information 'in' the computer, and the View shows some of this information. The Model usually contains more information than is shown in the View. The Controller modifies some of the information held in the Model.

» For example, the Model might be a database of information about medical patients, and the View might show some of the data in one patient's record. Suppose the user issues a 'Delete' command to remove the record for the patient whose information is displayed in the View. The record is deleted from the database. Then the View is updated to show that the record has been deleted (for example, to show some information from another record, or to show a message saying "Record has been deleted").
» Suppose you are browsing a web page. You click on a hyperlink. The Controller works out that it is a request to find and display another page, and sends instructions to the Model (which happens to be elsewhere on the World Wide Web) to find that page. The page arrives, and the View is updated to show this page. Note that the Model need not be in the user's computer: it could be distant on the Internet.
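The Model-View-Controller division can be sketched in a few lines of code. This is a minimal illustrative sketch only, loosely following the patient-records example above; all class and record names here are invented for the illustration.

```python
# Minimal Model-View-Controller sketch (all names are illustrative inventions).

class Model:
    """Holds all the information - usually more than the View ever shows."""
    def __init__(self):
        self.records = {"P001": "Alice", "P002": "Bob"}

    def delete(self, record_id):
        self.records.pop(record_id, None)

class View:
    """Expresses some of the Model's information to the user."""
    def show(self, model, message=""):
        return f"{len(model.records)} record(s). {message}".strip()

class Controller:
    """Accepts user input and modifies the Model accordingly."""
    def __init__(self, model, view):
        self.model, self.view = model, view

    def handle_delete(self, record_id):
        self.model.delete(record_id)        # first change the Model...
        return self.view.show(self.model,   # ...then update the View
                              "Record has been deleted")

model, view = Model(), View()
controller = Controller(model, view)
print(controller.handle_delete("P001"))  # -> "1 record(s). Record has been deleted"
```

The point of the sketch is the division of labour: the Controller never draws anything itself, and the View never changes the data; each change flows Controller → Model → View, just as in the delete and hyperlink examples above.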
Problems with input - such as hitting the wrong key, or keys getting stuck, or giving the wrong command - are different from problems with output - such as misunderstanding what is on the screen, or not being able to see it properly. The difference between input and output occurs in the biotic to formative aspects. In other aspects, input and output are merged into our overall interaction. In most aspects we will discuss not only what occurs in our interaction as seen from that aspect, but also what challenges and problems there might be for user interfaces and multimedia. We will take the aspects in three groups:

» Pre-lingual aspects, which support the lingual.
» The lingual aspect, which is the most important aspect of HCI and forms a link with ERM.
» The post-lingual aspects, which affect the style and success of the HCI.

2. PRE-LINGUAL ASPECTS OF HCI

Now we go through the aspects of HCI in detail. We are using the aspects not so much as categories, but rather as a way of separating out the issues that are important in HCI and UI. Many discussions of the issues in HCI focus only on certain aspects and forget others. We will cover some aspects in more detail than others. We will look at technologies and techniques in each aspect. We will also look at quality criteria in some aspects (what makes a UI or HCI good or bad in that aspect) and various kinds of error that might afflict use of computers; each kind is usually explainable in one aspect.

2.1 Number of Things to Interact With (Quantitative Aspect of HCI)

This concerns amounts and counts of things. For example, how many windows are open? Some applications open several windows, to show several different things. Example: MSWord has the main document window and a window showing styles. Example: the Imagine 3D virtual reality creator has four windows, showing four views of the scene being created. So the quantitative aspect here refers to the number of windows.
But when we click the mouse or press a key on the keyboard, that should usually go to only one window. So the quantitative aspect here refers to one (1). But why count things? Usually for some other reason, relating to another aspect. By itself, the quantitative aspect of HCI (e.g. counts of things in the interaction) has little meaning. Rather, the counting of things is usually a prelude to considering another aspect. For example, Miller [1956] published a paper about the number of things users can keep in mind at one time ('The Magical Number Seven, Plus or Minus Two'). It is the number of 'chunks' of things on screen which the user is expected to be aware of; we will meet Miller again in the analytic aspect.

2.2 Screen Layout (Spatial Aspect of HCI)

The spatial aspect of HCI is particularly important on screens. For example:

» The layout of the screen: where things are, and where we expect them to be; for example, navigation links on a website are collected together in one place.
» The shape of things on screen. Usually rectangles for pictures. But also the shape of icons can help us recognise and locate them quickly.
» Spatial relationships and arrangements on screen. For example, we expect that things that line up above each other have something in common, such as in a list or table.

The importance of these is related to other aspects, as will be explained later. The mouse is excellent at functioning spatially: the mouse pointer indicates an exact position.

3D and 2D Space: Think about a virtual reality scene (e.g. in a 3D computer game) on your screen. There are two spatial aspects here: the positions and shapes on the screen itself, which are all in two dimensions, and the scene, which is in three dimensions. These are both spatial aspects, but one is HCI and the other is ERM.

» The two-dimensional space on screen itself is HCI.
» The three-dimensional space of the virtual scene is ERM, because it is what the information is about, namely 3D space.
However, in a two-dimensional game or a map, both HCI and ERM are two-dimensional - which can sometimes lead to confusion. The spatial aspect is very important in HCI, but we usually need the visual psychic channel to see it; see below.

2.3 Animation (Kinematic Aspect of HCI)

The kinematic functioning at the UI is particularly evident in animation. Our visual psychic channel is particularly sensitive to movement, so we tend to notice it. So movement is often used to attract (distract!) attention. However, that is a matter for the psychic aspect below. The kinematic aspect itself is concerned with movement as such, for example:

» Movement of mouse and mouse pointer.
» Movement of objects across a screen (e.g. the piece of paper in the Microsoft copying facility).
» Movement of our view across a landscape, e.g. as though we are flying across it.
» Flowing movement, e.g. of fluids in pipes.
» Non-visual movement: in music there is movement through the piece from beginning to end; in a document there is movement from beginning to end for the reader.

Some of these will be picked up again in the psychic functioning in the UI below. The kinematic aspect is relevant to HCI in at least three ways. It can be used during visual output as animation to attract attention (such as those annoying advertisements!). It can be used as decoration, to make the visual interface more 'lively'. But more important than either of these is the use of movement to let the user know what is going on. For example, in user interfaces of the past few years, windows on screen have not just appeared, but have moved into view, expanding from where the user clicked the mouse, or have moved out of view, such as quickly shrinking down to a small icon. Such movement provides subliminal information to the user about what the computer or mobile phone is doing, and this provides comfort.

2.4 Hardware Materials: Physical Functioning

This is the aspect of what is often called the basic technology.
It concerns, for example:

» materials,
» electricity,
» magnetism,
» light,
» vibration,
» shocks,
» and the like.

For example, your visual output might involve the physics of electron beams travelling through a vacuum tube to hit phosphor dots on a glass surface, causing them to emit light: the cathode ray tube (CRT) used in most screens until a few years ago. LCD screens, such as in a mobile phone, operate by other physical principles, specifically altering the orientation of complex crystals (LCD: liquid crystal display) so that they let light pass, or not. CRT and LCD have one thing in common: both produce colours by means of triples of light-emitting dots, red, green and blue; by giving out different amounts of these three colours, almost all possible colours can be generated. Loudspeakers work by converting electrical signals into vibrations in the air, either by electromagnetic or piezoelectric effects (applying an electric field to certain crystals causes them to shrink).

Only seldom do we need to actively consider the physical aspect of HCI, because in normal circumstances the physics works so well and reliably that we can take it for granted. Giving attention to the physical aspect is useful for at least two reasons. One is that it enables us to understand how things work, so we can perhaps understand better what is required, such as ruggedized equipment, or equipment that must work in exceptional physical conditions such as in space. The other important reason why we should be aware of the physical aspect of HCI is when things go wrong. When things go wrong, it is useful to know why they might have gone wrong, and how to prevent things going wrong in the future. Here are some examples:

» Power cuts!
» When your mouse ball is on a slippery surface it works unreliably.
» Jam sandwiches have made your children's hands sticky just before they use the computer, so the mouse and keys end up all sticky.
» When the internal mechanism of some keys or buttons is worn, or springs become weak, they don't work reliably.
» Coffee spilled on the keyboard is not good for it!
» Heat melts the case of your mouse or keyboard, distorting it.
» If your data is stored on magnetic disk (e.g. floppy disk), the data can be lost if a magnet gets near it.
» Bending a CD or DVD destroys it.
» Physical shock, such as dropping your computer or mobile phone, can make it go wrong.
» Overheating can make it malfunction, so do not block air vents.
» A lightning strike can destroy the electronics of your computer.
» Fire can burn it all.

The moral of all this is: ensure you keep good backups, and take care of your equipment.

2.5 Hardware 2: Devices Matching Bodily Characteristics (Biotic-Organic Aspect of HCI)

In HCI, the biotic/organic aspect is concerned with the actual hardware devices that engage with our sense organs - eyes, ears, hands etc. - regardless of what physics they employ. In Figure 1 we have:

» The ears and the loudspeaker,
» The eyes and the screen,
» The hand or fingers and the mouse, keyboard or touch-pad,
» The mouth and vocal organs and the microphone.

{*** Example: Think about your mobile phone, and how well or badly it fits your hand, fingers, and the distance between ear and mouth. These issues are of the organic/biotic aspect. ***}

From this aspect it matters little what material the mouse is made of; what matters is whether it fits the hand well: imagine a mouse the size of a desk - it would be unusable as a mouse!

Figure 1. Computer with input and output hardware

The organic/biotic aspect is also the realm of electronics (rather than electricity). Seen from this aspect, the various devices work by electromotive force (EMF, measured in volts) and currents (measured in amps) operating in conductors and components. Much of this is digital electronics, in which the EMFs (voltages) are limited to two values, such as 2.7v and 3.3v, or 0v and 5v.
These represent the binary alternatives of on and off (or 1 and 0), when seen from the psychic/sensitive aspect, later. But in the UI devices there is also some analog electronics, in which a continuous range of EMFs is operative. For more on this, see books on computer electronics. Here we will look only at the larger-scale devices. So Figure 1 shows the electronics that serves these hardware devices (sound hardware to convert digital voltages into analog for the loudspeaker, display hardware to convert digital voltages into analog to drive the thousands of microscopic light-emitting devices that make up the screen, and an analog-to-digital convertor for the microphone). These are linked to the central processing unit (CPU) and memory of the computer by conductors (the bus is a multiple conductor).

2.5.1 The Organic Aspect of Model, View and Controller

If we see the computer and its user interface in terms of model-view-controller, then the model in this aspect is the innards of the computer, including the printed circuit boards of the main memory and central processing unit, and the disks. This is shown in Figure 1. Three different channels of output (view) hardware link with three different human organs:

» Visual channel, relating to our eyes.
» Aural channel, relating to our ears.
» Haptic channel, relating to our body (often our hands).

The input (controller) hardware usually links with our hands and fingers, though there is also some sound input via microphones. The view (output) consists of the screen and the electronics responsible for the visual display on screen, the loudspeakers and sound electronics, and the force-actuators and the electronics that controls them.

» The visual display electronics consists of the display hardware (e.g. a graphics card), which accesses some of the main computer memory and converts the digital electric charges it finds therein into analog voltages to drive tiny light-emitting cells.
These cells emit light that is either red, green or blue, and the intensity of the light is controlled by the EMF (voltage) fed to them. The greater the EMF, the brighter the light emitted. As the EMF varies, so the amount of light varies. The tiny light-emitters are grouped in triples (red, green, blue), and there are typically a million such triples arranged in an array in a modern visual display, and 200,000 in a mobile phone display.
» The sound electronics consists of sound hardware (e.g. a sound card) that converts some of the electric charges found in the computer's main memory into a stream of analog current that is fed through the coils of loudspeakers. This current alternates rapidly, at frequencies usually between 300 and 3,000 times per second, and these make the coil and cone of the loudspeakers vibrate at a frequency that is audible to the human ear.
» The haptic force actuators press on our hands as vibration, or control movable seating (such as in immersive cinema). The electronics that controls this receives varying EMF from the central processing unit, and converts this into powerful currents sent through coils operating in magnetic fields, which create movement. (It is similar to loudspeakers but operating at lower frequencies.)

(The purpose of these tiny triple light-emitters cannot be understood from the point of view of the organic aspect, but only from the point of view of the psychic/sensory aspect. From that aspect we note that each different combination of red, green and blue light gives us the sensory experience of seeing a different colour. Most colours apart from flesh tones can be faithfully composed in this way. This illustrates how the biotic/organic aspect anticipates later aspects.)

The controller (input) consists of the input devices like mouse, keys and touch-sensitive screen, and the electronics responsible for linking these with the computer.
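The red-green-blue composition described above can be sketched numerically. This is an illustrative sketch only: the 0-255 scale and the packing of the three amounts into one number are common conventions assumed here, not something the text specifies.

```python
# Illustrative sketch of additive RGB composition: a colour is a triple of
# red, green and blue amounts (here on the conventional 0-255 scale).
def pack_rgb(red, green, blue):
    """Pack an (R, G, B) triple into one 24-bit number, 8 bits per component."""
    return (red << 16) | (green << 8) | blue

yellow = pack_rgb(255, 255, 0)    # equal red and green, no blue
white  = pack_rgb(255, 255, 255)  # equal, high amounts of R, G and B

print(hex(yellow))  # 0xffff00
print(hex(white))   # 0xffffff
```

With 8 bits per component there are 256 x 256 x 256, about 16 million, distinct triples, which is where the '16 million colours' figure later in this chapter comes from.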
2.5.2 Input Hardware Devices

We list input devices in approximate chronological order, older first (but sometimes still used). We indicate in bold text what the user does with each, but strictly each of these is of the psychic aspect.

» Switches and plug boards: Early computers received their information by people setting switches or plug-boards (boards with lots of holes into which plugs were inserted). These operate by making contact between circuits of the computer. Suitable for manufacturing process control, e.g. measuring temperature, pressure, liquid level. Users throw switches or plug the board holes.
» Paper tape reader: reads holes in punched paper tape by means of photoelectric cells. Tape can be any length. People have to punch the tape ahead of it being read and feed it into the tape reader. Suitable for batch, not interactive, input.
» Card reader: reads holes in punched cards, similar to paper tape, but each card has 80 columns. People have to punch the cards ahead of them being read and feed them into the card reader. Suitable for batch, not interactive, input.
» Swipe card reader: modern version of the punched card reader, where the information is held not by punched holes but in magnetic strips or in bar codes that are read by lasers. The user either swipes the card or holds it for the laser to read (as in supermarket checkouts).
» Keyboard: Human user presses or hits keys and these send electric pulses to the computer. Suitable for interactive input. The original version (1970s Teletype) was like an electric typewriter; today's versions have much more sensitive keys.
» Joystick: User pushes a stick in one of 8 directions; this causes various switches to close and send electric pulses to the computer, a different pattern of pulses for each direction. Some 'analogue' joysticks also send pulses that indicate how far the stick is pushed. Most joysticks also have a couple of keyboard-like switches that can be pressed or hit.
» Mouse: User moves the mouse, and this sends a stream of electric pulses to the computer that indicate how the mouse moves in two directions. (By this means, in its psychic functioning, the computer keeps track of where the mouse is.) Like the joystick, the mouse usually also has two keyboard-like switches that can be pressed or hit.
» Trackball: Like an upside-down mouse, with a ball which the user moves around in any direction; the trackball sends a stream of electric pulses to the computer that indicates its movement in two directions. One advantage over the mouse is greater precision for fine movement.
» Touch-pad and touch-screen: The user touches or strokes a pad (e.g. on a laptop) or the screen; this detects where on the pad or screen the finger was placed, and sends a stream of electric pulses to the computer whose pattern indicates this position. Used almost like a mouse.
» Microphone input: The user speaks, and the sound waves detected by the microphone are converted first to electric waveforms, which are then converted to streams of (digital) pulses that are sent to the computer.
» Camera: The user points the camera at a view. Light from the view is focused by a lens on an array of light-sensitive cells. These (or the electronics connected to them) emit pulses which are sent to the computer; typically 10 million pulses per picture.
» Direct connection to nerves: Tiny electrodes are inserted in nerve cells and pick up their electrical activity. When the user thinks, these nerve cells may be activated, and this activation is converted to electric pulses that are sent to the computer. This type of input device is still only experimental.

Thus all input devices except the first (switches and plug-boards) send electric pulses to the computer. What the computer does with these cannot easily be described from the point of view of the biotic (hardware) aspect, but makes sense only at the psychic aspect; see below.

2.5.3 Output Hardware Devices

» Screen: An array of tiny pixels (e.g.
1280 by 1024) that emit light of various colours. This can be: phosphorescent dots in a cathode ray tube, liquid crystal cells (LCD) that filter light to various colours, or plasma emitters. This gives output for the visual channel.
» Loudspeaker: A strong, light paper or plastic cone that vibrates at frequencies of up to 20,000 times a second to cause air vibrations that impact on our ears. The cone is vibrated by alternating electric currents running through a small coil suspended in a strong magnetic field. These alternating currents are created, via an electronic device known as a digital-to-analogue convertor, from electric pulses sent from the computer. Typically the computer sends 0.5 million pulses per second to each speaker (16 pulses 30,000 times a second). This gives output for the aural channel.
» Force output device: The force-feedback joystick not only sends pulses to the computer, but also has tiny powerful electric magnets attached that can be activated by the computer to provide force on the user's hand holding the stick. The force can be steady, vibratory or an impact. The electric currents that cause the force are converted from streams of electric pulses sent from the computer. Another form of force output device is the 'dataglove'. This glove has a number of cells that exert force at various points of the hand, in an attempt to make the hand feel as though it is touching something. This gives output for the haptic channel.
» Direct connection to nerves: The computer sends electric pulses to tiny electrodes implanted in nerve cells, and this activates those nerve cells, causing the user to be aware of various things such as snatches of music or a feeling of sadness. This type of output device is still only experimental, and there are many safety issues to address.

The electric pulses the computer sends to these devices cannot be properly understood until we take account of the psychic aspect.
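The loudspeaker figures quoted above can be checked with a little arithmetic. The sketch below simply reproduces the text's numbers (16 pulses per sample, 30,000 samples per second, per speaker); the two-speaker total is an illustrative extension, not a figure from the text.

```python
# Checking the loudspeaker data rate quoted above:
# 16 pulses (bits) per sample, 30,000 samples per second, per speaker.
bits_per_sample = 16
samples_per_second = 30_000

pulses_per_second = bits_per_sample * samples_per_second
print(pulses_per_second)      # 480000 - roughly the '0.5 million' quoted

# For two speakers (an illustrative extension, not from the text):
print(2 * pulses_per_second)  # 960000 pulses per second in total
```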
2.5.4 Challenges and Problems

The kinds of challenges and problems explainable at this level are those to do with the hardware, our fingers etc., and the electronics. Here are some examples:

» Loose connections! Devices plug into the computer - keyboard, mouse, loudspeakers, screen, etc. - and if a connection is dirty, it won't work reliably. This is particularly important in the most demanding public multimedia, where the problems of loose connections cannot be tolerated.
» Our fingers get onto the wrong keys, and so we mis-key.
» The computer's main memory has a maximum frequency (speed) at which it can deliver, or receive and store, the electric charges by which it works. They are delivered via a special set of conductors called a bus, or by 'direct memory access'. The central processor, display hardware and sound hardware must all share this bus. Sometimes up to half the available frequency is used by the display and sound hardware, leaving only half for the central processing unit - this slows the CPU down by a factor of two, or even more. This is particularly important in high-quality multimedia that has a lot of high-definition animation and sound.

2.6 Seeing, Hearing, Feeling, Moving (Psychic Aspect of HCI)

(In old versions of this lecture material, this was called the bit level.) It is with our psychic functioning that we see, hear, feel and respond with motor control: it is sometimes called the sensitive or sensorimotor aspect. It is the aspect of functioning that humans share with animals, of raw, uninterpreted sensations, but it is also the aspect of pattern detection and pattern recognition. It is the aspect of behavioural psychology, of stimulus and response. It is with the psychic aspect that the issues of HCI first become many and varied. This version of the section consists mainly of lists, which will be referred to in the lectures.
In this aspect we look at our interaction with the computer in terms of things that our sensory capability finds meaningful - such as colour, sound, shape and impulses - without beginning to interpret them. Thinking about the UI and HCI in this way helps us understand some basic psychological facts, such as how easily things are recognised, how to attract attention, and how quickly the user may be expected to respond to things.

2.6.1 The Psychic Aspect of Input/Output Devices

From the point of view of the psychic aspect we no longer speak of the hardware in mechanical, electronic or organic terms; we speak about signals and sensory phenomena. For each such signal (psychic aspect) we can use a variety of hardware devices (organic aspect), and each of those can work by different physical laws (physical aspect). For example, to send a signal about the position of the user's hand or finger (e.g. to control the screen pointer), we can use devices like the mouse, touch-pad, touch-screen or trackball. Figure 2 shows the signals that move around the computer in both input and output. You might notice that it is very similar to Figure 1, but there are subtle differences because, whereas Figure 1 interpreted it from the organic aspect, Figure 2 interprets it from the psychic/sensory aspect. The main differences include:

» Instead of naming human organs like eyes and ears, we name human sensory-motor activity like seeing and hearing.
» Instead of naming 'hardware', we name the devices 'convertors'.
» Instead of EMF, we have signals.
» The links between CPU, memory and UI devices are no longer conductors with varying EMFs, but are (bi-)directional channels, along which signals and byte streams are sent. So these channels have arrows which show in which direction(s) signals are sent. For example, mouse clicks are sent from mouse to CPU. Some links have signals in both directions (see below).
» The bus now transmits bulk streams of bytes to and from memory.

Figure 2.
Computer with input and output at sensory aspect

2.6.2 The Output Devices: Screen and Loudspeakers

Typically, the screen convertor is sent a 'start' signal with the address in memory of a stream of bytes which should be downloaded; this stream of bytes is called a bitmap. The screen convertor then begins downloading these bytes from memory (via the bus), treating them as bit patterns that indicate the colours of pixels. Once it has received all the bytes it needs, it might send a 'finished downloading' signal back to the CPU. It then sets all the pixels to the indicated colours, usually horizontally, line by line. Once it has finished all lines, it might send a 'finished displaying' signal to the CPU (and the program the CPU is running will then send the next 'start' signal, perhaps with a different address, to start downloading a different screenful). The sound convertor works in a similar way, but instead of setting pixels to various colours, it sends sound waveforms to the loudspeaker. The stream of bytes in memory is called a sample or waveform, rather than a bitmap, but as far as the memory is concerned they are the same - just a long stream of bytes set to various bit patterns.

Consider the bitmap, for example. Typically, the screen will have 600 rows each of 800 pixels (that is, 480,000 pixels), or 1024 rows each of 1280 pixels (1,310,720), or some other number. The bitmap consists of the appropriate number of cells, each of which holds a bit pattern, and each of which is the same size, for example 8 bits (one byte). Each cell corresponds to one pixel on the screen. The bit pattern in a cell indicates the colour that the pixel should show. So, for example, the bitmap for a 1280 by 1024 screen would consist of 1,310,720 cells. This is the number of bit patterns of which the bitmap consists.
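The cell-per-pixel layout just described can be sketched directly. This is an illustrative sketch, assuming one byte per cell and the usual row-major cell ordering (cell index = row x columns + column); the bit-pattern value set at the end is an invented placeholder, not a real colour code.

```python
# A sketch of the bitmap described above: one cell (here one byte) per pixel.
ROWS, COLS = 1024, 1280          # a 1280 by 1024 screen
bitmap = bytearray(ROWS * COLS)  # all cells initially hold bit pattern 00000000

print(len(bitmap))               # 1310720 cells, one per pixel

def set_cell(row, col, bit_pattern):
    """Store a bit pattern (the pixel's colour code) in the corresponding cell,
    assuming row-major ordering: cell index = row * COLS + col."""
    bitmap[row * COLS + col] = bit_pattern

# Set the pixel at row 512, column 640 to an arbitrary example pattern.
set_cell(512, 640, 0b01100100)
```

It is exactly this long run of cells that must be streamed to the display convertor on each 'start' signal, which is what the following arithmetic is about.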
This means that, for a 1280 by 1024 screen in which each cell occupies one byte, 1,310,720 bytes must be streamed from memory to the display convertor each time the CPU says 'start'. This must happen typically 50 times per second. That means that, for a screen of 1280 x 1024 resolution, the display convertor must download 50 times 1,310,720 = 65,536,000 bytes (65MB) per second from memory. This is a not insignificant proportion of the maximum data speed of the bus and memory. While the display convertor is downloading bitmap byte streams, neither the CPU nor anything else can access the memory. This slows down the CPU's speed of operation. It gets worse. Sometimes each pixel colour is represented by not 8 but 32 bits (4 bytes), so that over 260 million bytes must be downloaded per second. This slows the CPU down even more. The sound convertor also makes a drain on memory and bus (especially for high-quality quadraphonic sound), though usually not as high as the display convertor. Typically, the CPU can be running at half speed because it has to share the bus and memory with these convertors. In some computers and hardware configurations, steps are taken to reduce this bus congestion. One way is to have separate graphics memory, as graphics cards do. The bitmaps to be displayed are copied from main memory into graphics memory when the screen must change. In the old Amiga computer, the graphics memory is not separate, but is part of the main memory; however, there is also a part of main memory to which the convertors have no access, so in that memory the CPU runs at full speed. With these arrangements of separate or special graphics memory, the bus transmits bitmap streams only when the screen must change. So, for example, with a screen showing MSWord and the user typing at one character per second, the screen changes only once a second, and the change is tiny. But during fast animations, the screen changes up to 50 times a second.
In such cases, graphics cards offer little benefit (though the shared memory of the Amiga does still offer a benefit).

2.6.3 Rendering

Each bit pattern indicates the colour the corresponding pixel should show. Suppose, for a certain one-byte-per-pixel 600 by 800 convertor, bit pattern 00101100 means white, 01100100 means red and 00001111 means black (I selected these bit patterns at random). Then:
» If the whole bitmap held the pattern 00101100 repeated 480,000 times, the entire screen would show white.
» If the whole bitmap held the pattern 00001111 repeated 480,000 times, the entire screen would look black.
» If the first 120,000 bitmap cells held 00101100, the next 120,000 cells held 01100100, and the remaining 240,000 cells held 00001111, then the screen would show a horizontal bar of white, a bar of red and then a wider bar of black.
» If most of the 480,000 cells held 00001111 but some in the middle held 01100100, then we would see a black screen with some red somewhere in the middle. By calculating which cells should hold 01100100, we can ensure that the red in the middle of the screen shows shapes that we wish, such as lines, rectangles, circles, spirals, or other complex shapes.
That is the principle of rendering: calculate which cells of the bitmap should hold which bit patterns. It is the CPU that sets the bit patterns into the memory that holds the bitmap, and the program which the CPU obeys determines which cells are set to which bit patterns. Sometimes a pixel requires 4 bytes and at other times only 1 byte. This is because there are two main ways to indicate pixel colours by means of a bit pattern.
» Direct RGB. The colour of the pixel is determined by how much red, green and blue light is emitted. For example, yellow is seen by us if equal amounts of red and green are emitted with no blue. White is seen by us when there are equal, and high, amounts of R, G and B.
In the direct RGB mode, the bit pattern contains three numbers indicating the amounts of R, G and B which the pixel has to show. Usually this is 8 bits (1 byte) for each colour component (with a fourth byte for 'transparency', which we do not discuss here). 8 bits allows 256 different levels of each colour component, from none (0) to full (255).
» Colour-lookup table (CLUT). The colour lookup table holds RGB triples for a limited selection of colours (typically 256). Each triple is three numbers that determine the amount of each of red, green and blue that should be displayed for this selected colour. The bit pattern held in the bitmap for each pixel is then used as an index into this table. The display convertor uses this index to look up the particular RGB triple, retrieves the amounts of R, G and B and sets the appropriate pixel to these. In this way, each pixel requires only one byte in the bitmap, reducing memory and bus loading by a factor of four. Note that with CLUT, the display convertor must be sent a table of colours before any display can occur; this table of colours is called a Palette. CLUT limits the number of colours that can occur on a screen to the number of entries in the table. Usually this is 256, which is more than enough for most purposes, such as the desktop and web browsing, though other numbers are available; the Amiga allowed any number of colours from 2 to 256 and even a mix of these. For high quality photographic images, however, 256 is not enough. Direct RGB allows 16 million different colours because, in principle, each pixel could emit any colour. One might think that with modern fast electronics there is no need for CLUT, and that all should be done via direct RGB. This is not true, because CLUT has certain important properties:
» Good for diagrams rather than photographs. In diagrams, often the exact colours do not matter; what is important is that they should be easily distinguished.
With direct RGB it is often all too easy to get blurring of colours.
» Good for visual impairments. Suppose you are red-green colour blind: in a diagram that has some red and some green, you can reset the palette so that the red entries are purple instead, and then distinguish them more easily.
» Good for changing colours. By changing the RGB triple in one entry in the CLUT, you immediately change the colour of all pixels of that entry. This is useful in certain types of animation of diagrams, such as colour cycling animation. Under direct RGB, you would have to find each and every pixel and change its colour separately, which is much slower.
We return to the visual output channel later. In the meantime, we must consider input devices.

2.6.4 Input Devices

The data flow from input devices is much slower, and can often be transferred directly to the CPU (or some similar processor) as signals, without using the bus. The mouse, for example, sends signals about how it is moving, at a rate of the order of a hundred bits or bytes per second. The touch pad is similar. The keyboard is even slower. Input from the microphone is somewhat faster, typically between 10,000 and 50,000 bytes per second - which is still far slower than the output convertors. Here are a number of devices, with what the user does to interact with them (organic aspect), the type of signal each sends - whether a single bit, a byte, several bytes, etc. (psychic aspect) - and, anticipating the analytic aspect, what the signals encode, which differs for each device:
» Switches: a signal of one bit is sent every time a switch is thrown.
» Paper tape: as each row on the tape passes the reading heads, a bit pattern, one byte in size, is sent to the CPU: a byte stream.
» Cards: as each card is read, a set number of bit patterns (typically 80 or 132) is sent to the CPU.
» Keyboard: when a key is pressed, a single bit pattern, usually a single byte, is sent to the CPU, a different pattern for each key.
» Joystick: when the joystick is pushed in any of 8 different directions, a signal indicating the direction is sent to the CPU.
» Mouse: as the mouse is moved, pairs of bit patterns are sent to the CPU, one indicating the amount of movement left and right, and one indicating the amount of movement to and away. When a mouse button is pressed, a one-bit signal is also sent.
» Trackball: as mouse.
» Touch pad: as the finger moves across the touch pad, pairs of bit patterns are sent to the CPU, as for the mouse and trackball.
» Microphone: bytes are sent regularly (typically 20,000 per second), each byte containing a bit pattern that indicates the amplitude of the waveform each 20,000th of a second (or however fast). This rate is called the sampling speed.
» Camera: an entire bitmap, similar to that of output screens, is sent to the computer, usually direct to memory rather than via the CPU.
(Direct nerve connections are not discussed here.)

2.6.5 Input Device Signals

The signals from the input devices can be used to input information to the computer. So, for example, the pairs of bit patterns from a mouse are accumulated continuously by the operating system, to give a continuous indication of where the mouse is. As each pair is received, the View is changed so that the mouse cursor is seen in a different position. In this way, the mouse cursor follows the movement of the mouse. Likewise, whenever a bit pattern is received from the keyboard, it is converted to the bit pattern for the character on that key (when the K key is pressed, for example, the resulting bit pattern is usually either that for 'k' or for 'K' in ANSI code or Unicode). Then the appropriate character is added to the View screen bitmap, and this gets displayed. It is not only the bit patterns and signals themselves that produce effects. So can the following:
» Time between events (e.g.
this differentiates click and drag)
» Order in which signals occur (e.g. left mouse button (LMB) down, mouse move, LMB up is a drag, but LMB down, LMB up, mouse move is a click followed by an arbitrary movement)
In addition, the controller keeps a record of the current mouse position and also a list (history) of recent positions with their time stamps. In this way, it can analyse various complex gestures. This is important in pen-driven computers, where the user 'writes' letters on the screen and these are recognised. Issues that concern such actions include:
» Max time (down-up) for click
» Max time (up-down) for double-click
» How often mouse coordinates arrive
» How often a key repeats if held down
» Visible and audible feedback of each action, e.g.:
» Mouse cursor moves with mouse
» Audible click on key presses
» Complex interactions between these

2.6.6 Output Channels: Strengths and Weaknesses

The three output channels have various strengths and weaknesses:
» The visual channel gives us colour and a very sophisticated sensory access to the spatial aspect, because the retina of the eye is two-dimensional, and also to the kinematic aspect, since the eye is very sensitive to visual changes. This allows great precision. It offers a high bandwidth (high rate of information flow) because the way our eyes process information is highly parallel (a lot of operations at one time). For this reason it is the main output channel for IS. But its main disadvantage is that it is directional - you need to be looking at the UI.
» The aural (audio) channel of sound is not directional: you can hear sounds that come from any direction. So the aural channel can grab the user's attention, using distinct sounds, and different distinct sounds can convey meaning. This is especially useful for telling the user there has been an error or some unusual event. Often users respond better to (short) sounds, e.g. sound bites of comments by satisfied customers.
Sound is especially useful when used in conjunction with the visual channel, because it can convey emotion and cultural associations well, and also the passing of time. Sound can even convey some spatial awareness around the user, making them feel more 'inside' what is going on, which is useful for virtual reality. But there are several disadvantages of sound. Sound is transient: present with us at the time we experience it, but after that it is gone. Recalling it requires an exercise of memory, and for most people audio is not as memorable as visual media. One answer is to install a replay facility. Files of sound samples can be large (though music files can be smaller, since they are made up of instructions, e.g. MIDI, SoundTracker). Also, sound output is a nuisance in an open plan environment! It is imprecise, especially when there is a lot of other sound around.
» The haptic channel depends on our touching the activator. Recent technology includes force-feedback joysticks - and the vibrator in mobile phones. This channel is much more limited than the other two in terms of the information that can be transmitted through it. It is usually used, at present, to simulate or duplicate physical phenomena, maybe amplified or modified in some way. e.g. a robot in a hazardous environment: when it hits a wall, its controller feels an impact. e.g. a virtual surgeon probing an orifice: the user feels the constriction of the orifice.
And, of course, for some people, like the blind or deaf, certain channels are not available. The most fruitful strategy is to use the channels together, sound and perhaps touch providing an extra channel of information to assist the visual. Studies have shown that using sound and video together increases learning when the sound is closely related to the visual (e.g. when text is read out loud), but greatly reduces learning when the sound is not related (e.g. in adverts on websites). Be careful, therefore.
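To see why files of sound samples get large, here is a back-of-envelope sketch (a hypothetical calculation, assuming one byte per sample and the 20,000-samples-per-second rate used elsewhere in this chapter):

```python
# Uncompressed sample storage grows linearly with duration, rate and channels.
def sample_file_bytes(seconds, sampling_rate=20000, bytes_per_sample=1, channels=1):
    """Bytes needed to store `seconds` of uncompressed audio samples."""
    return seconds * sampling_rate * bytes_per_sample * channels

print(sample_file_bytes(60))              # one minute, mono: 1200000 bytes
print(sample_file_bytes(60, channels=4))  # one minute, quadraphonic: 4800000 bytes
```

A one-minute instruction file (e.g. a MIDI score) would typically be only a few thousand bytes, which is why instruction-based music files are so much smaller than sample files.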
Each channel offers a number of features that are meaningful in the psychic aspect, and which can be used to convey information. Note that the lists below are not full and complete: you should try to add other factors.

2.6.7 The Visual Output Channel

The visual output channel (screen) enables us to see information selected from the Model in visual form. This information is conveyed by means of phenomena we can see, including:
» colours
» shapes
» spatial arrangements such as distance, length, angle
» movements and other changes in the visual field.
Since the visual channel works by setting a bitmap of pixels to various colours, it can offer a wide range of visual phenomena, some purely sensory, some helping us to sense the spatial aspect, and some helping us to sense the kinematic aspect. Colours, shapes, spatial arrangements etc. are all seen by virtue of rendering the appropriate pixel cells with appropriate bit patterns to set the pixel colours. Movement and other changes are seen when two or more different bitmaps are downloaded by the display convertor, one after the other. This gives animation or flashing or changing colours. The eye is excellent at receiving a huge amount of such information simultaneously. This is made possible by the parallel processing that occurs in the nerves of the eye and the visual cortex of the brain, and the tendency to detect and recognise patterns that have been learned. Learned patterns are those that we recognise without thinking, and include, for example:
» basic shapes such as short vertical line, cross, circle, etc.,
» the shapes of letters and digits we learned in our early days,
» the basic shape and features of the human face,
» basic outlines of animals, trees, etc.,
» the green of vegetation, blue of sky, grey of clouds, dull colours of buildings, etc.,
» the flickering of flames,
» the three circles of red, amber, green of traffic lights,
» and so on.
How the neurones of the brain remember these is called 'long-term memory', and can be read about in psychology textbooks.

2.6.8 Basic Visual Phenomena

The visual output channel (screen) can show a variety of visual phenomena that can carry information. Here we list them, so as to reference them later. They fall into three main groups.
# Purely sensory visual phenomena, to do with sight and colour:
» hue (red, brown, green, etc.)
» saturation (how much white is mixed with the colour, e.g. pink is red with some white)
» brightness (also called value)
» texture (patterns of colour, such as grids)
» colour shading
» background and overall colour scheme
» what colours are available (called the 'palette')
# Visual phenomena serving the spatial aspect: the eye sees whole shapes, which are made up of groups of pixels all of the same or a related colour. For example, if the same 100 pixels in each of the first 50 rows emit red light, then the user sees a red rectangle. With this in mind, the arrangement of pixels of certain colours can make the user see the following:
» shapes of any type, some of which we might recognise (such as the shapes of the letters of the alphabet, i.e. a font)
» size or length of shapes
» position of shapes
» distance between shapes
» orientation of shapes
» angle subtended by shapes, especially lines
» spatial alignment of shapes, e.g. vertically above each other
» shapes touching or connecting with each other
» spatial patterns like surrounding or overlapping
» perspective (smaller shapes seeming more distant)
» background
» multi-colour scenes like photographic images
# Visual phenomena serving the kinematic aspect: the computer alters the colours of pixels in a precisely timed way that gives the impression of movement or change.
» Movement of objects across the screen
» Speed
» Direction
» Relative motion
» Changes in size
» Flowing motion, field motion
» Colour changes, flashings
» Morphing from one shape to another

2.6.9 The Aural Output Channel

The aural channel works by setting up waveforms in the computer's memory, which are sent to the loudspeakers by means of specially designed electronics. Each sound starts, is sustained for a time, then fades away. In musical instruments the start is called the 'attack', and the fading, 'decay'.
» Sounds in general have the following qualities:
» Volume of sound
» Pitch of sound (high, low; which note is played in music)
» Quality of sound, such as whether pure or fuzzy; vibrato
» Rates of attack (build-up) and decay (fading)
» For music, or any combination of sounds in sequence, we have in addition to the above:
» Chords, discord
» Melody
» Rhythm
» Tempo
» For speech, which is a sequence of phonemes (the basic sounds of speech), we have the following qualities (see below):
» accents
» gender
» emphasis
» rising and falling of the pitch and volume
» rhythm

2.6.10 The Haptic Output Channel

The haptic channel works by controlling a mechanical activator in contact with the skin. It offers:
» The feeling of resistance, e.g. as the user tries to move a joystick
» An impulse (kick) given by the computer
» Vibrations of various types
» 'Texture' of surface (e.g. sandpaper, cloth, shiny metal)
» A feel of pressure, springiness, softness, etc.

2.6.11 Making up the View

At the psychic level the challenge that HCI designers have is to make all output realistic. One important challenge lies in bringing sound and vision together. The main issue at the psychic aspect is 'lip sync': the sound of speech must be accurately aligned with the movement of the speaker's lips if these are seen (or the sound of a hammer hitting an anvil must occur at exactly the same time as we see the hammer hitting it).
If this is misaligned by even a few milliseconds, we notice it and feel uncomfortable. It is very difficult for computer systems to get the alignment so precise, because the screen typically refreshes no faster than every 20 milliseconds. Another challenge is found in haptic output. A 'dataglove' can, in principle, make the hand that wears it feel anything. For example, to make the hand feel it is grasping a stick, gentle force would be exerted on the inner flesh of all fingers. Several difficulties arise. One is that if the fingers are in fact straight, then the user will not believe they are grasping a stick, so the haptic output must take account of haptic input indicating the current position of the fingers. Another, similar difficulty is that if, say, only two fingers are curled, then the stick-grasping force should be applied only to those fingers that are curled.

2.6.12 Some Experiments and Theory Related to the Psychic Aspect of HCI

It is the field of stimulus-response psychology that has provided experimental results and theories to help us understand our psychic functioning more precisely. It investigates our ability to detect and recognise patterns (visual, aural, etc.), to remember patterns, and the time responses we have. One important result is Fitts' Law [Fitts 1954]; see also Eberts [1994, p.175]. How fast can a human being respond? A participant sits in front of a blank screen, controlling a mouse or some similar device. At a random time, in a random place on screen, a shape of random size appears. The participant must move to hit it as fast as they can. The time to hit correctly is measured (in milliseconds). Time to hit is found to be made up of the following components:
Time = k1 + k2 * log( 2D / Size )
D = distance between where the participant is aiming (e.g. mouse cursor) and where the target appears.
Size = diameter of the target.
k1 = a constant, which is a kind of minimum time (e.g.
if a large target appears right where the cursor currently is), due to the participant's speed of noticing it and pressing a finger. k1 is different for each person.
k2 = a constant, different for each person, showing how much distance and size affect them.
'log' refers to the logarithmic function, here in base 2: log(1) = 0, log(2) = 1, log(4) = 2, and so on.
Implications for HCI: this can be useful in very dynamic, fast-moving interfaces, such as in computer games or simulations. For example, suppose, in a battle-terrain type of computer game, you are flying low in an aircraft over a mountainous terrain. As you come up over a ridge, you have to spot enemy locations and fire at them before they fire at you. Fitts' Law would tell the game designer that if the target is large and near where your gun is already aimed, then the player will be able to aim faster and more easily, but if the target is small and not where the player is currently aiming, it will take longer. So, in earlier, easy levels of the game, it is sensible for the computer to present large targets near the present aim, but on later, harder levels, to present smaller targets always where you are not aiming. A good game is challenging but not impossible, so the game designer needs to know how large the target needs to be and where to place it. Fitts' Law can help her/him work this out. This can be combined with effects that are understood under the analytic aspect, such as the number of things a person can be aware of at once: the more enemies that appear, the more difficult the level. See below.

2.7 Awareness of Important Things (Analytic Functioning at the UI)

From the vantage point of the analytic aspect of HCI, we see the HCI and UI in terms of basic pieces of data, such as numbers, entities and words.
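As an aside, the Fitts' Law prediction discussed in the previous section can be sketched numerically. This uses the conventional base-2 ('Shannon') formulation; the constants k1 and k2 below are made-up illustrative values, since in reality they differ from person to person:

```python
import math

# Illustrative Fitts' Law calculator; k1, k2 are invented example constants.
def fitts_time_ms(distance, size, k1=200.0, k2=100.0):
    """Predicted time in ms to hit a target of diameter `size` at `distance`."""
    return k1 + k2 * math.log2(2 * distance / size)

# A large target near the current aim is predicted to be hit faster
# than a small target far from it:
print(fitts_time_ms(distance=50, size=100))  # 200.0 ms
print(fitts_time_ms(distance=500, size=10))  # about 864 ms
```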
We are concerned with three main issues here:
» Distinction: how easily the user can distinguish what is important among what they receive psychically from what is less important;
» Attention: the user's awareness of, or focus upon, what is important;
» Conceptualisation of it as various types of data.
This can be with any of the channels: visual output, sound output, haptic output, or motor input. First we look at the distinguishing of basic pieces of data by the visual and aural channels, in text and speech. Then we look at attention and focus. Then we look at the basic types of data as such and the notion of 'affordance', which recognises that certain psychic phenomena carry certain types of information better than others.

2.7.1 Text: Fonts, Letters and Digits

Fonts are basic shapes used to express letters, digits and other characters via the visual channel (printing or screen). For a font to work well, it must be very easy to distinguish what letter or other character it refers to. There are two important characteristics, especially if any of the users might be visually impaired - even slightly so. The first is that with lower case it is easier to distinguish words than with upper case, because the outline-shape differs when using lower case:
cat dog
whereas with upper case, the outline-shape is the same, just a rectangle:
CAT DOG
The second important thing is the actual shape of the letters or digits and how easily they can be distinguished from each other. Compare the following fonts to see how easily you can distinguish various digits from 8 (the ones before the break should be more difficult in some fonts because those digits are slight modifications of 8).
8283858689808 8184878
8283858689808 8184878
8283858689808 8184878
Notice how in the third font it is slightly easier to differentiate the 9 and 6 from the 8, despite it being smaller than the first, because the 9 has a descender and the 6 points upwards rather than turning down.
Arial font (not shown above) is particularly bad for digits - which is a pity because it is the standard one for spreadsheets.

2.7.2 Distinctions in Speech: Phonemes and Words

Speech is composed from individual waveforms called phonemes - the very basic sounds of speech. To speak text involves at least two stages: converting the text to suitable phonemes, then rendering the sequence of phonemes into the sound output buffer. Algorithms are available for the first step, which converts text into a string involving the International Phonetic Alphabet. Here is some of the IPA:
» IY - vowel as in beet
» EH - vowel as in bet
» IH - vowel as in bit
» N - consonant as in men
» NX - consonant as in sing
» EY - diphthong as in made
» OY - diphthong as in boil
Converting text to IPA is not as easy as it sounds, because it involves knowing how each word is to be pronounced - for example the word 'lead' can be pronounced 'leed' or 'ledd'; which is correct must often be worked out from the context, and often the algorithm makes the wrong choice. The challenges include:
» Finding the correct consonant
» Finding the correct vowel or diphthong
» Adding appropriate stress to syllables
» Altering intonation
» Controlling the pitch and speed of speaking, and how they vary at various points in the sentence; e.g. pitch or speed is often lowered at the end of a sentence and raised at the end of a question
» Maximising intelligibility: for example polysyllabic words like 'enormous' are often more intelligible than monosyllabic words like 'huge'.
» and much more.
The second stage involves converting the phonemes to actual sound (or rather a waveform that is converted to sound via the hardware). One way is to use a method like that used for music: use samples of individual phonemes.
But this has its own challenges:
» Merging the phonemes together in a way that is smooth
» Keeping the appropriate gaps between words
» Differentiating male from female voices, and also accents
» Altering the notes for stress and intonation
» Raising and lowering pitch at the end of a sentence, etc.
» Raising and lowering speed
» and so on.
Computer speech and music are an area where exciting advances can still be made; for example, there is very little attempt at computer singing!

2.7.3 Attention and Focus

Miller [1956; see also Eberts 1994, p.169 ff] investigated how many things we can be aware of at any one time. In a paper entitled 'The Magic Number Seven, Plus or Minus Two' he found the answer: around 7 - some people can only be aware of 5, others up to 9. If more things are in the visual field then we suffer a kind of information overload and just don't notice some of them. This can explain why fiddling with the radio while driving can cause accidents. Of course, we usually have many more than 9 things in our visual field, so how do we cope? The answer is 'chunking': we group things into chunks. For example, we see a group of children playing by the side of the road: that is one chunk - until one of them starts running across the road, when that one becomes a chunk in its own right. A chunk is when we recognise that a number of things 'belong together', and so we do not distinguish between them, but distinguish the thing that these make up from the rest. There are several challenges for HCI, user interface and multimedia. Miller's result suggests: don't require the user to be aware of more than five things at once; design your UI accordingly. Design your graphics presentation with a maximum of five lines of text on screen. Design your multimedia to present only five main visual effects. Design your web page so that there are five main visually distinct areas on the page.
Group things that the user will see as of similar meaning: make it easy and natural for the user to chunk things that they should see as one group, for example by making the things in a group a similar shape, size and colour and placing them next to each other. Of course, in games, these guidelines might be reversed, especially for advanced levels where the player needs to be challenged. For the earlier example of a battle-terrain computer game - where the player is flying low in an aircraft over mountainous terrain and, coming up over a ridge, has to spot enemy locations and fire at them before the enemies fire back - Miller's result would tell the games designer to ensure that at the easier levels there are no more than five things of importance on screen (trees, buildings, enemies, etc.), but at higher levels to make sure there are more than nine. At medium levels, let there be lots of things but mostly of the same type, so the player learns to chunk them, before progressing to higher levels. Results like Miller's introduce the issue of clutter. Clutter is where the number of things is just too large and the user cannot find any means of chunking them. That is, there is no obvious way in which the things can be meaningfully grouped together. Inter-channel interference. There have also been experiments on how different channels interfere with each other. For example, movement attracts attention. So does sound. So if a web page, for example, has an animated advertisement, it will keep attracting your attention - and get annoying. Sound can also interfere with, or support, what is being read.
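As a toy illustration of designing for chunking, the sketch below groups hypothetical screen items by a shared property (here colour), so that the user perceives a few chunks rather than many separate items; the item names are invented for illustration:

```python
from collections import defaultdict

def chunk_items(items):
    """Group (name, colour) screen items by colour, one chunk per colour."""
    chunks = defaultdict(list)
    for name, colour in items:
        chunks[colour].append(name)
    return dict(chunks)

screen_items = [("save", "blue"), ("load", "blue"), ("play", "green"),
                ("stop", "green"), ("help", "blue")]
print(len(chunk_items(screen_items)))  # 2 chunks instead of 5 separate items
```

Giving the items in each group a similar shape, size and colour, and placing them together, is what lets the user's eye do this grouping without conscious effort.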
At the user interface, the sensory-psychic functioning of the user usually carries some types of data. Different types of psychic functioning 'afford' different abilities to express these. Affordance is covered in Chapter VI.

2.8 Help With Achieving Your Purpose; Structure and Relationships (Formative Aspect of HCI)

The formative aspect is concerned with formative power, with the shaping or construction or structure of things, with putting things together rather than just leaving them in a random pile. We look at the structure of the UI, then of the data itself, then at modification of data.

2.8.1 Structure of the User Interface Itself

The layout on screen is not (usually) random, but is structured (formed) in a way that helps the user understand the symbols on it. For example:
» the middle of the screen is where the main information occurs
» the edges of the screen are where the navigation and other general information occur
» the top of the screen is where the title occurs
Within the main information itself we can also find structure, for example:
» tables show similarity of things down the columns and across the rows
» bullet lists show a collection of things that are different in a certain way
» text is structured into linear sentences that obey a certain syntax,
» ... and these sentences are visually structured by wrapping the text into paragraphs
» box-and-arrows diagrams are structured such that each arrow must start and end at an item
» ... and so on.
The structure of the layout is very important in helping us understand the meaningful content of the screen. In this way the formative aspect serves the lingual aspect. In designing the structure of the screen the developer considers what the user might want to achieve (thus considering the formative aspect of the HLC of the user) and tries to match the spatial structure on screen to that requirement.
For example, on a web page the navigation bullets are usually kept separate from the text, on the right or left side, so the user can see them easily and know where they are. Sometimes they are in the middle of the text, e.g. at the foot of every section; this makes them very convenient. {*** Think: Imagine what it would be like if your screen (e.g. on your mobile phone) had no structure. Try to work out *why* each piece of information is where it is. ***} Sound also has structure, though it is mainly a linear one. This is why good syntax of speech is so important: without it we would get confused. Music, too, has structure.

2.8.3 Styles of User Action

Style is perhaps one of the most misunderstood issues of the early Nineties, and the one that has the most hype surrounding it. Everybody nowadays is rushing to adopt WIMP, GUI, etc., and Microsoft made a killing in the late 1980s out of this tendency to jump on bandwagons. Our aim here is to get beneath the surface and understand the issues involved. There are a number of common styles of dialogue:
# Commands, in which the user types in commands and supplies various parameters to guide their execution. The Command style of dialogue is the oldest in interactive computing, and perhaps the most flexible.
# Menus or Toolbars, in which the commands or objects are selected from menus rather than identified by name.
# Question-and-Answer, in which the computer asks a question and the user supplies the answer, repeatedly; e.g. 'Are you sure?' on deleting something.
# Form filling, in which the computer puts a form up on the screen with a number of spaces which the user fills in.
# Direct Manipulation (DM), in which the user selects objects (usually with the mouse) and identifies commands by graphical movement, such as drag-n-drop.
# Control Panel, in which the computer supplies what looks like a control panel with knobs, etc., and the user identifies what needs to happen by hitting these with the mouse pointer.
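As a minimal sketch of the Command style, a dialogue loop might split a typed line into a command name and its parameters as below; the command and file names are invented, not taken from any real system:

```python
def parse_command(line):
    """Split a typed command line into (command, parameters)."""
    parts = line.split()
    if not parts:
        return None, []          # empty line: no command
    return parts[0].lower(), parts[1:]

cmd, params = parse_command("COPY report.txt backup.txt")
print(cmd)     # copy
print(params)  # ['report.txt', 'backup.txt']
```

Note how the style's flexibility comes from the parameters: any value the user can type can be passed, which is precisely the high value and item freedom discussed below in this section.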
The DM style usually assumes an Object-Oriented structure, since it is based around the idea of direct operations on selected objects. In this lecture we will look at the Command and DM styles in more depth.

{*** You should read chapter 13 of Preece. ***}

Locus of control (refce, 19) refers to whether the user or the computer is in control of the interaction activity, that is, whether the user or the computer takes the initiative for each action. In the middle we have what have been called Mixed Initiative systems. Here we examine the issues involved. 'Locus of control' is not a good term, since the idea of control is also present in, for instance, safety and security of data, and that is a different concern. (That is, who has control over - or access rights to - certain data: it is important sometimes that the user does not have such access rights. Examples include confidential data, keys in data tables, and data inside the operating system. But this is NOT what we refer to when we speak of 'locus of control'.) Therefore, we speak rather of a spectrum of freedom - whether the user or the system has certain types of freedom in the activity at the user interface. There are several aspects of such freedom:
» Action freedom - to determine what happens next
» Value freedom - in the range of data values to enter
» Item freedom - in the range of types of item to attend to
The various dialogue styles vary in these freedoms:

Command style:
  Action freedom: High (as wide as the commands available)
  Value freedom: High (as wide as any value that can be expressed)
  Item freedom: High (as wide as any item that can be indicated)

Menu style:
  Action freedom: Only whether or not to select from the menu, plus any actions present on the menu itself
  Value freedom: Restricted to those offered on the menu
  Item freedom: As value freedom

Toolbar style: As menu. 
Question and answer style:
  Action freedom: None - must answer the question presented (though many question panels also have other action buttons, e.g. for help)
  Value freedom: Little; as dictated by the question
  Item freedom: Usually none; cannot select a different question

Form style (a set of questions to answer):
  Action freedom: As question and answer style
  Value freedom: As question and answer style
  Item freedom: Little: can select which value(s) to enter

Direct manipulation style:
  Action freedom: Limited to the types of action available
  Value freedom: As wide as continuous spatial movement
  Item freedom: Can select and act on any items seen.

Now, when should each type of freedom be allowed to the user? In generic packages, e.g. word processors and drawing packages, the user should have total freedom in all three. For instance, in a word processor, s/he should have action freedom to decide whether to type, erase, split paragraphs, search, cut, paste, save, load, etc. S/he should have value freedom to determine what font size, what colour, what emphasis, etc. to give to text. S/he should have item freedom to decide which text to add to or alter, etc. But this is not appropriate in all software. There are four main reasons for restricting the freedom of the user:
a) When the user lacks knowledge of the domain or application covered by the software. e.g. in CBT and other training packages, it is usually necessary to restrict the order in which the user goes through the material (item freedom regarding which topics to learn about next). In installation scripts it is usually necessary to restrict where files can be placed (item and value freedom).
b) When there is danger. In many cases, if the user were allowed full freedom then things would go wrong. For instance, in an installation script, all the steps of installation must be completed, and in the correct order, so the action freedom of the user is normally severely limited to those actions necessary for installation. 
c) When the user requires a feeling of security. The novice user especially requires a helping hand and the feeling of security, or rather of being able to orientate themselves in 'surroundings' that are easily understandable - and one way of achieving this is to limit the number of options available in those 'surroundings'.
d) When more elaborate help is needed. If the range of actions a user can take is wide, and the user asks for help, then only a small amount of help can be given about each option. If more elaborate help is needed then it might be appropriate to limit the range of options available and give more help on each.

2.8.4 Challenges for UI, HCI and MM

Psychology has researched people's ability to carry out tasks, and the ease with which this might be done. Eberts [1994, p.171] describes 'The Cooker Experiment'. A cooker with four burners is shown on screen, along with knobs that control them, with suitable labels. Three or more layouts of burners and knobs are tried. Participants sit in front of the screen, which is a touch-screen, and suddenly one of the burners or knobs lights up. The participant must then touch the associated knob or burner as fast as possible. Times are measured. It was found that if the knobs are laid out similarly to the burners (e.g. knobs and burners in a square, or knobs in a row with burners slightly askew) then response is much faster than with the usual arrangement, with burners in a square but knobs in a line in front of them. This experiment is an example that covers two aspects. The main one, the type of task being researched, is the formative aspect of achieving things (controlling cooker plates). But important in this is that the psychic aspect of pattern-recognition can greatly assist the formative functioning. Without such direct assistance from the psychic aspect, the user has to function fully in the analytic aspect to conceptualise what is in front of them, and the formative aspect of working out what to do. 
Of course, the row-of-knobs arrangement can be learned, but it takes longer to learn because there is no help from the psychic aspect. Implications for HCI: ensure the controls for a device match the various parts of the device.

2.8.5 Anticipating the Lingual Aspect

The formative aspect of HCI is to do with how the user can achieve what they want. What they want to achieve is related to the information content that is carried, which brings us to the lingual aspect. There are, however, two ways of achieving: distal and proximal. These are two different types of relationship that the user has with the computer. Donald Norman [1990] said: "The problem with the user interface is that it is an interface. Interfaces get in the way. I don't want to focus my energies on an interface. I want to focus on the job." Distal HCI is when we have to focus on the interface. We have to be aware of the interface itself, and plan what we do. This fully involves the analytic and formative aspects. By contrast, proximal HCI is when we do not have to be aware of the interface, nor do we have to plan what to do, because we have become so used to it that we can operate it and engage with it almost without thinking about the interface. The analytic activity of awareness and the formative activity of planning have been so well learned that they have become tacit. Michael Polanyi [1967] discussed the difference between these. Considering the formative and analytic aspects of HCI focuses on the interface rather than on 'the job'. By 'the job', Norman meant the meaning that is represented via the symbols and their structures. As we consider the lingual aspect, we will be considering 'the job'. We will also find it links with ERM.

3. UNDERSTANDING THE CONTENT VIA THE USER INTERFACE - THE LINGUAL ASPECT OF HCI

The lingual aspect of the interaction between the human and the computer concerns the 'signification' of the structured information that is seen, heard, felt or input. 
» We see numbers arranged on a screen: what do the numbers tell us? That is their lingual aspect.
» We see text arranged on the screen: what is it about? That is its lingual aspect.
» We hear speech: what does it tell us? That is its lingual aspect.
» In a haptic UI, we feel a kick: what does it mean? That is its lingual aspect.
The lingual aspect is concerned with the meaning of information rather than its structure or what data types are used. When we consider the lingual aspect of HCI, we focus on the content rather than the technology. When we consider the lingual aspect of multimedia, we focus on 'what it says' rather than on 'what it looks like'. The lingual aspect of HCI links with ERM (Engagement with Represented Meaning). It is the very purpose of most HCI, and hence is the qualifying aspect of HCI. This is why it is in a separate section.

3.1 Foundational Dependency on Earlier Aspects

The foundational aspects serve the lingual aspect as follows, first for output:
» Organic/biotic aspect: Hardware devices that allow the user to see, hear or feel information; examples: screen, speakers, force-feedback joystick. Or even direct electrical connection to the user's nerves.
» Psychic aspect: The user sees shapes, colours, etc., hears sounds, feels kicks etc. which the computer generates.
» Analytic aspect: The user distinguishes what is meaningful (as a symbol that carries information) from what is not, and conceptualises the meaningful ones as certain types of information, such as quantities, items, qualities, etc.
» Formative aspect: The user relates these pieces of information together and processes them.
And for input:
» Organic/biotic aspect: Hardware devices that allow the user to give information to or get information from the computer
» Psychic aspect: Movements of input devices; sensing of output device signals
» Analytic aspect: The types of information that these signals represent
» Formative aspect: What the user wants to achieve in giving input. 
3.2 Information, Illustration and Decoration

Not everything that comes through the visual, aural or haptic channels has information content: some is purely for decoration. Decoration functions in the psychic and aesthetic aspects but not very much in the lingual aspect of content. It is useful to differentiate three main purposes for pieces of HCI: information, illustration and decoration. Text and speech are almost always for information purposes, but graphics, animation, pictures, colour schemes, other sound, music and haptic feedback can be for all three. Think especially of a graphic alongside text to see the difference between them:

When used in an information role, the graphic conveys information of its own. A bar chart on screen is an example, as is a spoken sentence or a warning 'beep'. The information is usually not contained elsewhere, so the graphic or sound is essential. The graphic consists almost entirely of symbols.

When used in an illustration role, the picture is used to illustrate what other text is saying, in order to make the meaning of the text clearer. Illustrations often take the form of examples. Usually illustrations are not essential to the material being read, but can help to support it. Most of the slides that accompany these lecture notes are illustrative. The graphic has many symbols (SL) but can have some things that are only BL, e.g. a digitized picture or a video clip which the user can interpret.

When used in a decoration role, the picture or sound has very little information meaning of its own. An example of decorative graphics is found on the introductory screen of much software. A musical introduction is also decorative. Usually decorations are superfluous to the meaning of the document, and can be omitted without harm. The purpose of decoration is often to make the document aesthetically pleasing, to sell it, or to provide 'atmosphere'. The latter is especially important in games. 
(One could argue that atmospheric decoration actually provides information, if one stretches the definition of information, but we will not take that line here.) Decoration is entirely BL because (or when) it has no symbolic value. Of course, there is a spectrum or spread between the three; e.g. games music might give a little information about whether there are lots of enemies around. In most games, graphics and sound have important decorative roles, but decoration is perhaps less common in the stern world of business software. However, decoration can be important in helping to set a context or provide light relief. It is becoming more important in CBT (computer-based training). Much graphics, sound and animation has at least partly a decorative role, and we can expect to see much more decoration in future. In practice, information, illustration and decoration tend to overlap, so that a given visual or sound effect will sometimes fulfil all three roles, or at least two. In this module we will focus mainly on information and illustration, because decoration often has no symbolic content.

The key norm of the lingual aspect is understandability: so a good UI is one that makes it easy for the user to understand the meaning of the information. Language is important. In a text UI, the language in which the text is written is important. But languages can also be graphical; a diagram, too, has a 'language'. For example, in a bar chart the user needs to understand what the bars represent, what the two axes represent, why bars might be grouped together or have different textures, and so on. All lingual functioning at the UI requires the user to share the same 'language' as the UI designer. Otherwise there will be misunderstandings.

3.3 Link with ERM

It is the lingual functioning of HCI that is the primary link with ERM. 
It is by this functioning that the user understands what the symbols at the UI mean, and expresses their own meaning back to the computer. This is discussed at greater length in Chapter VI, §2.

3.4 Lingual Norms (Quality Criteria) for HCI

What makes a UI or human-computer interaction good (or bad) from the point of view of the lingual aspect? Largely, it is the same as for authorship of a book. The criteria include:
» What it means should be understandable.
» It should be truthful.
» It should be timely, up to date.
» It should be relevant.
» It should make sense, and have a 'logic' in it; this does not refer to formal logic, but rather that what is said 'flows' well.
And so on. There are other criteria, but they lead into the post-lingual aspects, as follows ...

4. POST-LINGUAL ASPECTS OF HCI AND UI

The post-lingual aspects serve the lingual by affecting its style and how well it functions with other people. (This is called anticipatory dependency in Dooyeweerd's philosophy. See Basden [2008, p.71] and "http://www.dooy.info/".)

4.1 Making HCI Work Across All Cultures (Social Aspect of HCI)

The social aspect is manifested in the effect that cultural expectations, connotations and assumptions have on the user's ability to understand what the IS is telling them. It is especially important on web pages, because anyone in the world might have created it or be reading it. There are a number of issues that should be borne in mind:
» Cultural connotations of words or phrases. Some are insults in one culture but are perfectly innocent in another.
» Idioms. An idiom is a phrase whose meaning cannot be derived from the meanings of its words. Imagine you are a child in a cold climate. "Were you born in a tunnel?", your mother remarks as you enter the room. You are tempted to reply, jokingly, "No, in a hospital", but you know what she means, and you turn back and shut the door. Tunnels are draughty places. 
But in other cultural contexts the apparent question would not mean the same thing. In Sweden the idiom uses 'church' rather than 'tunnel'. And you would only use this idiom in a family situation, never in a formal situation like a job interview.
» Jokes and humour. Different cultures find different things funny. Avoid in-jokes on a public website.
» Culture-specific words, phrases or references. For example, the in-phrases among Manchester United supporters, which others might not understand. High-register (intellectual) words are also like this: words specific to intellectual culture.
» Standards. Standards are rules that the social group has agreed should be followed. Standards exist for web accessibility, for example.
When your UI is a website, it is especially important to attend to this social aspect, because your readers might come from any culture in the world. Even when it is a piece of software, the same applies, because your users might come from any culture. On the other hand, if you are confident that only people of a certain culture will access your site or use your software, you can capitalise on their specialised cultural expectations and assumptions, and design it to give them better service. For example, software designed for chemists probably does not need to explain what most chemists would know.

4.2 Managing Interface Resources Efficiently (Economic Aspect of HCI)

The following things impose limitations on the HCI, which may be managed as resources. Hence they may be seen as the economic aspect of the HCI, though many of the limitations arise at the psychic and organic level. Each resource is of another aspect.
» Screen area; this limits the number of shapes that can be placed on screen - especially on a mobile phone. Because each pixel requires its own portion of the screen's electronics, the number of pixels on screen is limited. So, at the bit level we speak of, for example, 1280 by 1024 pixels, which is what we call its resolution. 
This is a spatial resource.
» Rendering speed (rendering is the process of making up the screen before it is displayed); this limits the speed at which animations can occur. Each rendering process involves calculations. Calculation is of the formative aspect, so rendering speed is a formative resource.
» Bus speed. The maximum rate of the computer's internal electronic bus, memory or CPU; high-resolution screens with many colours, and long sound samples, consume a lot of the bus bandwidth, thus limiting the speed at which the CPU can operate to process programs. The speed at which pixels can be sent to the screen is also limited by the speed at which the electronics can change the colour-state of each cell. So there is a maximum refresh rate for each type of screen. Typically this is 50 or 75 times per second for a full screen. PAL TV works at 50 times a second. This is a physical resource.
» Network speed. The speed at which two computers can communicate is also limited at the hardware level by the highest frequency at which the wired or wireless connection can operate - and this too limits the speed at which, for example, files can be downloaded from a network. This might seem to be of the physical aspect, but what is important is not the number of bits transferred per second but the number of pieces of data. So this is an analytic resource (pieces of data per unit time).
» Frequency range of the human ear; this limits the range of sounds that may be used. The highest frequency that the human ear can hear is (depending on age) from 5000 to 20000 Hz (cycles per second). This limits the useful frequency range for sound output. This is a psychic limitation.
» Maximum information rate of absorbing new information; this limits the speed at which the visual field can change and at which sounds can be made. There are three aspects to this limit. One is psychic: the eye and ear and their nerves have limits on how fast they can work. 
One is analytic: we are limited in how many pieces of data we can cope with at one time: 'The magical number seven, plus or minus two' [Miller, 1956].
» Input channel width; input channel width is limited. It is the number of different signals that can occur, e.g.
  » with a digital joystick there are 8 directions plus two buttons, giving 10 different signals;
  » with a mouse there are two buttons; this allows only 3 different signals (LMB, RMB, both together);
  » with a keyboard there are, say, 80 keys, and these can be modified by qualifiers like Shift, Ctrl and Alt, which can be used in combinations (2^3 = 8 modifier states), giving typically 640 (8 * 80) different signals.
All of these can be employed to allow the user to signal his/her intended actions to the controller. But usually only a tiny subset of them is actually used. There is growing interest in two-handed input, using mouse in one hand and keyboard or trackball in the other.
» Human impatience; this means that the user will not wait many seconds for what the computer is doing - e.g. download time - unless they know of a good reason why it should take a long time. Patience is probably a pistic matter (we get more impatient if we are arrogant) or an ethical, self-giving matter (we don't want to give others time).
All these are important in multimedia. Though each is a resource in a particular aspect, the fact that there is a limit is the economic aspect of HCI.

{*** As you use your computer or mobile phone, try to think of other things that are limited. Which aspect are they? ***}

The economic relationship between human and computer is not symmetric between input and output. The human is good at detecting and recognising visual and aural patterns, but the computer is poor at doing so. This means that the computer can generate speech and visual patterns and the user will usually know what is meant, but speech recognition by computer is hard. 
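Several of the limits listed in this section are simple arithmetic, and can be checked with a short sketch. The figures are those quoted above; the variable names are mine, and the three-qualifier keyboard count is an assumption consistent with the '8 * 80' figure in the text:

```python
# Spatial resource: pixels available on a 1280-by-1024 screen.
pixels = 1280 * 1024                # 1,310,720 pixels

# Physical resource: raw video bandwidth at 24 bits per pixel,
# refreshed 50 times per second (the PAL figure quoted above).
bits_per_second = pixels * 24 * 50  # about 1.57 gigabits per second

# Input channel width of a keyboard: 80 keys, each modifiable by
# three qualifier keys (assumed here to be Shift, Ctrl, Alt), any
# combination of which may be held down: 2**3 = 8 modifier states.
keyboard_signals = 80 * 2**3        # 640 distinct signals

# Input channel width of a two-button mouse: LMB, RMB, or both.
mouse_signals = 3

# Digital joystick: 8 directions plus 2 buttons.
joystick_signals = 8 + 2
```

Note the asymmetry this makes visible: output bandwidth is measured in gigabits per second, while the input channel width is a few hundred distinct signals at most.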
Similarly, the computer is good at reliably performing fast actions and doing calculations, while the human is slower, less reliable and limited at calculation. So the computer can be made to express much information at one time (e.g. on one screen) whereas the human user would take some time to express it all. Further, the human user has intentions while the computer (usually) does not.

4.3 Interesting, Enjoyable, Harmonious Interactions (Aesthetic Aspect of HCI)

The aesthetic aspect of HCI refers to the enjoyment, or otherwise, which the user experiences from the interaction itself. This is different from the enjoyment they experience from engaging with the meaning, or from life itself while using the IS (which are the aesthetic aspects of ERM and HLC). The aesthetic aspect of HCI covers harmony, fun, beauty, style, humour and interest. It is typically focused in the graphic and multimedia design of the interface.

Here are some examples of harmony:
» colour schemes that harmonise,
» layout that is balanced,
» animations that subtly contribute to the overall effect rather than distracting attention,
» sounds that go well with the visual UI.
Here are some examples of beauty, fun, humour, interest:
» backdrop: nice-looking, humorous or interesting
» the idea of using a 'paper clip' with a face to give advice
» nice-looking colours
» style of writing in text
» text tells an interesting story
But after a time, these things pall and get annoying. It is very tempting to focus on these for their own sake (especially graphic design), leading to one of those stunning user interfaces (e.g. web sites) that look good but do not give useful information. Remember: in HCI, all the aspects should so function as to serve the lingual; they should enhance the communication of information to and from the user. This is the major theme of that authority on graphic design, Edward Tufte [1990]. Beware: the HCI involves more than the main content. 
For example, consider a results page offered by a search engine. You have several tranches of content:
» the main content (a list of articles found from the search)
» helpful suggestions for what else you might try
» navigation buttons or links
» advertisements
» administrative information like contact details.
The main content might be aesthetically pleasing, but what about the overall effect? Designers and regular users of a UI tend to think about only the main content, but other users and occasional observers see the whole. So beware lest the adverts, for example, detract. In HCI that is good according to the aesthetic aspect, all these will harmonise: they will be relevant to each other. The most recent search engines try to make all this relevant to your original search - harmonising with it. Earlier ones did not, and the user would get annoyed by the disharmony between what they were searching for and the advertisements etc. What has been said about harmony among the content of a web page applies equally to the meaning-content of a computer game, a database, and so on.

{*** Exercise: Look for harmony in the meaning-content of a computer game (its gameplay), or a database. Look for harmony and fun in the graphics of the game. ***}

Humour of the HCI does not refer to humour found in the meaning-content, but to what it is about the user interface itself that makes you laugh. It is, unfortunately, rare. However, I encountered an example of it in the early Amiga operating system: if you pressed keys in certain combinations and sequences, you were rewarded by hidden messages appearing on the screen, such as "We designed the Amiga, but Commodore ***d it up". These were removed in the next version of the operating system! More useful humour could be in the placement or shape of buttons or menus, alluding to various things that are known in the culture, e.g. the Simpsons.

{*** Designers of UIs: humour in the HCI might be an opportunity to make your mark! 
***}

But beware: in some serious applications fun is not appropriate; this is a matter under the juridical aspect, below. But where fun or humour is inappropriate, you can still design an aesthetic UI by attending to harmony and interest and style. Many of the rules of art are applicable to HCI, especially the aphorism that, in art, 'less is more'. What this means is that the most aesthetically effective things are those that do not shout, but provide subtle effects. This is why, for example, animations or sounds that distract are bad aesthetically.

4.4 Doing Justice to Both User and Information (Juridical Aspect of HCI)

The juridical aspect is concerned with 'what is due', i.e. with what is appropriate and proportional. In HCI, it is concerned with
» what is due to users
» what is due to the information (though that extends into ERM).
The main way in which appropriateness to the user comes to the fore is when considering disabled users. For example:
» UIs in which it is impossible to enlarge the font are not giving visually handicapped users their due of larger text.
» For totally blind people, the UI should be able to speak its text, and describe anything else of importance.
Web Accessibility Guidelines are an attempt at giving disabled people their due. Appropriateness to the information's meaning-content also matters. For example, sad news of somebody's death should not be accompanied by a flashy advertisement. Relevance is important: what is placed on screen should be relevant, and the facilities made available should be appropriate to what the users might want to do.

4.5 Generosity and Courtesy of the Interface (Ethical Aspect of the HCI)

Ethicality here does not mean good or bad; it means generosity and self-giving. A UI that is good in the ethical aspect is one that is generous. This issue has not been widely studied, so we cannot say much about it, except for two things:
» Make facilities available to the user that will be helpful, but are beyond the basics required. 
» Have duplicate menu entries.
For example, my word processor has a menu to do with Blocks (what it calls selected text), which allows me not only to set, delete and copy blocks (the basics), but also to copy a block from another document (very useful, but quite common). An 'extra' that is surprisingly useful is the facility to count the words in a block! Also, there is a facility to save a block (useful for starting a new document) and for formatting the text in a block. But in this menu there is no facility to print a block. For that facility I have to move to the Print menu. And the facility to spell-check a block is in the Spell menu. It would be nicer (more generous) if the Print Block and Spell-check Block facilities were in the Block menu as well as in the other menus.

4.6 The Vision Behind the Interface (Faith Aspect of HCI)

The faith aspect refers to the 'vision' behind or underlying the design of the HCI and UI, and the beliefs and commitments of the user and designer. The expectations and assumptions of both user and designer are of the faith aspect, as is what the user or designer believes to be 'good' (or bad) or meaningful. These are usually taken for granted, but can deeply affect the quality of the HCI. They are usually social in nature: shared assumptions. For example: when you get used to one word processor, you find others difficult to use. For example: what are called 'holy wars' (note the faith-oriented language!) are waged between supporters of different platforms, such as Apple Mac, Linux and Amiga, and all denigrate the Windows platform; such denigration is partly of the faith aspect, partly of the ethical aspect (where it is dysfunctioning).

6. LINKS WITH OTHER IDEAS ABOUT HCI AND UI

6.1 Practical Guide to Usability

In their A Practical Guide to Usability Testing, Dumas and Redish [1999, p.4] define usability as: "Usability means that the people who use the product can do so quickly and easily to accomplish their own tasks. 
This definition rests on four points:
1. Usability means focusing on users.
2. People use products to be productive.
3. Users are busy people trying to accomplish tasks.
4. Users decide when a product is easy to use."
At first sight there is a laudable focus on the human being, but closer examination reveals a heavy emphasis on the economic aspect: 'product', 'quickly', 'tasks', 'use products to be productive', 'busy', 'accomplish tasks'. Many elements of it would thus be irrelevant for many types of IS, such as games, unless one distorts the meaning of words like 'productive'. On the other hand, many websites today are designed mainly to look stunning, often at the expense of usability; are the web designers wrong, right, or what? The proposal here is that neither Dumas and Redish's emphasis on the economic nor web designers' emphasis on the aesthetic is right or wrong in itself, because these are just two among many aspects. The shalom principle (which states that things work well when every aspect is upheld and given its due) applies to HCI, and absolutization of any aspect will jeopardise usability.

6.2 Winograd and Flores: Direct Engagement

Soon after it was published, Winograd and Flores' seminal work Understanding Computers and Cognition [1986] made a profound impression on this author. It is worth reading: it is very easy to read even though written at a philosophical level, and it says thought-provoking things in its beginning and end, though the middle of the book rather loses its way, in my opinion. The book did not so much create a new way of looking at computers in him as undergird and express what he had already felt and believed for over ten years. Moreover, though Polanyi's 'tacit dimension' [1967] was his mainstay at the time, Winograd and Flores also helped him understand the difference between distal and proximal user interfaces (discussed earlier) before he discovered Dooyeweerd's philosophy and the aspects. 
This text is based on some in the author's book 'Philosophical Frameworks for Understanding Information Systems', and has a philosophical feel. It is offered here so that students can delve deeper if they wish to. It continues in Appendix 2, which discusses W+F at a philosophical level, but most of it can hopefully be understood without a knowledge of philosophy.

Winograd and Flores (W+F) questioned the prevailing 'rationalistic' approach to computers, especially as found in AI, and suggested an approach based on Heidegger's existentialism, phenomenology, hermeneutics and language theory, all of which are types of philosophy. W+F, using Heidegger, challenged the way computers were understood in terms of the Cartesian subject-object relationship, as objects distal from, and operated upon by, humans. In place of this they offered the ideas of 'thrownness' and 'breakdowns', based on Heidegger's notion of being-in-the-world.

W+F's second challenge was to the assumption that cognition is the manipulation of knowledge of an objective world, and that we can hope to construct machines that exhibit intelligent behaviour (as AI hoped to do). Instead, using Maturana's notion of autopoiesis, they argued that cognition is an emergent property of biological evolution, that interpretation arises from cognition, and that computers themselves can never be made truly intelligent.

Their third challenge was to the assumptions that language is constituted in symbols with literal meanings, that such symbols can be assembled into a knowledge base, and that they are used within organisations as a means of transmitting information. Instead, in accord with Searle's speech act theory, the listener actively generates meaning, especially as a result of social interaction, and language is action, responsible for creating social structures, not just being used within them - this is now called the 'Language Action Perspective'. 
It is impossible, they argued, for computers to use language in the way humans do (even though they might process natural language). For more on this, and how Dooyeweerd's aspects fit into it, see my 2008 article with the late Heinz Klein, 'New Research Directions for Data and Knowledge Engineering: A Philosophy of Language Approach' (Data & Knowledge Engineering, 67(2008), pp. 260-285). W+F suggested 'A new foundation for design' of computer systems. The aim of AI, KBS and HCI should be redirected, away from attempts to make computers 'intelligent' or to support 'rationalistic problem-solving', towards building useful systems that are "aids in coping with the complex conversational structures generated within an organization" [p.12]. They continue, "The challenge posed here for design is not simply to create tools that accurately reflect existing domains, but to provide for the creation of new domains." This, they hope, will open the way to social progress and "an openness to new ways of being" [p.13]. They outline the design of a Coordinator system to support cooperative work. W+F's work is still avidly discussed, and even inspirational, 20 years later [Weigand, 2006]. It deserves to be, because it provides a framework for understanding three of the areas of research and practice (HUC, the nature of computers, and ISD) and touches on that of technological ecology. It is seen as a flagship of the Language-Action Perspective, which focuses on computer use in organisations and especially the use of language in changing them. 7. A Philosophical Look at HCI These aspects may be seen from either the user's or the computer's point of view, as shown in Fig. 1. Figure 1. Aspects of HCI from user's and computer's point of view 7.2 The Central (Qualifying) Aspect of HCI Which is the most important aspect of HCI? Answer: they all are, so that is not a very useful question. It is better to ask ... What is the main purpose of interacting with a computer?
Which is the central aspect of HCI? We want an answer that is valid whatever the application; in this way, the answer does not depend on ERM (what the information is about) or HLC (how our lives are affected by it). Answer: The main purpose of interacting with a computer is usually to gain and give information which the user can understand. So the central aspect of HCI must be the lingual. (NOTE: Identifying the central aspect is done by thinking about what a thing's main purpose or meaningfulness is. Dooyeweerd called it the qualifying aspect. Identifying the central, or qualifying, aspect simplifies a complex picture. The qualifying aspect is the one that is most important in giving a thing its meaning, its destiny in life, and by which we should judge whether it is good or bad at being that type of thing. For example, a law court is qualified by the juridical aspect, a business is qualified by the economic aspect, a pen is qualified by the lingual aspect, and so on. {*** If you want to find out more about qualifying aspects, see 'Philosophical Frameworks for Understanding Information Systems' [Basden, 2008], pp. 86, 132 ff. ***}) The qualifying aspect of HCI is the lingual for almost all information systems. This is because, regardless of application, the main things we experience are symbols on the screen (or heard from speakers) that signify something, and actions we make that signify what we want the computer to do. Note that this does not just mean text, but can be any channel; see below. The meaning is, of course, that which is represented in the computer, and if the HCI is of high quality then the user engages with this represented meaning. As will be discussed in greater depth in Chapter VI, the lingual aspect is the main link between HCI and ERM (Engaging with Represented Meaning). The lingual aspect 'reaches out' to all aspects of the meaningful content, to represent all types of meaning.
So, in most cases of computer use, the lingual aspect is the most important aspect of HCI. Our functioning in all the other aspects of HCI is mainly to serve the lingual functioning so that it is effective in expressing and interpreting meaning. (Very occasionally the lingual is not the most important, but such cases are rare. One example would be computer-controlled disco lighting; here the human's interaction with the computer is primarily psychic, and contains no symbolic meaning. But we will ignore such specialised applications here.) 7.3 Other Aspects Serve the Lingual The lingual aspect of HCI (or indeed of anything) cannot work well without all the other aspects, especially those that are its nearest neighbours. The pre-lingual aspects serve the lingual as follows: » The importance of the formative aspect in HCI lies in how it helps structure the information presented to the user. Think what it would be like if the information on screen were in random places, with no structure. » The importance of the analytic aspect in HCI lies in ensuring clarity in the information presented. Think of what it would be like if the information on screen were unclear. » The importance of the psychic aspect lies in ensuring that the user can see or hear what is presented. Think of what it would be like if text had the same colour as its background! The post-lingual aspects serve the lingual as follows: » The importance of the social aspect of HCI lies not in the social intercourse that occurs when driving the computer (such as children gathering round a games player, which is HLC), but in whether the user understands the cultural connotations of, or assumptions behind, what is shown on the screen (or heard through the speakers), and in the standardisation of things like user interface style.
» The importance of the economic aspect of HCI lies not in the cost of the building that Elsie calculated (which is ERM) but in such things as the effect of limited screen area: only a certain amount of information is visible. » The aesthetic aspect of HCI concerns how the harmony and artistic style of the UI helps users properly understand what the UI is presenting. » The juridical aspect concerns whether the UI does justice to the represented meaning, and so on. Dooyeweerd called this inter-aspect dependency, and it goes in two directions: foundational and anticipatory. The lingual aspect thus depends foundationally on the aspects earlier than it, especially the formative, the analytic and the psychic, and it anticipates the later aspects, especially the social, economic, aesthetic and juridical. Most of the aspects of HCI serve the lingual function of understanding what is presented via the UI and responding. 7.4 What is Good and Bad in HCI A norm is what is good, to be aimed for. Usability or ease of use, for example, is usually good and a thing to aim for. But what exactly is ease of use, and how can we evaluate it or design for it? It is now acknowledged to cover many factors, which can be understood multi-aspectually. Table 1 lists several normative factors under each aspect. Table 1. Aspects of usability These are some of the things by which we could judge the UI or HCI. But within each aspect you will find more factors if you need them. And you could add the ethical and pistic aspects if you wish. Sometimes there might seem to be conflict between such aspectual norms. For example, the juridical norm of appropriateness can make it difficult to standardise the style of UI [Basden, Brown, Tetlow and Hibberd, 1996]. One way to resolve this is to take into account the qualifying aspect of HCI. If, as suggested earlier, this is the lingual aspect, then its norms of conveying information, understandability and truth-telling should always be honoured.
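The idea behind Table 1 - judging a UI by normative factors in every aspect - can be recorded as a simple checklist. The Python sketch below is purely illustrative: the fifteen aspect names come from this chapter, but the 1-5 rating scale, the function names and the example scores are my own assumptions, not anything from the book's table. It flags weak aspects rather than averaging scores, reflecting the shalom principle that every aspect must be given its due.

```python
# Illustrative sketch (not from the book): a multi-aspectual
# usability checklist. The aspect names follow the chapter; the
# 1-5 rating scale and example scores are invented.

ASPECTS = [
    "quantitative", "spatial", "kinematic", "physical", "organic",
    "psychic", "analytic", "formative", "lingual", "social",
    "economic", "aesthetic", "juridical", "ethical", "pistic",
]

def weak_aspects(scores, threshold=3):
    """Return aspects rated below `threshold` (or not rated at all).

    Flagging rather than averaging reflects the shalom principle:
    a superb aesthetic score cannot compensate for a poor
    formative one."""
    return [a for a in ASPECTS if scores.get(a, 0) < threshold]

# A visually stunning but badly structured shopping site:
ratings = {a: 4 for a in ASPECTS}
ratings["aesthetic"] = 5   # looks great
ratings["formative"] = 1   # so badly structured you cannot find anything
print(weak_aspects(ratings))   # only the formative aspect is flagged
```

Note the design choice: because absolutization of any one aspect jeopardises usability, no amount of excellence elsewhere removes a flag on a neglected aspect.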
However, the lingual norms should not themselves be absolutized, because HCI only gains its meaning by referring beyond itself to ERM and HLC. Note: Sometimes it is appropriate to break the rules, especially in computer games or other fun software. For example, the rule that all important symbols should be clearly seen (psychic, analytic aspects) is reversed in games, where the best weapons or equipment are hidden and difficult to see. 7.5 Aspects as Checklist: Guidelines for UI While it is appropriate on occasion to focus attention on one aspect (usually the qualifying) we should always do so in a way that gives all the other aspects their due. If we over-emphasise one aspect we begin to ignore other aspects, and the result is that the success or fruitfulness of our activity is jeopardised. Thus, for example, a web page that has superb graphics but is otherwise devoid of useful content will fall into disuse. Web pages are user interfaces, and we can see the normativity of many of the aspects recognised in the more mature published web design guidelines. Table 2 shows the 'Research-Based Web Design and Usability Guidelines' of the National Cancer Institute [2005] and the main aspects of each guideline (aspects indicated by the first letter of their name, from Q = Quantitative to P = Pistic). Many have two aspects, sometimes because they cover two things (e.g. "set goals" (formative) and "state goals" (lingual)) and sometimes because the main idea is of two aspects (e.g. sharing is both lingual and ethical). We do not differentiate between qualifying and founding aspects here, but could do so if a more precise analysis were needed. Table 2. Aspects of Web Design Guidelines {*** Think about, and discuss, the following: » Which aspects have most entries? » Why do you think this is? » Which aspects have least? » Why do you think this is? » Why are the formative and spatial aspects so important in web accessibility?
***} We can use aspectual analysis as a basis for critique. The first thing that strikes us is how many aspects are represented here. This is, of course, what one would expect from a good, mature set of guidelines like the NCI guidelines. Second, we might look for imbalance among the aspects. The spatial and formative aspects appear more often than most other aspects; we can ask ourselves whether this is appropriate. Perhaps more significant are some gaps, at least in this 2005 version, some of which are quite surprising: » The faith aspect of vision of who we are is completely absent, yet one might expect some mention of the designers' vision for the website. (It is possible that "Set goals" implies some pistic vision for the site.) » The ethical aspect of self-giving is present only in sharing design ideas. Guidelines on how to give the reader more than is actually due to them, and thus create a site that feels generous, would be useful. » The juridical aspect is almost absent, represented only tangentially in the concept of providing 'useful' or meaningful content. The juridical aspect would be relevant in terms of giving both the topic and the readers their due. » Perhaps most surprising is the almost complete absence of the social aspect - the two inclusions are rather tangential. Since websites are read by people from any and every cultural group, with varying background knowledge, expectations and world views, we might expect a whole set of guidelines on appropriate use of cultural connotations, humour, idiom, and on respecting cultural sensitivities. » The kinematic aspect is almost entirely absent. Animation can be used to show movement, but have the designers of these guidelines overlooked this, treating animation as a mere sensitive or aesthetic decoration?
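The gap-spotting done above can be mechanised: tally how often each aspect is cited across a set of annotated guidelines, and any aspect with zero (or very few) entries is a candidate gap. The sketch below uses invented guideline titles and annotations - it does not reproduce the actual NCI 2005 list or its aspect codes - so treat it as a pattern for this kind of aspectual audit, not as data.

```python
from collections import Counter

# Hypothetical sketch of the tallying behind an aspectual critique.
# The guidelines and their aspect annotations below are invented
# examples, not the actual NCI 2005 list.

guidelines = [
    ("Set and state goals", ["formative", "lingual"]),
    ("Use consistent page layout", ["spatial", "formative"]),
    ("Share design ideas", ["lingual", "ethical"]),
    ("Minimise download time", ["economic"]),
    ("Align items on the page", ["spatial"]),
]

ALL_ASPECTS = [
    "quantitative", "spatial", "kinematic", "physical", "organic",
    "psychic", "analytic", "formative", "lingual", "social",
    "economic", "aesthetic", "juridical", "ethical", "pistic",
]

def aspect_tally(annotated):
    """Count how often each aspect is cited across all guidelines."""
    counts = Counter()
    for _title, aspects in annotated:
        counts.update(aspects)
    return counts

def gaps(annotated, all_aspects):
    """Aspects never cited - candidates for missing guidelines."""
    counts = aspect_tally(annotated)
    return [a for a in all_aspects if counts[a] == 0]

print(aspect_tally(guidelines))
print(gaps(guidelines, ALL_ASPECTS))
# With this invented sample, aspects such as the social, juridical
# and kinematic never appear - exactly the kind of gap the critique
# above looks for.
```

In a real audit one would also weigh qualifying against founding aspects, as the chapter notes; this simple count only exposes which aspects a guideline set neglects entirely.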
This aspectual analysis of these guidelines is not meant primarily as a criticism of the guidelines, which are excellent when compared with many others that are available, but rather to show how aspectual analysis can be useful as an evaluation tool, and how it might be used to suggest future improvements. Copyright (c) Andrew Basden, 16 September 2008, 18 October 2008. 3 September 2009, 22 September 2009, 25 November 2009, 20 September 2010.