CHAPTER VI. THE PSYCHIC / SENSORY ASPECT OF EwT (ENGAGING WITH TECHNOLOGY) / HCI (HUMAN-COMPUTER INTERACTION)

In this chapter we look at the psychic aspect of the interaction between the human user and the computer or other information system as such, mostly considering the user interface (UI). We single out this aspect for a chapter of its own because it contains a considerable amount of material. This is because the psychic aspect is the 'gateway' of the human being to their environment, and the most immediate engagement with the user interface. There is a considerable amount to try to 'get right' in this aspect.

VI-1. The Psychic / Sensory Aspect of EwT/HCI and UI: Seeing, Hearing, Feeling, Pushing, Touching

It is with our psychic functioning that we see, hear, feel and respond with motor control: it is sometimes called the sensitive or sensorimotor aspect. It is the aspect of functioning that humans share with animals, of raw, uninterpreted sensations, and also the aspect of pattern detection and pattern recognition. It is the aspect of behavioural psychology, of stimulus and response. It is with the psychic aspect that the issues of EwT/HCI first become many and varied. This version of the section consists mainly of lists, which will be referred to in the lectures.

In this aspect we look at our interaction with the computer in terms of things that our sensory capability finds meaningful, such as colour, sound, shape and impulses, without beginning to interpret them. Thinking about the UI and EwT/HCI in this way helps us understand some basic psychological facts, such as how easily things are recognised, how to attract attention, and how quickly the user may be expected to respond to things.

Here is a summary of the human psychic / sensory functioning, what is meaningful about each device, and the types of signals between them and the computer. These are explained later.

Device         Human function     Signal type
Input devices:
Switches       Flick              One bit sent.
Paper tape     ('See' holes)      Byte stream (one-byte bit pattern per row of holes).
Cards          ('See' holes)      960 bits (for 80-column cards) sent.
Keyboard       Press              Single-byte bit pattern per key-press, a different pattern for each key.
Joystick       Push               Bit pattern indicating direction of push.
Mouse          Move               Pairs of bit patterns sent per tiny incremental movement of the mouse.
Trackball      Move               As mouse.
Touch-screen   Stroke             Pairs of bit patterns per each tiny movement of the finger; more if more fingers.
Touch-pad      Stroke             As touch-screen.
Microphone     Speak              Regular byte stream (typically 20,000 per second) sampling the sound.
Camera         Operate            An entire bitmap, similar to output screens, sent to the computer, usually direct to memory rather than via the CPU.
Output devices:
Screen         See shape, colour  Entire bitmap, 50-75 times a second.
Loudspeakers   Hear               Linear byte array, 30,000 bytes per second.
Actuators      Feel               Byte stream.
(Direct nerve connections are not discussed here.)

From the point of view of the psychic aspect we no longer speak of the hardware in mechanical, electronic or organic terms; we speak about signals and sensory phenomena. For each such signal (psychic aspect) we can use a variety of hardware devices (organic aspect), and each of those can work by different physical laws (physical aspect). For example, to send a signal about the position the user's hand or finger is at (e.g. to control the screen pointer), we can use devices like a mouse, touch-pad, touch-screen or trackball.
Figure 2 shows the signals that move around the computer in both input and output. You might notice that it is very similar to Figure 1, but there are subtle differences because, whereas Figure 1 interpreted it from the organic aspect, Figure 2 interprets it from the psychic / sensory aspect. The main differences include:
» Instead of naming human organs like eyes and ears, we name human sensory-motor activity like seeing and hearing.
» Instead of speaking of 'hardware', we speak of 'convertors'.
» Instead of EMF, we have signals and streams.
» The links between CPU, memory and UI devices are no longer conductors with varying EMFs, but are (bi-)directional channels, along which signals and byte streams are sent. So these channels have arrows which show in which direction(s) signals are sent. For example, mouse clicks are sent from mouse to CPU. Some links have signals in both directions (see below).
» The bus now transmits bulk streams of bytes to and from memory.

Figure 2. Computer with input and output at the sensory aspect

In this chapter we look at input from user to computer first, then output from computer to user.

VI-2 INPUT

VI-2.1 Input Devices

The data flow from input devices is much slower than that to the output devices, and can often be transferred as signals direct to the CPU (or some similar processor) without using the bus. The mouse, for example, sends signals about how it is moving at a rate of the order of a hundred bits or bytes per second. The touch-pad is similar. The keyboard is even slower. Input from the microphone is somewhat faster, typically between 10,000 and 50,000 bytes per second, which is still far slower than the output convertors.

Here are a number of devices and what the user does to interact with them (organic aspect), followed by the type of signal (whether it is a single bit, a byte, several bytes, etc.) (psychic aspect), and then, anticipating the analytic aspect, what the signals encode, which is different for each device:

Input devices:
Switches (Flick): A signal of one bit is sent every time a switch is flicked.
Paper tape ('See' holes): As each row on the tape passes the reading heads, a bit pattern, one byte in size, is sent to the CPU: a byte stream.
Cards ('See' holes): As each card is read, a set number of bit patterns (typically 80 or 132) is sent to the CPU.
Keyboard (Press): When a key is pressed, a single bit pattern, usually a single byte, is sent to the CPU, a different pattern for each key.
Joystick (Push): When the joystick is pushed in any of 8 different directions, a signal indicating the direction is sent to the CPU.
Mouse (Move): As the mouse is moved, pairs of bit patterns are sent to the CPU, one indicating the amount of movement left and right, and one indicating the amount of movement towards and away. When a mouse button is pressed, a one-bit signal is also sent.
Trackball (Move): As mouse.
Touch-screen (Stroke): As the finger moves across the screen, pairs of bit patterns are sent to the CPU as for the mouse and trackball. More if more fingers.
Touch-pad (Stroke): As touch-screen.
Microphone (Speak): Bytes are sent regularly (typically 20,000 per second), each byte containing a bit pattern that indicates the amplitude of the waveform each 20,000th of a second (or however fast). The 20,000 is called the sampling rate.
Camera (Operate): An entire bitmap, similar to output screens, is sent to the computer, usually direct to memory rather than via the CPU.
Output devices:
Screen (See shape, colour): See below.
Loudspeakers (Hear): See below.
Actuators (Feel): See below.
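By way of illustration, here is a minimal Python sketch of how a mouse's pair of bit patterns might be decoded and accumulated into a cursor position. The packet format (two signed bytes, left-right then towards-away) is an assumption for illustration only; real mouse protocols differ in detail.

    # Hypothetical decoder for a simplified mouse protocol:
    # each movement packet is a pair of signed bytes, dx then dy.

    def decode_packet(packet: bytes) -> tuple[int, int]:
        """Interpret two bytes as signed movement deltas."""
        dx = int.from_bytes(packet[0:1], "big", signed=True)
        dy = int.from_bytes(packet[1:2], "big", signed=True)
        return dx, dy

    x, y = 0, 0  # accumulated cursor position
    for packet in [b"\x03\x00", b"\xfe\x01", b"\x00\xff"]:
        dx, dy = decode_packet(packet)
        x += dx
        y += dy
    print(x, y)  # cursor position after three tiny movements

Each packet only says how the mouse has just moved; it is the continuous accumulation of these tiny deltas (as described in the next section) that gives the cursor its position.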
VI-2.2 Input Device Signals

The signals from the input devices can be used to input information to the computer. So, for example, the pairs of bit patterns from a mouse are accumulated continuously by the operating system, to give a continuous indication of where the mouse is. As each pair is received, the View is changed so that the mouse cursor is seen in a different position. In this way, the mouse cursor follows the movement of the mouse. Likewise, whenever a bit pattern is received from the keyboard, it is converted to the bit pattern for the character on the key (when the K key is pressed, for example, the resulting bit pattern is usually either that for 'k' or for 'K' in ANSI code or Unicode). Then the appropriate character is added to the View screen bitmap, and this gets displayed.

It is not only the bit patterns and signals themselves that produce effects. So can the following:
» Time between events (e.g. this differentiates click and drag)
» Order in which signals occur (e.g. left mouse button (LMB) down, mouse move, LMB up is a drag, but LMB down, LMB up, mouse move is a click followed by an arbitrary movement)

In addition, the controller keeps a record of the current mouse position and also a list (history) of recent positions with their time stamps. On tablet computers, these come from the finger's position. In this way, it can analyse various complex gestures. This is important in pen-driven computers, where the user 'writes' letters on the screen and these are recognised. Issues that concern such actions include:
» Max time (down-up) for click
» Max time (up-down) for double-click
» How often mouse coordinates arrive
» How often a key repeats if held down
» For touch-screen devices, whether one or two fingers are used
» Visible and audible feedback of each action, e.g.:
  » Mouse cursor moves with mouse
  » Audible click on key presses
» Complex interactions between these

In touch-screen devices, there is no click or double-click, only the detection of the finger hitting the screen, then a sequence of movements.

Examples:
» The familiar double-click gesture with the mouse involves pressing the mouse button down and releasing it quickly, and then repeating that.
» On tablet, phone, or other touch-screen devices, there are many gestures; for example, two fingers moving apart often causes the screen to zoom in.

Exercises:
» Find out what differentiates a double-click from two separate clicks: is it that the two clicks are at the same place on the screen, or is it that they occur within a given time? If spatial, find out how far you can move the mouse between one click and the other before it becomes two separate clicks. If based on time, see how far you can extend the time between clicks before they become two separate clicks.
» Try to find gestures on your touch-screen device that you have not yet used. Determine how usable and how useful they are.

How do I design for this?: If you are designing gestures then order, timing and position are all important. You have to keep a history (list) of positions (mouse or finger) and try to work out what the user intended - a small sketch of such detection follows below; this is the analytic aspect, below. You also have to consider the size of the human hand of every possible user, from child to elderly person with arthritis, slow responses or even missing fingers; this is the organic aspect, above.

Going deeper (Extant ideas): -
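As an illustration of using time, order and position together, here is a minimal Python sketch of click-versus-drag detection from a time-stamped event history. The threshold values are illustrative assumptions, not values from any particular operating system.

    # Sketch of click / drag detection from a time-stamped event history.
    # An event is (time_ms, kind, x, y); kind is 'down', 'up' or 'move'.

    MAX_CLICK_MS = 200     # assumed max time between down and up for a click
    MAX_CLICK_TRAVEL = 4   # assumed max pixels moved during a click

    def classify(events):
        """Classify a down...up sequence as 'click' or 'drag'."""
        down = next(e for e in events if e[1] == 'down')
        up = next(e for e in events if e[1] == 'up')
        travelled = abs(up[2] - down[2]) + abs(up[3] - down[3])
        if up[0] - down[0] <= MAX_CLICK_MS and travelled <= MAX_CLICK_TRAVEL:
            return 'click'
        return 'drag'

    # Order matters: down, move, up is a drag;
    # down, up, move is a click followed by an arbitrary movement.
    print(classify([(0, 'down', 10, 10), (500, 'move', 80, 90), (600, 'up', 80, 90)]))  # drag
    print(classify([(0, 'down', 10, 10), (150, 'up', 11, 10), (300, 'move', 80, 90)]))  # click

A double-click detector would extend this by also comparing the times and positions of two successive clicks against further thresholds, as in the exercise above.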
VI-3. OUTPUT FROM COMPUTER TO USER

VI-3.1 Outputting Screen and Sounds

Typically, the screen convertor is sent a 'start' signal with the address in memory of a stream of bytes which should be downloaded; this stream of bytes is called a bitmap. The screen convertor then begins downloading these from memory (via the bus), and treats them as bit patterns that indicate the colours of pixels. Once it has received all the bytes it needs, it might send a 'finished downloading' signal back to the CPU. It then sets all the pixels to the indicated colours, usually horizontally, line by line. Once it has finished all lines, it might send a 'finished displaying' signal to the CPU (and the program the CPU is running will then send the next 'start' signal, but perhaps with a different address, to start downloading a different screenful).

The sound convertor works in a similar way, but instead of setting pixels to various colours, it sends sound waveforms to the loudspeaker. The stream of bytes in memory is called a sample or waveform, rather than a bitmap, but as far as the memory is concerned they are the same - just a long stream of bytes set to various bit patterns.

Consider the bitmap. The bitmap consists of the appropriate number of cells, each of which holds a bit pattern, each of which is the same size, for example 8 bits (one byte) for 256-colour screens, or 24 or 32 bits for full-colour screens. Each cell corresponds to one pixel on the screen. The bit pattern in a cell indicates the colour that the pixel should show. Holding the bitmap in memory takes a lot of memory:
» screen size 1024 rows of 1280 pixels = 1,310,720 pixels
» each pixel cell of 32 bits (4 bytes)
» total memory consumed by bitmap = 5,242,880 bytes.

Speed of displaying the screen bitmap: For a 1280 by 1024 screen in which each cell occupies one byte, there are 1,310,720 bytes that must be streamed from memory to the display convertor each time the CPU says 'start'. This must happen typically 50 times per second. That means that, for a screen of 1280 x 1024 resolution, the display convertor must download 50 times 1,310,720 = 65,536,000 bytes (65 MB) per second from memory. This is a not insignificant proportion of the maximum data speed of the bus and memory. While the display convertor is downloading bitmap byte streams, neither the CPU nor other things can access the memory. This slows down the CPU's speed of operation. It gets worse. Sometimes, each pixel colour is represented by not 8 but 32 bits (4 bytes), so that over 260 million bytes must be downloaded per second. This slows the CPU down even more. The sound convertor also places a drain on memory and bus (especially for high-quality quadraphonic sound), though usually not as high as the display convertor. Typically, the CPU can be running at half speed because it has to share the bus and memory with these convertors.

However, if you have animations running, especially video, then you need this transfer 50 or more times a second. If a lot of other processing is going on at the same time, then this slows down the animation and reduces its quality, or else the animation slows down the processing. One begins to get noticeable delays and lack of smoothness, even with the fast processors of today. In some computers and hardware configurations, steps are taken to reduce this bus congestion.
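The arithmetic above can be checked with a small Python sketch; the values (resolution, bytes per pixel, refresh rate) are the ones used in the text.

    def bitmap_bytes(width, height, bytes_per_pixel):
        """Memory consumed by one screen bitmap."""
        return width * height * bytes_per_pixel

    def bus_load(width, height, bytes_per_pixel, refreshes_per_second):
        """Bytes per second streamed from memory to the display convertor."""
        return bitmap_bytes(width, height, bytes_per_pixel) * refreshes_per_second

    print(bitmap_bytes(1280, 1024, 4))   # 5242880 bytes for a 32-bit bitmap
    print(bus_load(1280, 1024, 1, 50))   # 65536000 bytes/sec at 1 byte per pixel
    print(bus_load(1280, 1024, 4, 50))   # 262144000 bytes/sec at 4 bytes per pixel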
One way is to have separate graphics memory, as graphics cards do. The bitmaps to be displayed are copied from main memory into graphics memory when the screen must change. In the old Amiga computer, the graphics memory is not separate, but is part of the main memory; however, there is also a part of main memory to which the convertors have no access, so in that memory the CPU runs at full speed. With these arrangements of separate or special graphics memory, the bus transmits bitmap streams only when the screen must change. So, for example, with a screen showing MS Word and the user typing at one character per second, the screen changes only once a second, and the change is tiny. For full animations, graphics cards offer little benefit, though the shared memory of the Amiga does still offer a benefit.

So, for each screen that our eyes see, there is a bitmap in memory. If all the bytes in the bitmap have the same value, what we see is a screen of all one colour. What is called the background colour is seen when all bits in the bitmap are 'off' (or 'zero'). The background colour might be black or white, or something else; e.g. the blank blue screen while waiting for Windoze (sic) to start up.

Note that how many bits or bytes are allocated to each pixel depends on exactly how the possible range of colours to be made available on screen are stored as bit patterns.

Examples: Here are some colour resolutions that have been used:
» 1 bit per pixel: ancient 'black and white' screens
» 1 byte (8 bits) per pixel: 256-colour screens, with actual colours translated from the 8-bit bit pattern via a colour lookup table (CLUT), e.g. VGA.
» 3 bits per pixel with CLUT: 8-colour screens (CGA)
» 4 bits per pixel with CLUT: 16-colour screens (EGA)
» 3 bytes (24 bits) per pixel: 16 million colours, the bit pattern in each byte being interpreted as a number (0-255) which indicates the amount of red, blue and green light to make up the colour of the pixel.
» 4 bytes (32 bits) per pixel: the above plus a degree of 'transparency' per pixel, so its colour is combined with what is 'behind' it.
» The old Amiga system had a host of different versions, with a variable number of bits per pixel, including special ones such as 'Extra Half Brite' mode, where one bit halved the brightness of the colour (very good for showing shadows), and Hold-And-Modify (HAM) mode, which made all 4,096 colours available on screen at once while using fewer bits per pixel.

Exercises: -

How do I design for this?: Use less than full-colour screens where possible, especially when you have diagrams rather than full-colour photographs. Consider using lower resolution if appropriate. There is a tendency to assume that full-colour, high-resolution screens are best, whereas in fact they are often not, especially if you are showing diagrams and text rather than full photographic detail.

Going deeper (Extant ideas): Read up on reducing bus congestion in graphics display.
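To make the colour lookup table idea concrete, here is a minimal Python sketch (an illustration, not from the original text): each pixel stores only a small index, and the table maps that index to an actual (red, green, blue) colour. The table entries below are illustrative.

    # Sketch of a colour lookup table (CLUT): each pixel cell holds only a
    # small index; the table maps the index to an actual (R, G, B) colour.

    clut = [
        (0, 0, 0),      # index 0: black (background)
        (255, 0, 0),    # index 1: red
        (0, 255, 0),    # index 2: green
        (0, 0, 255),    # index 3: blue
        # ... up to 256 entries for an 8-bit CLUT
    ]

    pixel_indices = [0, 1, 1, 3, 2, 0]        # one byte per pixel
    rgb = [clut[i] for i in pixel_indices]    # what the convertor displays
    print(rgb)

The saving is that each pixel needs only 8 bits (or fewer) rather than 24, at the cost of restricting how many different colours can appear on screen at once.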
VI-3.2 The Visual Output Channel

The visual output channel (screen) enables us to see information selected from the Model in visual form. This information is conveyed by means of phenomena we can see, including:
» colours
» shapes
» spatial arrangements such as distance, length, angle
» movements and other changes in the visual field.

Since the visual channel works by setting a bitmap of pixels to various colours, it can offer a wide range of visual phenomena, some purely sensory, some helping us to sense a spatial aspect, and some helping us to sense a kinematic aspect. Colours, shapes, spatial arrangements etc. are all seen by virtue of rendering the appropriate pixel cells with appropriate bit patterns to set the pixel colours. Movement and other changes are seen when two or more different bitmaps are downloaded by the display convertor, one after the other. This gives animation or flashing or changing colours.

The eye is excellent at receiving a huge amount of such information simultaneously. This is made possible by the parallel processing that occurs in the nerves of the eye and the visual cortex of the brain, and the tendency to detect and recognise patterns that have been learned. Learned patterns are those that we recognise without thinking, and include, for example:
» basic shapes such as a short vertical line, cross, circle, etc.,
» the shapes of letters and digits we learned in our early days,
» the basic shape and features of the human face,
» the basic outline of animals, trees, etc.,
» the green of vegetation, blue of sky, grey of clouds, dull colours of buildings, etc.,
» the flickering of flames,
» the three circles of red, amber, green of traffic lights,
» and so on.

How the neurones of the brain remember these is called 'long term memory', and can be read about in psychology textbooks.

Examples: See above.

Exercises: Over the course of a week, when using a large-screened device, notice how colours, shapes, textures, distances, movement, etc. are all used.

How do I design for this?: See below and later on 'Affordance'.

Going deeper (Extant ideas): -

VI-3.3 Making Up The Bitmap to Show Shapes on Screen

The following will represent (part of) a bitmap, 40 by 10 pixels, with dots showing one colour, usually the background colour.

Empty bitmap: blank screen:
................................
................................
................................
................................
................................
................................
................................
................................
................................
................................
................................

To show things on the screen, the colour of the appropriate pixels must be altered, which means merely altering the bit pattern for each pixel. The letter 'o' and other characters will be used to show a different bit pattern and thus a different colour. Which bitmap cells to change depends on the shape to be made.

Horizontal line on screen, going part way across. To do this: merely alter the bit patterns of a range of bitmap cells.
................................
................................
ooooooooooooooooooooooooo.......
................................
................................
................................
................................
................................
................................
................................
................................

Vertical line on screen, several pixels wide. To do this: alter the bit pattern of three cells every 40 cells.
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
.....................ooo........
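The cell arithmetic for these two lines can be written down directly as code. Here is a minimal Python sketch, assuming the bitmap is stored as one flat list of cells, row after row, with the width of 40 used in the text:

    # Bitmap as one flat list of cells, row after row.
    WIDTH, HEIGHT = 40, 10
    bitmap = [0] * (WIDTH * HEIGHT)   # 0 = background colour

    def horizontal_line(row, col, length, colour):
        """Alter the bit patterns of a range of consecutive cells."""
        start = row * WIDTH + col
        for i in range(length):
            bitmap[start + i] = colour

    def vertical_line(col, thickness, colour):
        """Alter 'thickness' cells every WIDTH cells, one group per row."""
        for row in range(HEIGHT):
            for i in range(thickness):
                bitmap[row * WIDTH + col + i] = colour

    horizontal_line(2, 0, 25, 1)   # the horizontal line shown above
    vertical_line(21, 3, 1)        # the vertical line shown above

The shapes that follow use the same idea, only the arithmetic for choosing which cells to alter becomes progressively more involved.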
Sloping line on screen. To do this becomes a bit more complex: alter the patterns of K cells (in this case 9), then skip over M cells, which is the width of the bitmap minus the overlap, minus K (in this case 40 - 3 - 9 = 28 cells), and repeat the process. K and M determine the slope of the line and to some extent make it a little thicker.
................................
................................
ooooooooo.......................
......ooooooooo.................
............ooooooooo...........
..................ooooooooo.....
........................oooooooo
................................
................................
................................
................................

Rectangle on screen. To do this, go to the cell that stores the top-left pixel and alter the bit pattern of W cells (the width of the rectangle). Then move on by the screen width minus W and alter the patterns of another W cells. Do this H times (the height of the rectangle).
................................
................................
................................
.......ooooooooo................
.......ooooooooo................
.......ooooooooo................
.......ooooooooo................
.......ooooooooo................
.......ooooooooo................
................................
................................

Multi-coloured rectangle on screen. For this, do as above, but each cell is altered to a different bit pattern (shown below by different characters; B = blue, Y = yellow, I = indigo, C = cyan, R = red, K = black, etc.). Multi-colour can be applied likewise to any of the shapes shown above and below. Of course, it adds complexity, in that not only must we work out which cells to modify, but also with which bit patterns to modify them.
................................
................................
................................
.......BYYYVIIYC................
.......BYRRVVVRG................
.......YYYYPPPOK................
.......YYPPPPOOO................
.......YWWWPPPMM................
.......WWRNMMMMM................
................................
................................

Oval or circle on screen. To do this is more complex still: which cells need their bit patterns changed must be worked out by trigonometry, and this takes time.
................................
................................
................................
..................oooo..........
...............oooooooooo.......
..............oooooooooooo......
..............oooooooooooo......
...............oooooooooo.......
..................oooo..........
................................
................................

Text on screen. To do this, a set of small bitmaps for all the possible characters in the alphabet is stored separately. There will be a different set for each different font or typeface that is to be made available.
..................................
..................................
..................................
..oooo.....ooo....ooo.............
......o...o...o..o...o............
..ooooo...o...o..ooooo............
..o...o....oooo..o................
...ooooo......o...ooo.............
..............o...................
...........ooo....................
..................................

Here is the font bitmap from which the above characters were copied:
.................................................................................
..............oo...................oo...............oo.............o.......oo....
..............oo...................oo..............o..............o........oo....
....oooo......oooo......oooo.....oooo.....ooo......o......ooo.....o..............
........o.....o...o....o........o...o....o...o....ooo....o...o....oooo......o....
....ooooo.....o...o....o........o...o....ooooo.....o.....o...o....o..oo.....o....
....o...o.....o...o....o........o...o....o.........o......oooo....o..oo.....o....
.....ooooo....oooo......oooo.....oooo.....ooo......o.........o....o..oo.....oo...
.............................................................o...................
..........................................................ooo....................
..................................................................................
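A sketch of how text might be put on screen from such a font: each character's small bitmap (glyph) is copied, cell by cell, into the screen bitmap at the current pen position, which then advances by that glyph's width. This is an illustration only; the tiny 3-by-5 glyph shapes below are stand-ins, not the font shown above.

    # Copy (blit) small character bitmaps into the screen bitmap.
    # Each glyph is a list of row strings; 'o' marks an ink pixel.
    glyphs = {
        'a': ["ooo", "..o", "ooo", "o.o", "ooo"],
        'b': ["o..", "o..", "ooo", "o.o", "ooo"],
    }

    WIDTH, HEIGHT = 40, 10
    screen = [[0] * WIDTH for _ in range(HEIGHT)]

    def draw_text(text, row, col, colour):
        """Blit each glyph, advancing by its width (proportional spacing)."""
        pen = col
        for ch in text:
            glyph = glyphs[ch]
            for r, line in enumerate(glyph):
                for c, cell in enumerate(line):
                    if cell == 'o':
                        screen[row + r][pen + c] = colour
            pen += len(glyph[0]) + 1   # glyph width plus one cell of space
        return pen

    draw_text("ab", 2, 1, 1)

Because the pen advances by each glyph's own width, this sketch is proportional; a fixed-width (monospaced) font would advance the pen by the same amount for every character, a choice discussed next.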
When making up text one has to decide: should each character be exactly the same width in pixels (fixed-width or monospaced font), or should narrow characters like 'i' and 'l' have fewer pixels of width and wide characters like 'm' and 'w' have more (proportional font)?

Kerning: Kerning means closing up characters in the text according to the shape of the characters, to avoid the feeling of too much space. For example, V and A can be further closed up when next to each other because their shapes 'fit' together.

Examples: See above. There are many classes of fonts.
» Much text is in fonts with serifs (small decorations at the ends of letters, which make them slightly easier to read as normal text, because the eye follows them to find the next letter).
» Some text is in sans-serif fonts, which lack these. These are useful for headings and where you need to see the digits or characters precisely, not as running text but as individual words or numbers. Example: the text on signs on British motorways, which is designed for clear reading at a glance.

Exercises:
» Get a strong magnifying glass and look at the screen. You will see the pixels, usually made up of triples of colour. Look at the part of the screen that has the icons of the tool bar and see how they are made up pixel by pixel.
» Then look at text of different fonts and see how each letter is made up pixel by pixel.
» Investigate different kinds of font (typeface). Either look up your computer's font store, or enter Wikipedia. Think where you might use each.

How do I design for this?: Read material on old (pre-computer) approaches to the design of fonts.

Going deeper (Extant ideas): -

VI-3.4 Sound Output

Sound output via loudspeakers is obtained by setting up something akin to a bitmap, except that it has one dimension, not two. It is a sequence of bit patterns interpreted as numbers. Each number expresses a voltage that should be sent to the loudspeaker, and the hardware converts the number's bit pattern to a voltage by a digital-to-analog convertor. The conversions occur at a rate of something like 30,000 per second, and generate a waveform that has sinusoidal character, alternating above and below zero. In a plot of such a waveform, the horizontal axis is time (milliseconds or so), and the vertical lines are the individual voltages, alongside which are the corresponding numbers; in the loudspeaker these voltages are smoothed out into a sine-wave.

A sine-wave by itself gives a pure musical note, and the time for a complete wave (zero, above, zero, below, zero) determines the pitch of the note. The magnitude of the wave determines the volume. If you zoom out of a waveform, you find what is called the envelope of the waveform. This refers to the overall shape it has, over a period of seconds usually, showing its volume (loudness). In practice, sounds are combinations of many sine-waves, and have a jagged look when displayed on an oscilloscope. Waves for speech are particularly jagged.
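To make this concrete, here is a minimal Python sketch that builds one second of a pure note as a byte stream, of the kind the sound convertor would download. The 30,000 samples per second comes from the text; the 440 Hz pitch (concert A) and the 8-bit unsigned sample format are illustrative assumptions.

    # Building a one-dimensional 'bitmap' of sound: a sequence of numbers
    # sampling a sine wave, which a digital-to-analog convertor would
    # turn into voltages.

    import math

    SAMPLING_RATE = 30000   # samples per second, as in the text
    FREQUENCY = 440         # pitch of the pure note, in Hz (assumed)
    VOLUME = 100            # magnitude of the wave (max 127 for 8-bit)

    samples = bytearray()
    for n in range(SAMPLING_RATE):   # one second of sound
        t = n / SAMPLING_RATE
        value = VOLUME * math.sin(2 * math.pi * FREQUENCY * t)
        samples.append(128 + int(value))   # centred on 128, above and below

    # 'samples' is now a byte stream the sound convertor could download.
    print(len(samples), min(samples), max(samples))

A real sound would add an envelope (scaling VOLUME up during the attack and down during the decay) and combine many such sine-waves, which is what gives recorded instruments and speech their jagged waveforms.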
Examples: -

Exercises: Get a number of sound samples and look at them with a waveform viewer. See the difference in waveforms for different musical instruments - e.g. the pure note of a bell, versus the sound of a violin, or the snare drum. Look at the differences between them, such as the shape of the waveform and the shape of the envelope. Then record some speech and do the same.

How do I design for this?: Record your own samples. Then use them as e.g. ring tones. Also, recorded speech is important in much multimedia.

Going deeper (Extant ideas): -

VI-3.5 Output channels: strengths and weaknesses

The three output channels have various strengths and weaknesses:
» The visual channel gives us colour and a very sophisticated sensory access to the spatial aspect, because the retina of the eye is two-dimensional, and also to the kinematic aspect, since the eye is very sensitive to visual changes. This allows great precision. It offers a high bandwidth (high rate of information flow) because the way our eyes process information is highly parallel (a lot of operations at one time). For this reason it is the main output channel for IS. But its main disadvantage is that it is directional - you need to be looking at the UI.
» The aural (audio) channel of sound is not directional: you can hear sounds that come from any direction. So the aural channel can grab the user's attention, using distinct sounds, and different distinct sounds can convey meaning. This is especially useful for telling the user there has been an error or some unusual event. Often users respond better to (short) sounds, e.g. sound bites of comments by satisfied customers. Sound is especially useful when used in conjunction with the visual channel, because it can convey emotion and cultural associations well, and also the passing of time. Sound can even convey some spatial awareness around the user, making them feel more 'inside' what is going on, which is useful for virtual reality. But there are several disadvantages of sound. Sound is transient, present with us at the time we experience it, but after that it is gone. To recall it requires exercise of our memory, and for most people audio is not as memorable as visual media. One answer is to install a replay facility. Files of sound samples can be large (though music files can be smaller when they are made up of instructions rather than samples, e.g. MIDI, SoundTracker). Also, sound output is a nuisance in an open-plan environment! It is imprecise, especially when there is a lot of other sound around.
» The haptic channel depends on our touching the actuator. Recent technology: force-feedback joysticks - and the vibrator in mobile phones. This channel is much more limited than the other two in terms of the information that can be transmitted through it. It is usually used, at present, to simulate or duplicate physical phenomena, maybe amplified or modified in some way. E.g. a robot in a hazardous environment: when it hits a wall, its controller feels an impact. E.g. a virtual surgeon probing an orifice: the user feels the constriction of the orifice.

And, of course, for some people, like the blind or deaf, certain channels are not available. The most fruitful strategy is to use the channels together, sound and perhaps touch providing an extra channel of information to assist the visual. Studies have shown that using sound and video together increases learning when the sound is closely related to the visual (e.g.
when text is read out loud), but greatly reduces learning when the sound is not related (e.g. in adverts on websites). Be careful, therefore.

Each channel offers a number of features that are meaningful in the psychic aspect, and which can be used to convey information. Note that the lists below are not full and complete: you should try to add other factors.

Examples: See above.

Exercises: If you are neither visually impaired nor deaf, become aware of your use of each channel. For example, notice the times when you have to keep looking at something on screen to see if something has happened, and where it would have been better to have had an audio signal when it happens. Notice the characteristics of the audio channel listed above, and some more, compared with the visual channel. If you have haptic output, also notice in what situations it is useful. If you are visually impaired or deaf, notice how the channel you use gives you the information.

How do I design for this?: Use appropriate channels - but also give options for alternatives.

Going deeper (Extant ideas): -

VI-3.6 Multi-Channel Output

At the psychic level the challenge that EwT/HCI designers have is to make all output realistic. One important challenge lies in bringing sound and vision together. The main issue at the psychic aspect is 'lip sync': the sound of speech must be accurately aligned with the movement of the speaker's lips if these are seen (or the sound of a hammer hitting an anvil must occur at exactly the same time as we see the hammer hitting it). If this is misaligned by even a few milliseconds, we notice it and feel uncomfortable. It is very difficult for computer systems to get the alignment so precise, because the screen typically refreshes no faster than every 20 milliseconds.

Another challenge is found in haptic output. A 'dataglove' can, in principle, make the hand that wears it feel anything. For example, to make the hand feel it is grasping a stick, gentle force would be exerted on the inner flesh of all fingers. Several difficulties arise. One is that if the fingers are in fact straight, then the user will not believe they are grasping a stick, so the haptic output must take account of haptic input indicating the current position of the fingers. Another, similar difficulty is that if, say, only two fingers are curled, then the stick-grasping force should be applied only to those fingers that are curled.

Examples: -

Exercises: When watching a film or news report, notice when lip sync is poor, and how annoying it is.

How do I design for this?: -

Going deeper (Extant ideas): See below.

VI-4. SOME EXPERIMENTS AND THEORY RELATED TO THE PSYCHIC ASPECT OF EwT/HCI

It is the field of stimulus-response psychology that has provided experimental results and theories to help us understand our psychic functioning more precisely. It investigates our ability to detect and recognise patterns (visual, aural, etc.), to remember patterns, and the time responses we have.

VI-4.1 Fitts' Law

One important result is Fitts' Law [Fitts 1954]; see also Eberts [1994, p.175]. How fast can a human being respond? A participant sits in front of a blank screen, controlling a mouse or some similar device. At a random time, in a random place on screen, a shape of random size appears. The participant must move the pointer to hit it as fast as they can. The time to hit correctly is measured (in milliseconds). The time to hit is found to be made up of four main components:

Time = k1 + k2 * log2( 2 * D / size )

D = distance between where the participant is aiming (e.g. the mouse cursor) and where the target appears.
size = diameter of the target.
k1 = a constant, which is a kind of minimum time (e.g. if a large target appears right where the cursor currently is), due to the participant's speed of noticing it and pressing a finger. k1 is different for each person.
k2 = a constant, different for each person, showing how much distance and size affect them.
'log2' refers to the logarithm to base 2: log2(1) = 0, log2(2) = 1, log2(4) = 2, and so on.
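As a check on how the formula behaves, here is a minimal Python sketch; the values of k1 and k2 are illustrative only, not measured data, since they differ from person to person.

    # Fitts' Law as a function, using the log2 form given above.

    import math

    def fitts_time(distance, size, k1=200.0, k2=100.0):
        """Predicted time (ms) to hit a target of a given diameter
        at a given distance from the current aim."""
        return k1 + k2 * math.log2(2 * distance / size)

    # Large target near the current aim: fast.
    print(fitts_time(distance=50, size=100))   # 200 ms
    # Small target far from the current aim: slow.
    print(fitts_time(distance=500, size=10))   # about 864 ms

Notice how the prediction grows with distance and shrinks with target size, which is exactly the trade-off the game-design example in the next section exploits.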
VI-4.2 Implications for EwT/HCI

Fitts' Law can be useful in designing or evaluating very dynamic, fast-moving interfaces, such as in computer games or simulations. For example, suppose, in a battle-terrain type of computer game, you are flying low in an aircraft over mountainous terrain. As you come up over a ridge, you have to spot enemy locations and fire at them before they fire at you. Fitts' Law would tell the game designer that if the target is large and near where your gun is already aimed, then the player will be able to aim faster and more easily, but if the target is small and not where the player is currently aiming, it will take longer. So, in earlier, easier levels of the game, it is sensible for the computer to present large targets near the present aim, but on later, harder levels, to present smaller targets away from where you are aiming. A good game is challenging but not impossible, so the game designer needs to know how large the target needs to be and where to place it. Fitts' Law can help her/him work this out. This can be combined with effects that are understood under the analytic aspect, such as the number of things a person can be aware of at once: the more enemies that appear, the more difficult the level. See below.

VI-5. ANTICIPATING THE ANALYTIC ASPECT

What the user experiences in their seeing, hearing, pushing, touching, etc. usually carries some meaning as types of data, which is how the EwT/HCI and UI are viewed at the analytic aspect. As discussed later, each basic kind of psychical phenomenon 'affords' a different kind of data; for example, length (e.g. of a bar in a bar chart) can afford the meaning of 'amount'. In anticipation of this, we will identify the main basic psychical phenomena: visual, aural, haptic and input.

VI-5.1 Anticipating Analytic Aspect Visually

The visual output channel (screen) can show a variety of visual phenomena that can carry information. Here we list them, so as to reference them later. They fall into three main groups.

# Purely sensory visual phenomena, to do with sight and colour:
» hue (red, brown, green, etc.)
» saturation (how much white is mixed with the colour, e.g. pink is red with some white)
» brightness (also called value)
» texture (patterns of colour, such as grids)
» background and overall colour scheme
» what colours are available (called the 'palette')

# Visual phenomena serving the spatial aspect: the eye sees whole shapes, which are made up of groups of pixels all of the same or a related colour. For example, if the same 100 pixels in each of the first 50 rows emit red light, then the user sees a red rectangle. With this in mind, the arrangement of pixels of certain colours can make the user see the following:
» shapes of any type, some of which we might recognise (such as the shapes of the letters of an alphabet, i.e. a font)
» size or length of shapes
» position of shapes
» distance between shapes
» orientation of shapes
» angle subtended by shapes, especially lines
» spatial alignment of shapes, e.g.
vertically above each other
» shapes touching or connecting with each other
» spatial patterns like surrounding or overlapping
» perspective (smaller shapes seeming more distant)
» background
» multi-colour scenes like photographic images

# Visual phenomena serving the kinematic aspect: the computer alters the colours of pixels in a precisely timed way, which gives the impression of movement or change.
» Movement of objects across the screen
» Speed
» Direction
» Relative motion
» Changes in size
» Flowing motion, field motion
» Colour changes, flashings
» Morphing from one shape to another

VI-5.2 Anticipating Analytic Aspect Aurally

The aural channel works by setting up waveforms in the computer's memory, which are sent to the loudspeakers by means of specially designed electronics. Each sound starts, is sustained for a time, then fades away. In musical instruments the start is called the 'attack', and the fading, the 'decay'.
» Sounds in general, which have the following qualities:
  » Volume of sound
  » Pitch of sound (high, low; which note is played in music)
  » Quality of sound, such as whether pure or fuzzy; vibrato
  » Rates of attack (build-up) and decay (fading)
» For music, or any combination of sounds in sequence, we have in addition to the above:
  » Chords, discord
  » Melody
  » Rhythm
  » Tempo
» For speech, which is a sequence of phonemes (the basic sounds of speech), we have the following qualities (see below):
  » accents
  » gender
  » emphasis
  » rising and falling of the pitch and volume
  » rhymes

VI-5.3 Anticipating Analytic Aspect Haptically

The haptic channel works by controlling a mechanical actuator in contact with the skin. It offers:
» The feeling of resistance, e.g. as the user tries to move a joystick
» An impulse (kick) given by the computer
» Vibrations of various types
» 'Texture' of surface (e.g. sandpaper, cloth, shiny metal)
» A feel of pressure, springiness, softness, etc.

Input channels offer the following:
» Press
» Click
» Move
» Push
» Speak
» Take picture or video

Each of these will be referred to later.

Copyright (c) Andrew Basden & Janice Whatley. 16 September 2008, 18 October 2008, 3 September 2009, 22 September 2009, 25 November 2009, 20 September 2010, 14 September 2011, 14 August 2012, 17 September 2012.