CHAPTER III. HCI - HUMAN-COMPUTER INTERACTION AND USER INTERFACE

In this chapter we look at the interaction between the human user and the computer or other information system, mostly considering the user interface (UI). We will look at each aspect in turn.

1. OUR APPROACH TO UNDERSTANDING HCI

The danger in understanding our interaction with the computer (HCI and UI) is that we focus on one or two aspects to the detriment of others. For example, many websites look great (aesthetic aspect) but are so badly structured (formative aspect) that you just cannot get the information you want. Have you ever got to a place on a shopping website where it gives you a message and you don't know what to do? To overcome this danger, we look at each and every aspect of the interaction between human and computer, the HCI. Doing so helps us recall all the various types of things that are important in successful human interaction with computers and other IT such as mobile phones, whether this be in user interfaces or multimedia.

1.1 Overview of Aspects of HCI

In the Human Experience chapter we gave an overview of aspects of the UI or HCI:

» Quantitative aspect: Amount and number of interactions and devices.
» Spatial aspect: Spatial arrangements, location and size.
» Kinematic aspect: Movement.
» Physical aspect: How both the UI and our bodies engage physically: forces, friction, light, vibration, etc.
» Organic (biotic) aspect: How the user interface matches our organs such as eyes, ears and hands, and whether it affects our health.
» Psychic aspect: Seeing colours, shapes, movement etc. on screen, hearing sounds, feeling vibration etc., controlling mouse, keyboard, etc.
» Analytic aspect: Identifying that the shapes and sounds are expressing concepts, and what type they are.
» Formative aspect: The structure of this information.
» Lingual aspect: What the information means, its content.
» Social aspect: The cultural connotations and acceptability of the information.
» Economic aspect: The limited resources of the UI and HCI.
» Aesthetic aspect: The design style of the UI and HCI: visual, aural and haptic, and how they harmonise; 'nice' touches.
» Juridical aspect: How well the UI does justice to the users or to the meaning of the information.
» Ethical aspect: The 'generosity' (or otherwise) of the UI.
» Faith aspect: What is the deep motivation behind the UI?

We will look at each aspect of HCI in detail in turn in this chapter.

1.2 Input and Output: Model, View, Controller

From several aspects, our interaction with IT may be seen as input and output. Input is where we give information to the computer or do something that it responds to, and output is where it gives information to us or does something to which we respond. Examples of input are where we double-click on an application, type text, operate a slider with the mouse to increase the volume of music, or thumb across our mobile phone screen to get to the next photo. Examples of output that the computer might give in response include: the window of the application appears, the words we type come up on screen and spelling errors are indicated, the volume of music increases, and the next photo slides into view. It is traditional to call the devices of the computer and its software that accept the input the 'Controller', and those that provide the output the 'View'. Behind these, 'inside' the computer, is the 'Model', which contains all the information received from the Controller and from elsewhere, and which is expressed in the View. Input and output, Controller and View, differ, and the characteristics of each suit the capabilities of both computer and human.

» Input from human to computer (via the Controller) tends to be slow and simple - an information rate of a few tens of pieces of information per second.
This suits the human because our ability to send messages to the computer is limited, and it suits the computer since its ability to recognise what the user wants is limited.
» Output from computer to human (via the View) is fast and complex: a screenful of information (hundreds of bits of information) can be given several times a second (such as in a fast-moving game). This suits the human since we can recognise and collate information, especially via our eyes, very fast, and it suits the computer, which can display or emit information very fast.

The View need not be a single window, but several. In fact, the View need not be just a screen, but can also involve sound output (loudspeakers) and other channels. The Model is the store of information 'in' the computer, and the View shows some of this information. The Model usually contains more information than is shown in the View. The Controller modifies some of the information held in the Model.

» For example, the Model might be a database of information about medical patients, and the View might show some of the data in one patient's record. Suppose the user issues a 'Delete' command to remove the record for the patient whose information is displayed in the View. The record is deleted from the database. Then the View is updated to show that the record has been deleted (for example, to show some information from another record, or to show a message saying "Record has been deleted").
» Suppose you are browsing a web page. You click on a hyperlink. The Controller works out that it is a request to find and display another page, and sends instructions to the Model (which happens to be elsewhere on the World Wide Web) to find that page. The page arrives, and the View is updated to show this page. Note that the Model need not be in the user's computer: it could be distant on the Internet.
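The Model-View-Controller division can be sketched in a few lines of code. This is a minimal illustrative sketch only, loosely following the patient-records example above; all class and record names here are invented for the illustration.

```python
# Minimal Model-View-Controller sketch (all names are illustrative inventions).

class Model:
    """Holds all the information - usually more than the View ever shows."""
    def __init__(self):
        self.records = {"P001": "Alice", "P002": "Bob"}

    def delete(self, record_id):
        self.records.pop(record_id, None)

class View:
    """Expresses some of the Model's information to the user."""
    def show(self, model, message=""):
        return f"{len(model.records)} record(s). {message}".strip()

class Controller:
    """Accepts user input and modifies the Model accordingly."""
    def __init__(self, model, view):
        self.model, self.view = model, view

    def handle_delete(self, record_id):
        self.model.delete(record_id)        # first change the Model...
        return self.view.show(self.model,   # ...then update the View
                              "Record has been deleted")

model, view = Model(), View()
controller = Controller(model, view)
print(controller.handle_delete("P001"))  # -> "1 record(s). Record has been deleted"
```

The point of the sketch is the division of labour: the Controller never draws anything itself, and the View never changes the data; each change flows Controller → Model → View, just as in the delete and hyperlink examples above.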
Problems with input - such as hitting the wrong key, or keys getting stuck, or giving the wrong command - are different from problems with output - such as misunderstanding what is on the screen, or not being able to see it properly. The difference between input and output occurs in the biotic to formative aspects. In other aspects, input and output are merged into our overall interaction. In most aspects we will discuss not only what occurs in our interaction as seen from that aspect, but also what challenges and problems there might be for user interfaces and multimedia. We will take the aspects in three groups:

» Pre-lingual aspects, which support the lingual.
» The lingual aspect, which is the most important aspect of HCI and forms a link with ERM.
» The post-lingual aspects, which affect the style and success of the HCI.

2. PRE-LINGUAL ASPECTS OF HCI

Now we go through the aspects of HCI in detail. We are using the aspects not so much as categories, but rather as a way of separating out the issues that are important in HCI and UI. Many discussions of the issues in HCI focus only on certain aspects and forget others. We will cover some aspects in more detail than others. We will look at technologies and techniques in each aspect. We will also look at quality criteria in some aspects (what makes a UI or HCI good or bad in that aspect) and various kinds of error that might afflict use of computers; each kind is usually explainable in one aspect.

2.1 Number of Things to Interact With (Quantitative Aspect of HCI)

This concerns amounts and counts of things. For example, how many windows are open? Some applications open several windows, to show several different things. Example: MSWord has the main document window and a window showing styles. Example: the Imagine 3D virtual reality creator has four windows, showing four views of the scene being created. So the quantitative aspect here refers to the number of windows.
But when we click the mouse or press a key on the keyboard, that should usually go to only one window. So the quantitative aspect here refers to one (1). But why count things? Usually for some other reason, relating to another aspect. By itself, the quantitative aspect of HCI (e.g. counts of things in the interaction) has little meaning. Rather, the counting of things is usually a prelude to considering another aspect. For example, Miller [1956] published a paper about the number of things users can keep in mind at one time ('The Magical Number Seven, Plus or Minus Two'). It is the number of 'chunks' of things on screen which the user is expected to be aware of; we will meet Miller again in the analytic aspect.

2.2 Screen Layout (Spatial Aspect of HCI)

The spatial aspect of HCI is particularly important on screens. For example:

» The layout of the screen: where things are, and where we expect them to be; for example, navigation links on a website are collected together in one place.
» The shape of things on screen. Usually rectangles for pictures. But also the shape of icons can help us recognise and locate them quickly.
» Spatial relationships and arrangements on screen. For example, we expect that things that line up above each other have something in common, such as in a list or table.

The importance of these is related to other aspects, as will be explained later. The mouse is excellent at functioning spatially: the mouse pointer indicates an exact position.

3D and 2D Space: Think about a virtual reality scene (e.g. in a 3D computer game) on your screen. There are two spatial aspects here: the positions and shapes on the screen itself, which are all in two dimensions, and the scene, which is in three dimensions. These are both spatial aspects, but one is HCI and the other is ERM.

» The two-dimensional space on screen itself is HCI.
» The three-dimensional space of the virtual scene is ERM, because it is what the information is about, namely 3D space.
However, in a two-dimensional game or a map, both HCI and ERM are two-dimensional - which can sometimes lead to confusion. The spatial aspect is very important in HCI, but we usually need the visual psychic channel to see it; see below.

2.3 Animation (Kinematic Aspect of HCI)

The kinematic functioning at the UI is particularly evident in animation. Our visual psychic channel is particularly sensitive to movement, so we tend to notice it. So movement is often used to attract (distract!) attention. However, that is a matter for the psychic aspect below. The kinematic aspect itself is concerned with movement as such, for example:

» Movement of mouse and mouse pointer.
» Movement of objects across a screen (e.g. the piece of paper in the Microsoft copying facility).
» Movement of our view across a landscape, e.g. as though we are flying across it.
» Flowing movement, e.g. of fluids in pipes.
» Non-visual movement: in music there is movement through the piece from beginning to end; in a document there is movement from beginning to end for the reader.

Some of these will be picked up again in the psychic functioning in the UI below. The kinematic aspect is relevant to HCI in at least three ways. It can be used during visual output as animation to attract attention (such as those annoying advertisements!). It can be used as decoration, to make the visual interface more 'lively'. But more important than either of these is the use of movement to let the user know what is going on. For example, in user interfaces of the past few years, windows on screen have not just appeared, but have moved into view, expanding from where the user clicked the mouse, or have moved out of view, such as quickly shrinking down to a small icon. Such movement provides subliminal information to the user about what the computer or mobile phone is doing, and this provides comfort.

2.4 Hardware Materials: Physical Functioning

This is the aspect of what is often called the basic technology.
It concerns, for example:

» materials,
» electricity,
» magnetism,
» light,
» vibration,
» shocks,
» and the like.

For example, your visual output might involve the physics of electron beams travelling through a vacuum tube to hit phosphor dots on a glass surface, causing them to emit light: the cathode ray tube (CRT) used in most screens until a few years ago. LCD screens, such as in a mobile phone, operate by other physical principles, specifically altering the orientation of complex crystals (LCD: liquid crystal display) so that they let light pass, or not. CRT and LCD have one thing in common: both produce colours by means of triples of light-emitting dots, red, green and blue; by giving out different amounts of these three colours, almost all possible colours can be generated. Loudspeakers work by converting electrical signals into vibrations in the air, either by electromagnetic or piezoelectric effects (applying an electric field to certain crystals causes them to shrink).

Only seldom do we need to actively consider the physical aspect of HCI, because in normal circumstances the physics works so well and reliably that we can take it for granted. Giving attention to the physical aspect is useful for at least two reasons. One is that it enables us to understand how things work, so we can perhaps understand better what is required, such as ruggedized equipment, or equipment that must work in exceptional physical conditions such as in space. The other important reason why we should be aware of the physical aspect of HCI is when things go wrong. When things go wrong, it is useful to know why they might have gone wrong, and how to prevent things going wrong in the future. Here are some examples:

» Power cuts!
» When your mouse ball is on a slippery surface it works unreliably.
» Jam sandwiches have made your children's hands sticky just before they use the computer, so the mouse and keys end up all sticky.
» When the internal mechanism of some keys or buttons is worn, or springs become weak, they don't work reliably.
» Coffee spilled on the keyboard is not good for it!
» Heat melts the case of your mouse or keyboard, distorting it.
» If your data is stored on magnetic disk (e.g. floppy disk), the data can be lost if a magnet gets near it.
» Bending a CD or DVD destroys it.
» Physical shock, such as dropping your computer or mobile phone, can make it go wrong.
» Overheating can make it malfunction, so do not block air vents.
» A lightning strike can destroy the electronics of your computer.
» Fire can burn it all.

The moral of all this is: ensure you keep good backups, and take care of your equipment.

2.5 Hardware 2: Devices Matching Bodily Characteristics (Biotic-Organic Aspect of HCI)

In HCI, the biotic/organic aspect is concerned with the actual hardware devices that engage with our sense organs - eyes, ears, hands etc. - regardless of what physics they employ. In Figure 1 we have:

» The ears and the loudspeaker,
» The eyes and the screen,
» The hand or fingers and the mouse, keyboard or touch-pad,
» The mouth and vocal organs and the microphone.

{*** Example: Think about your mobile phone, and how well or badly it fits your hand, fingers, and the distance between ear and mouth. These issues are of the organic/biotic aspect. ***}

From this aspect it matters little what material the mouse is made of; what matters is whether it fits the hand well: imagine a mouse the size of a desk - it would be unusable as a mouse!

Figure 1. Computer with input and output hardware

The organic/biotic aspect is also the realm of electronics (rather than electricity). Seen from this aspect, the various devices work by electromotive force (EMF, measured in volts) and currents (measured in amps) operating in conductors and components. Much of this is digital electronics, in which the EMFs (voltages) are limited to two values, such as 2.7v and 3.3v, or 0v and 5v.
These represent the binary alternatives of on and off (or 1 and 0), when seen from the psychic/sensitive aspect, later. But in the UI devices there is also some analog electronics, in which a continuous range of EMFs is operative. For more on this, see books on computer electronics. Here we will look only at the larger-scale devices. So Figure 1 shows the electronics that serves these hardware devices (sound hardware to convert digital voltages into analog for the loudspeaker, display hardware to convert digital voltages into analog to drive the thousands of microscopic light-emitting devices that make up the screen, and an analog-to-digital convertor for the microphone). These are linked to the central processing unit (CPU) and memory of the computer by conductors (the bus is a multiple conductor).

2.5.1 The Organic Aspect of Model, View and Controller

If we see the computer and its user interface in terms of model-view-controller, then the model in this aspect is the innards of the computer, including the printed circuit boards of the main memory and central processing unit, and the disks. This is shown in Figure 1. Three different channels of output (view) hardware link with three different human organs:

» Visual channel, relating to our eyes.
» Aural channel, relating to our ears.
» Haptic channel, relating to our body (often our hands).

The input (controller) hardware usually links with our hands and fingers, though there is also some sound input via microphones. The view (output) consists of the screen and the electronics responsible for the visual display on screen, the loudspeakers and sound electronics, and the force-actuators and the electronics that controls them.

» The visual display electronics consists of the display hardware (e.g. a graphics card), which accesses some of the main computer memory and converts the digital electric charges it finds therein into analog voltages to drive tiny light-emitting cells.
These cells emit light that is either red, green or blue, and the intensity of the light is controlled by the EMF (voltage) fed to them. The greater the EMF, the brighter the light emitted. As the EMF varies, so the amount of light varies. The tiny light-emitters are grouped in triples (red, green, blue), and there are typically a million such triples arranged in an array in a modern visual display, and 200,000 in a mobile phone display.
» The sound electronics consists of sound hardware (e.g. a sound card) that converts some of the electric charges found in the computer's main memory into a stream of analog current that is fed through the coils of loudspeakers. This current alternates rapidly, at frequencies usually between 300 and 3,000 times per second, and these make the coil and cone of the loudspeakers vibrate at a frequency that is audible to the human ear.
» The haptic force actuators press on our hands as vibration, or control movable seating (such as in immersive cinema). The electronics that controls this receives varying EMF from the central processing unit, and converts this into powerful currents sent through coils operating in magnetic fields, which create movement. (It is similar to loudspeakers but operating at lower frequencies.)

(The purpose of these tiny triple light-emitters cannot be understood from the point of view of the organic aspect, but only from the point of view of the psychic/sensory aspect. From that aspect we note that each different combination of red, green and blue light gives us the sensory experience of seeing a different colour. Most colours apart from flesh tones can be faithfully composed in this way. This illustrates how the biotic/organic aspect anticipates later aspects.)

The controller (input) consists of the input devices like mouse, keys and touch-sensitive screen, and the electronics responsible for linking these with the computer.
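The red-green-blue composition described above can be sketched numerically. This is an illustrative sketch only: the 0-255 scale and the packing of the three amounts into one number are common conventions assumed here, not something the text specifies.

```python
# Illustrative sketch of additive RGB composition: a colour is a triple of
# red, green and blue amounts (here on the conventional 0-255 scale).
def pack_rgb(red, green, blue):
    """Pack an (R, G, B) triple into one 24-bit number, 8 bits per component."""
    return (red << 16) | (green << 8) | blue

yellow = pack_rgb(255, 255, 0)    # equal red and green, no blue
white  = pack_rgb(255, 255, 255)  # equal, high amounts of R, G and B

print(hex(yellow))  # 0xffff00
print(hex(white))   # 0xffffff
```

With 8 bits per component there are 256 x 256 x 256, about 16 million, distinct triples, which is where the '16 million colours' figure later in this chapter comes from.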
2.5.2 Input Hardware Devices

We list input devices in approximate chronological order, older first (but sometimes still used). We indicate in bold text what the user does with each, but strictly each of these is of the psychic aspect.

» Switches and plug boards: Early computers received their information by people setting switches or plug-boards (boards with lots of holes into which plugs were inserted). These operate by making contact between circuits of the computer. Suitable for manufacturing process control, e.g. measuring temperature, pressure, liquid level. Users throw switches or plug the board holes.
» Paper tape reader: reads holes in punched paper tape by means of photoelectric cells. Tape can be any length. People have to punch the tape ahead of it being read and feed it into the tape reader. Suitable for batch, not interactive, input.
» Card reader: reads holes in punched cards, similar to paper tape, but each card has 80 columns. People have to punch the cards ahead of them being read and feed them into the card reader. Suitable for batch, not interactive, input.
» Swipe card reader: modern version of the punched card reader, where the information is held not by punched holes but in magnetic strips or in bar codes that are read by lasers. The user either swipes the card or holds it for the laser to read (as in supermarket checkouts).
» Keyboard: Human user presses or hits keys and these send electric pulses to the computer. Suitable for interactive input. The original version (1970s Teletype) was like an electric typewriter; today's versions have much more sensitive keys.
» Joystick: User pushes a stick in one of 8 directions; this causes various switches to close and send electric pulses to the computer, a different pattern of pulses for each direction. Some 'analogue' joysticks also send pulses that indicate how far the stick is pushed. Most joysticks also have a couple of keyboard-like switches that can be pressed or hit.
» Mouse: User moves the mouse, and this sends a stream of electric pulses to the computer that indicate how the mouse moves in two directions. (By this means, in its psychic functioning, the computer keeps track of where the mouse is.) Like the joystick, the mouse usually also has two keyboard-like switches that can be pressed or hit.
» Trackball: Like an upside-down mouse, with a ball which the user moves around in any direction; the trackball sends a stream of electric pulses to the computer that indicates its movement in two directions. One advantage over the mouse is greater precision for fine movement.
» Touch-pad and touch-screen: The user touches or strokes a pad (e.g. on a laptop) or the screen; this detects where on the pad or screen the finger was placed, and sends a stream of electric pulses to the computer whose pattern indicates this position. Used almost like a mouse.
» Microphone input: The user speaks, and the sound waves detected by the microphone are converted first to electric waveforms, which are then converted to streams of (digital) pulses that are sent to the computer.
» Camera: The user points the camera at a view. Light from the view is focused by a lens on an array of light-sensitive cells. These (or the electronics connected to them) emit pulses which are sent to the computer; typically 10 million pulses per picture.
» Direct connection to nerves: Tiny electrodes are inserted in nerve cells and pick up their electrical activity. When the user thinks, these nerve cells may be activated, and this activation is converted to electric pulses that are sent to the computer. This type of input device is still only experimental.

Thus all input devices except the first (switches and plug-boards) send electric pulses to the computer. What the computer does with these cannot easily be described from the point of view of the biotic (hardware) aspect, but makes sense only at the psychic aspect; see below.

2.5.3 Output Hardware Devices

» Screen: An array of tiny pixels (e.g.
1280 by 1024) that emit light of various colours. This can be: phosphorescent dots in a cathode ray tube, liquid crystal cells (LCD) that filter light to various colours, or plasma emitters. This gives output for the visual channel.
» Loudspeaker: A strong, light paper or plastic cone that vibrates at frequencies of up to 20,000 times a second to cause air vibrations that impact on our ears. The cone is vibrated by alternating electric currents running through a small coil suspended in a strong magnetic field. These alternating currents are created, via an electronic device known as a digital-to-analogue convertor, from electric pulses sent from the computer. Typically the computer sends 0.5 million pulses per second to each speaker (16 pulses 30,000 times a second). This gives output for the aural channel.
» Force output device: The force-feedback joystick not only sends pulses to the computer, but also has tiny powerful electric magnets attached that can be activated by the computer to provide force on the user's hand holding the stick. The force can be steady, vibratory or an impact. The electric currents that cause the force are converted from streams of electric pulses sent from the computer. Another form of force output device is the 'dataglove'. This glove has a number of cells that exert force at various points of the hand, in an attempt to make the hand feel as though it is touching something. This gives output for the haptic channel.
» Direct connection to nerves: The computer sends electric pulses to tiny electrodes implanted in nerve cells, and this activates those nerve cells, causing the user to be aware of various things such as snatches of music or a feeling of sadness. This type of output device is still only experimental, and there are many safety issues to address.

The electric pulses the computer sends to these devices cannot be properly understood until we take account of the psychic aspect.
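The loudspeaker figures quoted above can be checked with a little arithmetic. The sketch below simply reproduces the text's numbers (16 pulses per sample, 30,000 samples per second, per speaker); the two-speaker total is an illustrative extension, not a figure from the text.

```python
# Checking the loudspeaker data rate quoted above:
# 16 pulses (bits) per sample, 30,000 samples per second, per speaker.
bits_per_sample = 16
samples_per_second = 30_000

pulses_per_second = bits_per_sample * samples_per_second
print(pulses_per_second)      # 480000 - roughly the '0.5 million' quoted

# For two speakers (an illustrative extension, not from the text):
print(2 * pulses_per_second)  # 960000 pulses per second in total
```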
2.5.4 Challenges and Problems

The kinds of challenges and problems explainable at this level are those to do with the hardware, our fingers etc., and the electronics. Here are some examples:

» Loose connections! Devices plug into the computer - keyboard, mouse, loudspeakers, screen, etc. - and if a connection is dirty, it won't work reliably. This is particularly important in the most demanding public multimedia, where the problems of loose connections cannot be tolerated.
» Our fingers get onto the wrong keys, and so we mis-key.
» The computer's main memory has a maximum frequency (speed) at which it can deliver, or receive and store, the electric charges by which it works. They are delivered via a special set of conductors called a bus, or by 'direct memory access'. The central processor, display hardware and sound hardware must all share this bus. Sometimes up to half the available frequency is used by the display and sound hardware, leaving only half for the central processing unit - this slows the CPU down by a factor of two, or even more. This is particularly important in high-quality multimedia that has a lot of high-definition animation and sound.

2.6 Seeing, Hearing, Feeling, Moving (Psychic Aspect of HCI)

(In old versions of this lecture material, this was called the bit level.) It is with our psychic functioning that we see, hear, feel and respond with motor control: it is sometimes called the sensitive or sensorimotor aspect. It is the aspect of functioning that humans share with animals, of raw, uninterpreted sensations, but it is also the aspect of pattern detection and pattern recognition. It is the aspect of behavioural psychology, of stimulus and response. It is with the psychic aspect that the issues of HCI first become many and varied. This version of the section consists mainly of lists, which will be referred to in the lectures.
In this aspect we look at our interaction with the computer in terms of things that our sensory capability finds meaningful - such as colour, sound, shape and impulses - without beginning to interpret them. Thinking about the UI and HCI in this way helps us understand some basic psychological facts, such as how easily things are recognised, how to attract attention, and how quickly the user may be expected to respond to things.

2.6.1 The Psychic Aspect of Input/Output Devices

From the point of view of the psychic aspect we no longer speak of the hardware in mechanical, electronic or organic terms; we speak about signals and sensory phenomena. For each such signal (psychic aspect) we can use a variety of hardware devices (organic aspect), and each of those can work by different physical laws (physical aspect). For example, to send a signal about the position of the user's hand or finger (e.g. to control the screen pointer), we can use devices like the mouse, touch-pad, touch-screen or trackball. Figure 2 shows the signals that move around the computer in both input and output. You might notice that it is very similar to Figure 1, but there are subtle differences because, whereas Figure 1 interpreted it from the organic aspect, Figure 2 interprets it from the psychic/sensory aspect. The main differences include:

» Instead of naming human organs like eyes and ears, we name human sensory-motor activity like seeing and hearing.
» Instead of naming 'hardware', we name the devices 'convertors'.
» Instead of EMF, we have signals.
» The links between CPU, memory and UI devices are no longer conductors with varying EMFs, but are (bi-)directional channels, along which signals and byte streams are sent. So these channels have arrows which show in which direction(s) signals are sent. For example, mouse clicks are sent from mouse to CPU. Some links have signals in both directions (see below).
» The bus now transmits bulk streams of bytes to and from memory.

Figure 2.
Computer with input and output at sensory aspect

2.6.2 The Output Devices: Screen and Loudspeakers

Typically, the screen convertor is sent a 'start' signal with the address in memory of a stream of bytes which should be downloaded; this stream of bytes is called a bitmap. The screen convertor then begins downloading these bytes from memory (via the bus), treating them as bit patterns that indicate the colours of pixels. Once it has received all the bytes it needs, it might send a 'finished downloading' signal back to the CPU. It then sets all the pixels to the indicated colours, usually horizontally, line by line. Once it has finished all lines, it might send a 'finished displaying' signal to the CPU (and the program the CPU is running will then send the next 'start' signal, perhaps with a different address, to start downloading a different screenful). The sound convertor works in a similar way, but instead of setting pixels to various colours, it sends sound waveforms to the loudspeaker. The stream of bytes in memory is called a sample or waveform, rather than a bitmap, but as far as the memory is concerned they are the same - just a long stream of bytes set to various bit patterns.

Consider the bitmap, for example. Typically, the screen will have 600 rows each of 800 pixels (that is, 480,000 pixels), or 1024 rows each of 1280 pixels (1,310,720), or some other number. The bitmap consists of the appropriate number of cells, each of which holds a bit pattern, and each of which is the same size, for example 8 bits (one byte). Each cell corresponds to one pixel on the screen. The bit pattern in a cell indicates the colour that the pixel should show. So, for example, the bitmap for a 1280 by 1024 screen would consist of 1,310,720 cells. This is the number of bit patterns of which the bitmap consists.
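The cell-per-pixel layout just described can be sketched directly. This is an illustrative sketch, assuming one byte per cell and the usual row-major cell ordering (cell index = row x columns + column); the bit-pattern value set at the end is an invented placeholder, not a real colour code.

```python
# A sketch of the bitmap described above: one cell (here one byte) per pixel.
ROWS, COLS = 1024, 1280          # a 1280 by 1024 screen
bitmap = bytearray(ROWS * COLS)  # all cells initially hold bit pattern 00000000

print(len(bitmap))               # 1310720 cells, one per pixel

def set_cell(row, col, bit_pattern):
    """Store a bit pattern (the pixel's colour code) in the corresponding cell,
    assuming row-major ordering: cell index = row * COLS + col."""
    bitmap[row * COLS + col] = bit_pattern

# Set the pixel at row 512, column 640 to an arbitrary example pattern.
set_cell(512, 640, 0b01100100)
```

It is exactly this long run of cells that must be streamed to the display convertor on each 'start' signal, which is what the following arithmetic is about.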
This means that, for a 1280 by 1024 screen in which each cell occupies one byte, 1,310,720 bytes must be streamed from memory to the display convertor each time the CPU says 'start'. This must happen typically 50 times per second. That means that, for a screen of 1280 x 1024 resolution, the display convertor must download 50 times 1,310,720 = 65,536,000 bytes (65MB) per second from memory. This is a not insignificant proportion of the maximum data speed of the bus and memory. While the display convertor is downloading bitmap byte streams, neither the CPU nor anything else can access the memory. This slows down the CPU's speed of operation. It gets worse. Sometimes each pixel colour is represented by not 8 but 32 bits (4 bytes), so that over 260 million bytes must be downloaded per second. This slows the CPU down even more. The sound convertor also makes a drain on memory and bus (especially for high-quality quadraphonic sound), though usually not as high as the display convertor. Typically, the CPU can be running at half speed because it has to share the bus and memory with these convertors. In some computers and hardware configurations, steps are taken to reduce this bus congestion. One way is to have separate graphics memory, as graphics cards do. The bitmaps to be displayed are copied from main memory into graphics memory when the screen must change. In the old Amiga computer, the graphics memory is not separate, but is part of the main memory; however, there is also a part of main memory to which the convertors have no access, so in that memory the CPU runs at full speed. With these arrangements of separate or special graphics memory, the bus transmits bitmap streams only when the screen must change. So, for example, with a screen showing MSWord and the user typing at one character per second, the screen changes only once a second, and the change is tiny. But during fast animations, the screen changes up to 50 times a second.
In such cases, graphics cards offer little benefit (though the shared memory of the Amiga does still offer a benefit).

2.6.3 Rendering

Each bit pattern indicates the colour the corresponding pixel should show. Suppose, for a certain one-byte-per-pixel 600 by 800 convertor, bit pattern 00101100 means white, 01100100 means red and 00001111 means black (I selected these bit patterns at random). Then:
» If the whole bitmap held the pattern 00101100 repeated 480,000 times, the entire screen would show white.
» If the whole bitmap held the pattern 00001111 repeated 480,000 times, the entire screen would look black.
» If the first 120,000 bitmap cells held 00101100, the next 120,000 cells held 01100100, and the remaining 240,000 cells held 00001111, then the screen would show a horizontal bar of white, a bar of red and then a wider bar of black.
» If most of the 480,000 cells held 00001111 but some in the middle held 01100100, then we would see a black screen with some red somewhere in the middle. By calculating which cells should hold 01100100, we can ensure that the red in the middle of the screen shows shapes that we wish, such as lines, rectangles, circles, spirals, or other complex shapes.
That is the principle of rendering: calculate which cells of the bitmap should hold which bit patterns. It is the CPU that sets the bit patterns into the memory that holds the bitmap, and the program which the CPU obeys determines which cells are set to which bit patterns. Sometimes a pixel requires 4 bytes and at other times only 1 byte. This is because there are two main ways to indicate pixel colours by means of a bit pattern.
» Direct RGB. The colour of the pixel is determined by how much red, green and blue light is emitted. For example, yellow is seen by us if equal amounts of red and green are emitted with no blue. White is seen by us when there are equal, and high, amounts of R, G and B.
In the direct RGB mode, the bit pattern contains three numbers indicating the amounts of R, G and B which the pixel has to show. Usually this is 8 bits (1 byte) for each colour component (with a fourth byte for 'transparency', which we do not discuss here). 8 bits allows 256 different levels of each colour component, from none (0) to full (255).
» Colour-lookup table (CLUT). The colour lookup table holds RGB triples for a limited selection of colours (typically 256). Each triple is three numbers that determine the amount of each of red, green and blue that should be displayed for this selected colour. The bit pattern held in the bitmap for each pixel is then used as an index into this table. The display convertor uses this index to look up the particular RGB triple, retrieves the amounts of R, G and B and sets the appropriate pixel to these. In this way, each pixel requires only one byte in the bitmap, reducing memory and bus loading by a factor of four. Note that with CLUT, the display convertor must be sent a table of colours before any display can occur; this table of colours is called a Palette. CLUT limits the number of colours that can occur on a screen to the number of entries in the table. Usually this is 256, which is more than enough for most purposes, such as the desktop and web browsing, though other numbers are available; the Amiga allowed any number of colours from 2 to 256 and even a mix of these. For high quality photographic images, however, 256 is not enough. Direct RGB allows 16 million different colours because, in principle, each pixel could emit any colour. One might think that with modern fast electronics there is no need for CLUT, and that all should be done via direct RGB. This is not true, because CLUT has certain important properties:
» Good for diagrams rather than photographs. In diagrams, often the exact colours do not matter; what is important is that they should be easily distinguished.
With direct RGB it is often all too easy to get blurring of colours.
» Good for visual impairments. Suppose you are red-green colour blind: in a diagram that has some red and some green, you can reset the palette so that the red entries are purple instead, and then distinguish them more easily.
» Good for changing colours. By changing the RGB triple in one entry in the CLUT, you immediately change the colour of all pixels of that entry. This is useful in certain types of animation of diagrams, such as colour cycling animation. Under direct RGB, you would have to find each and every pixel and change its colour separately, which is much slower.
We return to the visual output channel later. In the meantime, we must consider input devices.

2.6.4 Input Devices

The data flow from input devices is much slower, and can often be transferred directly to the CPU (or some similar processor) as signals, without using the bus. The mouse, for example, sends signals about how it is moving, at a rate of the order of a hundred bits or bytes per second. The touch pad is similar. The keyboard is even slower. Input from the microphone is somewhat faster, typically between 10,000 and 50,000 bytes per second - which is still far slower than the output convertors. Here are a number of devices, with what the user does to interact with them (organic aspect), the type of signal each sends - whether a single bit, a byte, several bytes, etc. (psychic aspect) - and, anticipating the analytic aspect, what the signals encode, which differs for each device:
» Switches: a signal of one bit is sent every time a switch is thrown.
» Paper tape: as each row on the tape passes the reading heads, a bit pattern, one byte in size, is sent to the CPU: a byte stream.
» Cards: as each card is read, a set number of bit patterns (typically 80 or 132) is sent to the CPU.
» Keyboard: when a key is pressed, a single bit pattern, usually a single byte, is sent to the CPU, a different pattern for each key.
» Joystick: when the joystick is pushed in any of 8 different directions, a signal indicating the direction is sent to the CPU.
» Mouse: as the mouse is moved, pairs of bit patterns are sent to the CPU, one indicating the amount of movement left and right, and one indicating the amount of movement to and away. When a mouse button is pressed, a one-bit signal is also sent.
» Trackball: as mouse.
» Touch pad: as the finger moves across the touch pad, pairs of bit patterns are sent to the CPU, as for the mouse and trackball.
» Microphone: bytes are sent regularly (typically 20,000 per second), each byte containing a bit pattern that indicates the amplitude of the waveform each 20,000th of a second (or however fast). This rate is called the sampling speed.
» Camera: an entire bitmap, similar to that of output screens, is sent to the computer, usually direct to memory rather than via the CPU.
(Direct nerve connections are not discussed here.)

2.6.5 Input Device Signals

The signals from the input devices can be used to input information to the computer. So, for example, the pairs of bit patterns from a mouse are accumulated continuously by the operating system, to give a continuous indication of where the mouse is. As each pair is received, the View is changed so that the mouse cursor is seen in a different position. In this way, the mouse cursor follows the movement of the mouse. Likewise, whenever a bit pattern is received from the keyboard, it is converted to the bit pattern for the character on that key (when the K key is pressed, for example, the resulting bit pattern is usually either that for 'k' or for 'K' in ANSI code or Unicode). Then the appropriate character is added to the View screen bitmap, and this gets displayed. It is not only the bit patterns and signals themselves that produce effects. So can the following:
» Time between events (e.g.
this differentiates click and drag)
» Order in which signals occur (e.g. left mouse button (LMB) down, mouse move, LMB up is a drag, but LMB down, LMB up, mouse move is a click followed by an arbitrary movement)
In addition, the controller keeps a record of the current mouse position and also a list (history) of recent positions with their time stamps. In this way, it can analyse various complex gestures. This is important in pen-driven computers, where the user 'writes' letters on the screen and these are recognised. Issues that concern such actions include:
» Max time (down-up) for click
» Max time (up-down) for double-click
» How often mouse coordinates arrive
» How often a key repeats if held down
» Visible and audible feedback of each action, e.g.:
» Mouse cursor moves with mouse
» Audible click on key presses
» Complex interactions between these

2.6.6 Output Channels: Strengths and Weaknesses

The three output channels have various strengths and weaknesses:
» The visual channel gives us colour and a very sophisticated sensory access to the spatial aspect, because the retina of the eye is two-dimensional, and also to the kinematic aspect, since the eye is very sensitive to visual changes. This allows great precision. It offers a high bandwidth (high rate of information flow) because the way our eyes process information is highly parallel (a lot of operations at one time). For this reason it is the main output channel for IS. But its main disadvantage is that it is directional - you need to be looking at the UI.
» The aural (audio) channel of sound is not directional: you can hear sounds that come from any direction. So the aural channel can grab the user's attention, using distinct sounds, and different distinct sounds can convey meaning. This is especially useful for telling the user there has been an error or some unusual event. Often users respond better to (short) sounds, e.g. sound bites of comments by satisfied customers.
Sound is especially useful when used in conjunction with the visual channel, because it can convey emotion and cultural associations well, and also the passing of time. Sound can even convey some spatial awareness around the user, making them feel more 'inside' what is going on, which is useful for virtual reality. But there are several disadvantages of sound. Sound is transient: present with us at the time we experience it, but after that it is gone. Recalling it requires an exercise of memory, and for most people audio is not as memorable as visual media. One answer is to install a replay facility. Files of sound samples can be large (though music files can be smaller, since they are made up of instructions, e.g. MIDI, SoundTracker). Also, sound output is a nuisance in an open plan environment! It is imprecise, especially when there is a lot of other sound around.
» The haptic channel depends on our touching the activator. Recent technology includes force-feedback joysticks - and the vibrator in mobile phones. This channel is much more limited than the other two in terms of the information that can be transmitted through it. It is usually used, at present, to simulate or duplicate physical phenomena, maybe amplified or modified in some way. e.g. a robot in a hazardous environment: when it hits a wall, its controller feels an impact. e.g. a virtual surgeon probing an orifice: the user feels the constriction of the orifice.
And, of course, for some people, like the blind or deaf, certain channels are not available. The most fruitful strategy is to use the channels together, sound and perhaps touch providing an extra channel of information to assist the visual. Studies have shown that using sound and video together increases learning when the sound is closely related to the visual (e.g. when text is read out loud), but greatly reduces learning when the sound is not related (e.g. in adverts on websites). Be careful, therefore.
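To see why files of sound samples get large, here is a back-of-envelope sketch (a hypothetical calculation, assuming one byte per sample and the 20,000-samples-per-second rate used elsewhere in this chapter):

```python
# Uncompressed sample storage grows linearly with duration, rate and channels.
def sample_file_bytes(seconds, sampling_rate=20000, bytes_per_sample=1, channels=1):
    """Bytes needed to store `seconds` of uncompressed audio samples."""
    return seconds * sampling_rate * bytes_per_sample * channels

print(sample_file_bytes(60))              # one minute, mono: 1200000 bytes
print(sample_file_bytes(60, channels=4))  # one minute, quadraphonic: 4800000 bytes
```

A one-minute instruction file (e.g. a MIDI score) would typically be only a few thousand bytes, which is why instruction-based music files are so much smaller than sample files.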
Each channel offers a number of features that are meaningful in the psychic aspect, and which can be used to convey information. Note that the lists below are not full and complete: you should try to add other factors.

2.6.7 The Visual Output Channel

The visual output channel (screen) enables us to see information selected from the Model in visual form. This information is conveyed by means of phenomena we can see, including:
» colours
» shapes
» spatial arrangements such as distance, length, angle
» movements and other changes in the visual field.
Since the visual channel works by setting a bitmap of pixels to various colours, it can offer a wide range of visual phenomena, some purely sensory, some helping us to sense the spatial aspect, and some helping us to sense the kinematic aspect. Colours, shapes, spatial arrangements etc. are all seen by virtue of rendering the appropriate pixel cells with appropriate bit patterns to set the pixel colours. Movement and other changes are seen when two or more different bitmaps are downloaded by the display convertor, one after the other. This gives animation or flashing or changing colours. The eye is excellent at receiving a huge amount of such information simultaneously. This is made possible by the parallel processing that occurs in the nerves of the eye and the visual cortex of the brain, and the tendency to detect and recognise patterns that have been learned. Learned patterns are those that we recognise without thinking, and include, for example:
» basic shapes such as short vertical line, cross, circle, etc.,
» the shapes of letters and digits we learned in our early days,
» the basic shape and features of the human face,
» basic outlines of animals, trees, etc.,
» the green of vegetation, blue of sky, grey of clouds, dull colours of buildings, etc.,
» the flickering of flames,
» the three circles of red, amber, green of traffic lights,
» and so on.
How the neurones of the brain remember these is called 'long-term memory', and can be read about in psychology textbooks.

2.6.8 Basic Visual Phenomena

The visual output channel (screen) can show a variety of visual phenomena that can carry information. Here we list them, so as to reference them later. They fall into three main groups.
# Purely sensory visual phenomena, to do with sight and colour:
» hue (red, brown, green, etc.)
» saturation (how much white is mixed with the colour, e.g. pink is red with some white)
» brightness (also called value)
» texture (patterns of colour, such as grids)
» colour shading
» background and overall colour scheme
» what colours are available (called the 'palette')
# Visual phenomena serving the spatial aspect: the eye sees whole shapes, which are made up of groups of pixels all of the same or a related colour. For example, if the same 100 pixels in each of the first 50 rows emit red light, then the user sees a red rectangle. With this in mind, the arrangement of pixels of certain colours can make the user see the following:
» shapes of any type, some of which we might recognise (such as the shapes of the letters of the alphabet, i.e. a font)
» size or length of shapes
» position of shapes
» distance between shapes
» orientation of shapes
» angle subtended by shapes, especially lines
» spatial alignment of shapes, e.g. vertically above each other
» shapes touching or connecting with each other
» spatial patterns like surrounding or overlapping
» perspective (smaller shapes seeming more distant)
» background
» multi-colour scenes like photographic images
# Visual phenomena serving the kinematic aspect: the computer alters the colours of pixels in a precisely timed way that gives the impression of movement or change.
» Movement of objects across the screen
» Speed
» Direction
» Relative motion
» Changes in size
» Flowing motion, field motion
» Colour changes, flashings
» Morphing from one shape to another

2.6.9 The Aural Output Channel

The aural channel works by setting up waveforms in the computer's memory, which are sent to the loudspeakers by means of specially designed electronics. Each sound starts, is sustained for a time, then fades away. In musical instruments the start is called the 'attack', and the fading, 'decay'.
» Sounds in general have the following qualities:
» Volume of sound
» Pitch of sound (high, low; which note is played in music)
» Quality of sound, such as whether pure or fuzzy; vibrato
» Rates of attack (build-up) and decay (fading)
» For music, or any combination of sounds in sequence, we have in addition to the above:
» Chords, discord
» Melody
» Rhythm
» Tempo
» For speech, which is a sequence of phonemes (the basic sounds of speech), we have the following qualities (see below):
» accents
» gender
» emphasis
» rising and falling of the pitch and volume
» rhythm

2.6.10 The Haptic Output Channel

The haptic channel works by controlling a mechanical activator in contact with the skin. It offers:
» The feeling of resistance, e.g. as the user tries to move a joystick
» An impulse (kick) given by the computer
» Vibrations of various types
» 'Texture' of surface (e.g. sandpaper, cloth, shiny metal)
» A feel of pressure, springiness, softness, etc.

2.6.11 Making up the View

At the psychic level the challenge that HCI designers have is to make all output realistic. One important challenge lies in bringing sound and vision together. The main issue at the psychic aspect is 'lip sync': the sound of speech must be accurately aligned with the movement of the speaker's lips if these are seen (or the sound of a hammer hitting an anvil must occur at exactly the same time as we see the hammer hitting it).
If this is misaligned by even a few milliseconds, we notice it and feel uncomfortable. It is very difficult for computer systems to get the alignment so precise, because the screen typically refreshes no faster than every 20 milliseconds. Another challenge is found in haptic output. A 'dataglove' can, in principle, make the hand that wears it feel anything. For example, to make the hand feel it is grasping a stick, gentle force would be exerted on the inner flesh of all fingers. Several difficulties arise. One is that if the fingers are in fact straight, then the user will not believe they are grasping a stick, so the haptic output must take account of haptic input indicating the current position of the fingers. Another, similar difficulty is that if, say, only two fingers are curled, then the stick-grasping force should be applied only to those fingers that are curled.

2.6.12 Some Experiments and Theory Related to the Psychic Aspect of HCI

It is the field of stimulus-response psychology that has provided experimental results and theories to help us understand our psychic functioning more precisely. It investigates our ability to detect and recognise patterns (visual, aural, etc.), to remember patterns, and the time responses we have. One important result is Fitts' Law [Fitts 1954]; see also Eberts [1994, p.175]. How fast can a human being respond? A participant sits in front of a blank screen, controlling a mouse or some similar device. At a random time, in a random place on screen, a shape of random size appears. The participant must move to hit it as fast as they can. The time to hit correctly is measured (in milliseconds). Time to hit is found to be made up of the following components:
Time = k1 + k2 * log( 2D / Size )
D = distance between where the participant is aiming (e.g. mouse cursor) and where the target appears.
Size = diameter of the target.
k1 = a constant, which is a kind of minimum time (e.g.
if a large target appears right where the cursor currently is), due to the participant's speed of noticing it and pressing a finger. k1 is different for each person.
k2 = a constant, different for each person, showing how much distance and size affect them.
'log' refers to the logarithmic function, here in base 2: log(1) = 0, log(2) = 1, log(4) = 2, and so on.
Implications for HCI: this can be useful in very dynamic, fast-moving interfaces, such as in computer games or simulations. For example, suppose, in a battle-terrain type of computer game, you are flying low in an aircraft over a mountainous terrain. As you come up over a ridge, you have to spot enemy locations and fire at them before they fire at you. Fitts' Law would tell the game designer that if the target is large and near where your gun is already aimed, then the player will be able to aim faster and more easily, but if the target is small and not where the player is currently aiming, it will take longer. So, in earlier, easy levels of the game, it is sensible for the computer to present large targets near the present aim, but on later, harder levels, to present smaller targets always where you are not aiming. A good game is challenging but not impossible, so the game designer needs to know how large the target needs to be and where to place it. Fitts' Law can help her/him work this out. This can be combined with effects that are understood under the analytic aspect, such as the number of things a person can be aware of at once: the more enemies that appear, the more difficult the level. See below.

2.7 Awareness of Important Things (Analytic Functioning at the UI)

From the vantage point of the analytic aspect of HCI, we see the HCI and UI in terms of basic pieces of data, such as numbers, entities and words.
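As an aside, the Fitts' Law prediction discussed in the previous section can be sketched numerically. This uses the conventional base-2 ('Shannon') formulation; the constants k1 and k2 below are made-up illustrative values, since in reality they differ from person to person:

```python
import math

# Illustrative Fitts' Law calculator; k1, k2 are invented example constants.
def fitts_time_ms(distance, size, k1=200.0, k2=100.0):
    """Predicted time in ms to hit a target of diameter `size` at `distance`."""
    return k1 + k2 * math.log2(2 * distance / size)

# A large target near the current aim is predicted to be hit faster
# than a small target far from it:
print(fitts_time_ms(distance=50, size=100))  # 200.0 ms
print(fitts_time_ms(distance=500, size=10))  # about 864 ms
```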
We are concerned with three main issues here:
» Distinction: how easily the user can distinguish what is important among what they receive psychically from what is less important;
» Attention: the user's awareness of, or focus upon, what is important;
» Conceptualisation of it as various types of data.
This can be with any of the channels: visual output, sound output, haptic output, or motor input. First we look at the distinguishing of basic pieces of data by the visual and aural channels, in text and speech. Then we look at attention and focus. Then we look at the basic types of data as such and the notion of 'affordance', which recognises that certain psychic phenomena carry certain types of information better than others.

2.7.1 Text: Fonts, Letters and Digits

Fonts are basic shapes used to express letters, digits and other characters via the visual channel (printing or screen). For a font to work well, it must be very easy to distinguish what letter or other character it refers to. There are two important characteristics, especially if any of the users might be visually impaired - even slightly so. The first is that with lower case it is easier to distinguish words than with upper case, because the outline-shape differs when using lower case:
cat dog
whereas with upper case, the outline-shape is the same, just a rectangle:
CAT DOG
The second important thing is the actual shape of the letters or digits and how easily they can be distinguished from each other. Compare the following fonts to see how easily you can distinguish various digits from 8 (the ones before the break should be more difficult in some fonts because those digits are slight modifications of 8).
8283858689808 8184878
8283858689808 8184878
8283858689808 8184878
Notice how in the third font it is slightly easier to differentiate the 9 and 6 from the 8, despite it being smaller than the first, because the 9 has a descender and the 6 points upwards rather than turning down.
Arial font (not shown above) is particularly bad for digits - which is a pity because it is the standard one for spreadsheets.

2.7.2 Distinctions in Speech: Phonemes and Words

Speech is composed from individual waveforms called phonemes - the very basic sounds of speech. To speak text involves at least two stages: converting the text to suitable phonemes, then rendering the sequence of phonemes into the sound output buffer. Algorithms are available for the first step, which converts text into a string involving the International Phonetic Alphabet. Here is some of the IPA:
» IY - vowel as in beet
» EH - vowel as in bet
» IH - vowel as in bit
» N - consonant as in men
» NX - consonant as in sing
» EY - diphthong as in made
» OY - diphthong as in boil
Converting text to IPA is not as easy as it sounds, because it involves knowing how each word is to be pronounced - for example the word 'lead' can be pronounced 'leed' or 'ledd'; which is correct must often be worked out from the context, and often the algorithm makes the wrong choice. The challenges include:
» Finding the correct consonant
» Finding the correct vowel or diphthong
» Adding appropriate stress to syllables
» Altering intonation
» Controlling the pitch and speed of speaking, and how they vary at various points in the sentence; e.g. pitch or speed is often lowered at the end of a sentence and raised at the end of a question
» Maximising intelligibility: for example polysyllabic words like 'enormous' are often more intelligible than monosyllabic words like 'huge'.
» and much more.
The second stage involves converting the phonemes to actual sound (or rather a waveform that is converted to sound via the hardware). One way is to use a method like that used for music: use samples of individual phonemes.
But this has its own challenges:
» Merging the phonemes together in a way that is smooth
» Keeping the appropriate gaps between words
» Differentiating male from female voices, and also accents
» Altering the notes for stress and intonation
» Raising and lowering pitch at the end of a sentence, etc.
» Raising and lowering speed
» and so on.
Computer speech and music are an area where exciting advances can still be made; for example, there is very little attempt at computer singing!

2.7.3 Attention and Focus

Miller [1956; see also Eberts 1994, p.169 ff] investigated how many things we can be aware of at any one time. In a paper entitled 'The Magic Number Seven, Plus or Minus Two' he found the answer: around 7 - some people can only be aware of 5, others up to 9. If more things are in the visual field then we suffer a kind of information overload and just don't notice some of them. This can explain why fiddling with the radio while driving can cause accidents. Of course, we usually have many more than 9 things in our visual field, so how do we cope? The answer is 'chunking': we group things into chunks. For example, we see a group of children playing by the side of the road: that is one chunk - until one of them starts running across the road, when that one becomes a chunk in its own right. A chunk is when we recognise that a number of things 'belong together', and so we do not distinguish between them, but distinguish the thing that these make up from the rest. There are several challenges for HCI, user interface and multimedia. Miller's result suggests: don't require the user to be aware of more than five things at once; design your UI accordingly. Design your graphics presentation with a maximum of five lines of text on screen. Design your multimedia to present only five main visual effects. Design your web page so that there are five main visually distinct areas on the page.
Group things that the user will see as of similar meaning: make it easy and natural for the user to chunk things that they should see as one group, for example by making the things in a group a similar shape, size and colour and placing them next to each other. Of course, in games, these guidelines might be reversed, especially for advanced levels where the player needs to be challenged. For the earlier example of a battle-terrain computer game - where the player is flying low in an aircraft over mountainous terrain and, coming up over a ridge, has to spot enemy locations and fire at them before the enemies fire back - Miller's result would tell the games designer to ensure that at the easier levels there are no more than five things of importance on screen (trees, buildings, enemies, etc.), but at higher levels to make sure there are more than nine. At medium levels, let there be lots of things but mostly of the same type, so the player learns to chunk them, before progressing to higher levels. Results like Miller's introduce the issue of clutter. Clutter is where the number of things is just too large and the user cannot find any means of chunking them. That is, there is no obvious way in which the things can be meaningfully grouped together. Inter-channel interference. There have also been experiments on how different channels interfere with each other. For example, movement attracts attention. So does sound. So if a web page, for example, has an animated advertisement, it will keep attracting your attention - and get annoying. Sound can also interfere with, or support, what is being read.
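As a toy illustration of designing for chunking, the sketch below groups hypothetical screen items by a shared property (here colour), so that the user perceives a few chunks rather than many separate items; the item names are invented for illustration:

```python
from collections import defaultdict

def chunk_items(items):
    """Group (name, colour) screen items by colour, one chunk per colour."""
    chunks = defaultdict(list)
    for name, colour in items:
        chunks[colour].append(name)
    return dict(chunks)

screen_items = [("save", "blue"), ("load", "blue"), ("play", "green"),
                ("stop", "green"), ("help", "blue")]
print(len(chunk_items(screen_items)))  # 2 chunks instead of 5 separate items
```

Giving the items in each group a similar shape, size and colour, and placing them together, is what lets the user's eye do this grouping without conscious effort.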
At the user interface, the sensory-psychic functioning of the user usually carries some types of data. Different types of psychic functioning 'afford' different abilities to express these. Affordance is covered in Chapter VI.

2.8 Help With Achieving Your Purpose; Structure and Relationships (Formative Aspect of HCI)

The formative aspect is concerned with formative power, with the shaping or construction or structure of things, with putting things together rather than just leaving them in a random pile. We look at the structure of the UI, then of the data itself, then at modification of data.

2.8.1 Structure of the User Interface Itself

The layout on screen is not (usually) random, but is structured (formed) in a way that helps the user understand the symbols on it. For example:
» the middle of the screen is where the main information occurs
» the edges of the screen are where the navigation and other general information occur
» the top of the screen is where the title occurs
Within the main information itself we can also find structure, for example:
» tables show similarity of things down the columns and across the rows
» bullet lists show a collection of things that are different in a certain way
» text is structured into linear sentences that obey a certain syntax,
» ... and these sentences are visually structured by wrapping the text into paragraphs
» box-and-arrows diagrams are structured such that each arrow must start and end at an item
» ... and so on.
The structure of the layout is very important in helping us understand the meaningful content of the screen. In this way the formative aspect serves the lingual aspect. In designing the structure of the screen the developer considers what the user might want to achieve (thus considering the formative aspect of the HLC of the user) and tries to match the spatial structure on screen to that requirement.
For example, on a web page the navigation bullets are usually kept separate from the text, on the right or left side, so the user can see them easily and know where they are. Sometimes they are in the middle of the text, e.g. at the foot of every section; this makes them very convenient. {*** Think: Imagine what it would be like if your screen (e.g. on your mobile phone) had no structure. Try to work out *why* each piece of information is where it is. ***} Sound also has structure, though it is mainly a linear one. This is why good syntax of speech is so important: without it we would get confused. Music, too, has structure.

2.8.3 Styles of User Action

Style is perhaps one of the most misunderstood issues of the early Nineties, and the one that has the most hype surrounding it. Everybody nowadays is rushing to adopt WIMP, GUI, etc., and Microsoft made a killing in the late 1980s out of this tendency to jump on bandwagons. Our aim here is to get beneath the surface and understand the issues involved. There are a number of common styles of dialogue:
# Commands, in which the user types in commands and supplies various parameters to guide their execution. The Command style of dialogue is the oldest in interactive computing, and perhaps the most flexible.
# Menus or Toolbars, in which the commands or objects are selected from menus rather than identified by name.
# Question-and-Answer, in which the computer asks a question and the user supplies the answer, repeatedly; e.g. 'Are you sure?' on deleting something.
# Form filling, in which the computer puts a form up on the screen with a number of spaces which the user fills in.
# Direct Manipulation (DM), in which the user selects objects (usually with the mouse) and identifies commands by graphical movement, such as drag-n-drop.
# Control Panel, in which the computer supplies what looks like a control panel with knobs, etc., and the user identifies what needs to happen by hitting these with the mouse pointer.
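As a minimal sketch of the Command style, a dialogue loop might split a typed line into a command name and its parameters as below; the command and file names are invented, not taken from any real system:

```python
def parse_command(line):
    """Split a typed command line into (command, parameters)."""
    parts = line.split()
    if not parts:
        return None, []          # empty line: no command
    return parts[0].lower(), parts[1:]

cmd, params = parse_command("COPY report.txt backup.txt")
print(cmd)     # copy
print(params)  # ['report.txt', 'backup.txt']
```

Note how the style's flexibility comes from the parameters: any value the user can type can be passed, which is precisely the high value and item freedom discussed below in this section.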
The DM style usually assumes an Object-Oriented structure, since it is based around the idea of direct operations on selected objects. In this lecture we will look at the Command and DM styles in more depth.

{*** You should read chapter 13 of Preece. ***}

Locus of control (refce, 19) refers to whether the user or the computer is in control of the interaction activity, that is, whether the user or the computer takes the initiative for each action. In the middle we have what have been called Mixed Initiative systems. Here we examine the issues involved. 'Locus of control' is not a good term, since the idea of control is also present in, for instance, safety and security of data, and that is a different concern. (That is, who has control over - or access rights to - certain data: it is important sometimes that the user does not have such access rights. Examples include confidential data, keys in data tables, and data inside the operating system. But this is NOT what we refer to when we speak of 'locus of control'.) Therefore, we speak rather of a spectrum of freedom - whether the user or the system has certain types of freedom in the activity at the user interface. There are several aspects of such freedom:
» Action freedom - to determine what happens next
» Value freedom - in the range of data values to enter
» Item freedom - in the range of types of item to attend to
The various dialogue styles vary in these freedoms:

Command style:
  Action freedom: High (as wide as the commands available)
  Value freedom: High (as wide as any value that can be expressed)
  Item freedom: High (as wide as any item that can be indicated)

Menu style:
  Action freedom: Only whether or not to select from the menu, plus any actions present on the menu itself
  Value freedom: Restricted to those offered on the menu
  Item freedom: As value freedom

Toolbar style: As menu. 
Question and answer style:
  Action freedom: None - must answer the question presented (though many question panels also have other action buttons, e.g. for help)
  Value freedom: Little; as dictated by the question
  Item freedom: Usually none; cannot select a different question

Form style (a set of questions to answer):
  Action freedom: As question and answer style
  Value freedom: As question and answer style
  Item freedom: Little: can select which value(s) to enter

Direct manipulation style:
  Action freedom: Limited to the types of action available
  Value freedom: As wide as continuous spatial movement
  Item freedom: Can select and act on any items seen.

Now, when should each type of freedom be allowed to the user? In generic packages, e.g. word processors and drawing packages, the user should have total freedom in all three. For instance, in a word processor, s/he should have action freedom to decide whether to type, erase, split paragraphs, search, cut, paste, save, load, etc. S/he should have value freedom to determine what font size, what colour, what emphasis, etc. to give to text. S/he should have item freedom to decide which text to add to or alter, etc. But this is not appropriate in all software. There are four main reasons for restricting the freedom of the user:
a) When the user lacks knowledge of the domain or application covered by the software. e.g. in CBT and other training packages, it is usually necessary to restrict the order in which the user goes through the material (item freedom regarding which topics to learn about next). In installation scripts it is usually necessary to restrict where files can be placed (item and value freedom).
b) When there is danger. In many cases, if the user were allowed full freedom then things would go wrong. For instance, in an installation script, all the steps of installation must be completed, and in the correct order, so the action freedom of the user is normally severely limited to those actions necessary for installation. 
c) When the user requires a feeling of security. The novice user especially requires a helping hand and the feeling of security, or rather of being able to orientate themselves in 'surroundings' that are easily understandable - and one way of achieving this is to limit the number of options available in those 'surroundings'.
d) When more elaborate help is needed. If the range of actions a user can take is wide, and the user asks for help, then only a small amount of help can be given about each option. If more elaborate help is needed then it might be appropriate to limit the range of options available and give more help on each.

2.8.4 Challenges for UI, HCI and MM

Psychology has researched people's ability to carry out tasks, and the ease with which this might be done. Eberts [1994, p.171] describes 'The Cooker Experiment'. A cooker with four burners is shown on screen, along with knobs that control them, with suitable labels. Three or more layouts of burners and knobs are tried. Participants sit in front of the screen, which is a touch-screen, and suddenly one of the burners or knobs lights up. The participant must then touch the associated knob or burner as fast as possible. Times are measured. It was found that if the knobs are laid out similarly to the burners (e.g. knobs and burners in a square, or knobs in a row with burners slightly askew) then response is much faster than with the usual arrangement, with burners in a square but knobs in a line in front of them. This experiment is an example that covers two aspects. The main one, the type of task being researched, is the formative aspect of achieving things (controlling cooker plates). But important in this is that the psychic aspect of pattern-recognition can greatly assist the formative functioning. Without such direct assistance from the psychic aspect, the user has to function fully in the analytic aspect to conceptualise what is in front of them, and the formative aspect of working out what to do. 
Of course, the row-of-knobs arrangement can be learned, but it takes longer to learn because there is no help from the psychic aspect. Implications for HCI: ensure the controls for a device match the various parts of the device.

2.8.5 Anticipating the Lingual Aspect

The formative aspect of HCI is to do with how the user can achieve what they want. What they want to achieve is related to the information content that is carried, which brings us to the lingual aspect. There are, however, two ways of achieving: distal and proximal. These are two different types of relationship that the user has with the computer. Donald Norman [1990] said: "The problem with the user interface is that it is an interface. Interfaces get in the way. I don't want to focus my energies on an interface. I want to focus on the job." Distal HCI is when we have to focus on the interface. We have to be aware of the interface itself, and plan what we do. This fully involves the analytic and formative aspects. By contrast, proximal HCI is when we do not have to be aware of the interface, nor do we have to plan what to do, because we have become so used to it that we can operate it and engage with it almost without thinking about the interface. The analytic activity of awareness and the formative activity of planning have been so well learned that they have become tacit. Michael Polanyi [1967] discussed the difference between these. Considering the formative and analytic aspects of HCI focuses on the interface rather than on 'the job'. By 'the job', Norman meant the meaning that is represented via the symbols and their structures. As we consider the lingual aspect, we will be considering 'the job'. We will also find it links with ERM.

3. UNDERSTANDING THE CONTENT VIA THE USER INTERFACE - THE LINGUAL ASPECT OF HCI

The lingual aspect of the interaction between the human and the computer concerns the 'signification' of the structured information that is seen, heard, felt or input. 
» We see numbers arranged on a screen: what do the numbers tell us? That is their lingual aspect.
» We see text arranged on the screen: what is it about? That is its lingual aspect.
» We hear speech: what does it tell us? That is its lingual aspect.
» In a haptic UI, we feel a kick: what does it mean? That is its lingual aspect.
The lingual aspect is concerned with the meaning of information rather than its structure or what data types are used. When we consider the lingual aspect of HCI, we focus on the content rather than the technology. When we consider the lingual aspect of multimedia, we focus on 'what it says' rather than on 'what it looks like'. The lingual aspect of HCI links with ERM (Engagement with Represented Meaning). It is the very purpose of most HCI, and hence is the qualifying aspect of HCI. This is why it is in a separate section.

3.1 Foundational Dependency on Earlier Aspects

The foundational aspects serve the lingual aspect as follows, first for output:
» Organic/biotic aspect: Hardware devices that allow the user to see, hear or feel information; examples: screen, speakers, force-feedback joystick. Or even direct electrical connection to the user's nerves.
» Psychic aspect: The user sees shapes, colours, etc., hears sounds, feels kicks etc. which the computer generates.
» Analytic aspect: The user distinguishes what is meaningful (as a symbol that carries information) from what is not, and conceptualises the meaningful ones as certain types of information, such as quantities, items, qualities, etc.
» Formative aspect: The user relates these pieces of information together and processes them.
And for input:
» Organic/biotic aspect: Hardware devices that allow the user to give information to or get information from the computer
» Psychic aspect: Movements of input devices; sensing of output device signals
» Analytic aspect: The types of information that these signals represent
» Formative aspect: What the user wants to achieve in giving input. 
3.2 Information, Illustration and Decoration

Not everything that comes through the visual, aural or haptic channels has information content: some is purely for decoration. Decoration functions in the psychic and aesthetic aspects but not very much in the lingual aspect of content. It is useful to differentiate three main purposes for pieces of HCI: information, illustration and decoration. Text and speech are almost always for information purposes, but graphics, animation, pictures, colour schemes, other sound, music and haptic feedback can be for all three. Think especially of a graphic alongside text to see the difference between them:

When used in an information role, the graphic conveys information of its own. A bar chart on screen is an example, as is a spoken sentence or a warning 'beep'. The information is usually not contained elsewhere, so the graphic or sound is essential. The graphic consists almost entirely of symbols.

When used in an illustration role, the picture is used to illustrate what other text is saying, in order to make the meaning of the text clearer. Illustrations often take the form of examples. Usually illustrations are not essential to the material being read, but can help to support it. Most of the slides that accompany these lecture notes are illustrative. The graphic has many symbols (SL) but can have some things that are only BL, e.g. a digitized picture or a video clip which the user can interpret.

When used in a decoration role, the picture or sound has very little information meaning of its own. An example of decorative graphics is found on the introductory screen of much software. A musical introduction is also decorative. Usually decorations are superfluous to the meaning of the document, and can be omitted without harm. The purpose of decoration is often to make the document aesthetically pleasing, to sell it, or to provide 'atmosphere'. The latter is especially important in games. 
(One could argue that atmospheric decoration actually provides information, if one stretches the definition of information, but we will not take that line here.) Decoration is entirely BL because (or when) it has no symbolic value. Of course, there is a spectrum or spread between the three; e.g. games music might give a little information about whether there are lots of enemies around. In most games, graphics and sound have important decorative roles, but decoration is perhaps less common in the stern world of business software. However, decoration can be important in helping to set a context or provide light relief. It is becoming more important in CBT (computer-based training). Much graphics, sound and animation has at least partly a decorative role, and we can expect to see much more decoration in future. In practice, information, illustration and decoration tend to overlap, so that a given visual or sound effect will sometimes fulfil all three roles, or at least two. In this module we will focus mainly on information and illustration, because decoration often has no symbolic content.

The key norm of the lingual aspect is understandability: so a good UI is one that makes it easy for the user to understand the meaning of the information. Language is important. In a text UI, the language in which the text is written is important. But languages can also be graphical; a diagram, too, has a 'language'. For example, in a bar chart the user needs to understand what the bars represent, what the two axes represent, why bars might be grouped together or have different textures, and so on. All lingual functioning at the UI requires the user to share the same 'language' as the UI designer. Otherwise there will be misunderstandings.

3.3 Link with ERM

It is the lingual functioning of HCI that is the primary link with ERM. 
It is by this functioning that the user understands what the symbols at the UI mean, and expresses their own meaning back to the computer. This is discussed at greater length in Chapter VI, §2.

3.4 Lingual Norms (Quality Criteria) for HCI

What makes a UI or human-computer interaction good (or bad) from the point of view of the lingual aspect? Largely, it is the same as for authorship of a book. The criteria include:
» What it means should be understandable.
» It should be truthful.
» It should be timely, up to date.
» It should be relevant.
» It should make sense, and have a 'logic' in it; this does not refer to formal logic, but rather that what is said 'flows' well.
And so on. There are other criteria, but they lead into the post-lingual aspects, as follows ...

4. POST-LINGUAL ASPECTS OF HCI AND UI

The post-lingual aspects serve the lingual by affecting its style and how well it functions with other people. (This is called anticipatory dependency in Dooyeweerd's philosophy. See Basden [2008, p.71] and "http://www.dooy.info/".)

4.1 Making HCI Work Across All Cultures (Social Aspect of HCI)

The social aspect is manifested in the effect that cultural expectations, connotations and assumptions have on the user's ability to understand what the IS is telling them. It is especially important on web pages, because anyone in the world might have created it or be reading it. There are a number of issues that should be borne in mind:
» Cultural connotations of words or phrases. Some are insults in one culture but are perfectly innocent in another.
» Idioms. An idiom is a phrase whose meaning cannot be derived from the meanings of its words. Imagine you are a child in a cold climate. "Were you born in a tunnel?", your mother remarks as you enter the room. You are tempted to reply, jokingly, "No, in a hospital", but you know what she means, and you turn back and shut the door. Tunnels are draughty places. 
But in other cultural contexts the apparent question would not mean the same thing. In Sweden the idiom uses 'church' rather than 'tunnel'. And you would only use this idiom in a family situation, never in a formal situation like a job interview.
» Jokes and humour. Different cultures find different things funny. Avoid in-jokes on a public website.
» Culture-specific words, phrases or references. For example, the in-phrases among Manchester United supporters, which others might not understand. High-register (intellectual) words are also like this: words specific to intellectual culture.
» Standards. Standards are rules that the social group has agreed should be followed. Standards exist for web accessibility, for example.
When your UI is a website, it is especially important to attend to this social aspect, because your readers might come from any culture in the world. Even when it is a piece of software, the same applies, because your users might come from any culture. On the other hand, if you are confident that only people of a certain culture will access your site or use your software, you can capitalise on their specialised cultural expectations and assumptions, and design it to give them better service. For example, software designed for chemists probably does not need to explain what most chemists would know.

4.2 Managing Interface Resources Efficiently (Economic Aspect of HCI)

The following things impose limitations on the HCI, which may be managed as resources. Hence they may be seen as the economic aspect of the HCI, though many of the limitations arise at the psychic and organic level. Each resource is of another aspect.
» Screen area; this limits the number of shapes that can be placed on screen - especially on a mobile phone. Because each pixel requires its own portion of the screen's electronics, the number of pixels on screen is limited. So, at the bit level we speak of, for example, 1280 by 1024 pixels, which is what we call its resolution. 
This is a spatial resource.
» Rendering speed (rendering is the process of making up the screen before it is displayed); this limits the speed at which animations can occur. Each rendering process involves calculations. Calculation is of the formative aspect, so rendering speed is a formative resource.
» Bus speed. The maximum rate of the computer's internal electronic bus, memory or CPU; high-resolution screens with many colours, and long sound samples, consume a lot of the bus bandwidth, thus limiting the speed at which the CPU can operate to process programs. The speed at which pixels can be sent to the screen is also limited by the speed at which the electronics can change the colour-state of each cell. So there is a maximum refresh rate for each type of screen. Typically this is 50 or 75 times per second for a full screen. PAL TV works at 50 times a second. This is a physical resource.
» Network speed. The speed at which two computers can communicate is also limited at the hardware level by the highest frequency at which the wired or wireless connection can operate - and this too limits the speed at which, for example, files can be downloaded from a network. This might seem to be of the physical aspect, but what is important is not the number of bits transferred per second but the number of pieces of data. So this is an analytic resource (pieces of data per unit time).
» Frequency range of the human ear; this limits the range of sounds that may be used. The highest frequency that the human ear can hear is (depending on age) from 5000 to 20000 Hz (cycles per second). This limits the useful frequency range for sound output. This is a psychic limitation.
» Maximum information rate of absorbing new information; this limits the speed at which the visual field can change and at which sounds can be made. There are three aspects to this limit. One is psychic: the eye and ear and their nerves have limits on how fast they can work. 
One is analytic: we are limited in how many pieces of data we can cope with at one time: 'The magical number seven, plus or minus two' [Miller, 1956].
» Input channel width; input channel width is limited. It is the number of different signals that can occur, e.g.
  » with a digital joystick there are 8 directions plus two buttons, giving 10 different signals;
  » with a mouse there are two buttons; this allows only 3 different signals (LMB, RMB, both together);
  » with a keyboard there are, say, 80 keys, and these can be modified by qualifiers like Shift, Ctrl and Alt, which can be used in combinations (2^3 = 8 modifier states), giving typically 640 (8 * 80) different signals.
All of these can be employed to allow the user to signal his/her intended actions to the controller. But usually only a tiny subset of them is actually used. There is growing interest in two-handed input, using mouse in one hand and keyboard or trackball in the other.
» Human impatience; this means that the user will not wait many seconds for what the computer is doing - e.g. download time - unless they know of a good reason why it should take a long time. Patience is probably a pistic matter (we get more impatient if we are arrogant) or an ethical, self-giving matter (we don't want to give others time).
All these are important in multimedia. Though each is a resource in a particular aspect, the fact that there is a limit is the economic aspect of HCI.

{*** As you use your computer or mobile phone, try to think of other things that are limited. Which aspect are they? ***}

The economic relationship between human and computer is not symmetric between input and output. The human is good at detecting and recognising visual and aural patterns, but the computer is poor at doing so. This means that the computer can generate speech and visual patterns and the user will usually know what is meant, but speech recognition by computer is hard. 
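Several of the limits listed in this section are simple arithmetic, and can be checked with a short sketch. The figures are those quoted above; the variable names are mine, and the three-qualifier keyboard count is an assumption consistent with the '8 * 80' figure in the text:

```python
# Spatial resource: pixels available on a 1280-by-1024 screen.
pixels = 1280 * 1024                # 1,310,720 pixels

# Physical resource: raw video bandwidth at 24 bits per pixel,
# refreshed 50 times per second (the PAL figure quoted above).
bits_per_second = pixels * 24 * 50  # about 1.57 gigabits per second

# Input channel width of a keyboard: 80 keys, each modifiable by
# three qualifier keys (assumed here to be Shift, Ctrl, Alt), any
# combination of which may be held down: 2**3 = 8 modifier states.
keyboard_signals = 80 * 2**3        # 640 distinct signals

# Input channel width of a two-button mouse: LMB, RMB, or both.
mouse_signals = 3

# Digital joystick: 8 directions plus 2 buttons.
joystick_signals = 8 + 2
```

Note the asymmetry this makes visible: output bandwidth is measured in gigabits per second, while the input channel width is a few hundred distinct signals at most.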
Similarly, the computer is good at reliably performing fast actions and doing calculations, while the human is slower, less reliable and limited at calculation. So the computer can be made to express much information at one time (e.g. on one screen) whereas the human user would take some time to express it all. Further, the human user has intentions while the computer (usually) does not.

4.3 Interesting, Enjoyable, Harmonious Interactions (Aesthetic Aspect of HCI)

The aesthetic aspect of HCI refers to the enjoyment, or otherwise, which the user experiences from the interaction itself. This is different from the enjoyment they experience from engaging with the meaning, or from life itself while using the IS (which are the aesthetic aspects of ERM and HLC). The aesthetic aspect of HCI covers harmony, fun, beauty, style, humour and interest. It is typically focused in the graphic and multimedia design of the interface.

Here are some examples of harmony:
» colour schemes that harmonise,
» layout that is balanced,
» animations that subtly contribute to the overall effect rather than distracting attention,
» sounds that go well with the visual UI.
Here are some examples of beauty, fun, humour, interest:
» backdrop: nice-looking, humorous or interesting
» the idea of using a 'paper clip' with a face to give advice
» nice-looking colours
» style of writing in text
» text tells an interesting story
But after a time, these things pall and get annoying. It is very tempting to focus on these for their own sake (especially graphic design), leading to one of those stunning user interfaces (e.g. web sites) that look good but do not give useful information. Remember: in HCI, all the aspects should so function as to serve the lingual; they should enhance the communication of information to and from the user. This is the major theme of that authority on graphic design, Edward Tufte [1990]. Beware: the HCI involves more than the main content. 
For example, consider a results page offered by a search engine. You have several tranches of content:
» the main content (a list of articles found from the search)
» helpful suggestions for what else you might try
» navigation buttons or links
» advertisements
» administrative information like contact details.
The main content might be aesthetically pleasing, but what about the overall effect? Designers and regular users of a UI tend to think about only the main content, but other users and occasional observers see the whole. So beware lest the adverts, for example, detract. In HCI that is good according to the aesthetic aspect, all these will harmonise: they will be relevant to each other. The most recent search engines try to make all this relevant to your original search - harmonising with it. Earlier ones did not, and the user would get annoyed by the disharmony between what they were searching for and the advertisements etc. What has been said about harmony among the content of a web page applies equally to the meaning-content of a computer game, a database, and so on.

{*** Exercise: Look for harmony in the meaning-content of a computer game (its gameplay), or a database. Look for harmony and fun in the graphics of the game. ***}

Humour of the HCI does not refer to humour found in the meaning-content, but to what it is about the user interface itself that makes you laugh. It is, unfortunately, rare. However, I encountered an example of it in the early Amiga operating system: if you pressed keys in certain combinations and sequences, you were rewarded by hidden messages appearing on the screen, such as "We designed the Amiga, but Commodore ***d it up". These were removed in the next version of the operating system! More useful humour could be in the placement or shape of buttons or menus, alluding to various things that are known in the culture, e.g. the Simpsons.

{*** Designers of UIs: humour in the HCI might be an opportunity to make your mark! 
***}

But beware: in some serious applications fun is not appropriate; this is a matter under the juridical aspect, below. But where fun or humour is inappropriate, you can still design an aesthetic UI by attending to harmony and interest and style. Many of the rules of art are applicable to HCI, especially the aphorism that, in art, 'less is more'. What this means is that the most aesthetically effective things are those that do not shout, but provide subtle effects. This is why, for example, animations or sounds that distract are bad aesthetically.

4.4 Doing Justice to Both User and Information (Juridical Aspect of HCI)

The juridical aspect is concerned with 'what is due', i.e. with what is appropriate and proportional. In HCI, it is concerned with
» what is due to users
» what is due to the information (though that extends into ERM).
The main way in which appropriateness to the user comes to the fore is when considering disabled users. For example:
» UIs in which it is impossible to enlarge the font are not giving visually handicapped users their due of larger text.
» For totally blind people, the UI should be able to speak its text, and describe anything else of importance.
Web Accessibility Guidelines are an attempt at giving disabled people their due. Appropriateness to the information's meaning-content also matters. For example, sad news of somebody's death should not be accompanied by a flashy advertisement. Relevance is important: what is placed on screen should be relevant, and the facilities made available should be appropriate to what the users might want to do.

4.5 Generosity and Courtesy of the Interface (Ethical Aspect of the HCI)

Ethicality here does not mean good or bad; it means generosity and self-giving. A UI that is good in the ethical aspect is one that is generous. This issue has not been widely studied, so we cannot say much about it, except for two things:
» Make facilities available to the user that will be helpful, but are beyond the basics required. 
» Have duplicate menu entries.
For example, my word processor has a menu to do with Blocks (what it calls selected text), which allows me not only to set, delete and copy blocks (the basics), but also to copy a block from another document (very useful, but quite common). An 'extra' that is surprisingly useful is the facility to count the words in a block! Also, there is a facility to save a block (useful for starting a new document) and for formatting the text in a block. But in this menu there is no facility to print a block. For that facility I have to move to the Print menu. And the facility to spell-check a block is in the Spell menu. It would be nicer (more generous) if the Print Block and Spell-check Block facilities were in the Block menu as well as in the other menus.

4.6 The Vision Behind the Interface (Faith Aspect of HCI)

The faith aspect refers to the 'vision' behind or underlying the design of the HCI and UI, and the beliefs and commitments of the user and designer. The expectations and assumptions of both user and designer are of the faith aspect, as is what the user or designer believes to be 'good' (or bad) or meaningful. These are usually taken for granted, but can deeply affect the quality of the HCI. They are usually social in nature: shared assumptions. For example: when you get used to one word processor, you find others difficult to use. For example: what are called 'holy wars' (note the faith-oriented language!) are waged between supporters of different platforms, such as Apple Mac, Linux and Amiga, and all denigrate the Windows platform; such denigration is partly of the faith aspect, partly of the ethical aspect (where it is dysfunctioning).

6. LINKS WITH OTHER IDEAS ABOUT HCI AND UI

6.1 Practical Guide to Usability

In their A Practical Guide to Usability Testing, Dumas and Redish [1999, p.4] define usability as: "Usability means that the people who use the product can do so quickly and easily to accomplish their own tasks. 
This definition rests on four points:
1. Usability means focusing on users.
2. People use products to be productive.
3. Users are busy people trying to accomplish tasks.
4. Users decide when a product is easy to use."
At first sight there is a laudable focus on the human being, but closer examination reveals a heavy emphasis on the economic aspect: 'product', 'quickly', 'tasks', 'use products to be productive', 'busy', 'accomplish tasks'. Many elements of it would thus be irrelevant for many types of IS, such as games, unless one distorts the meaning of words like 'productive'. On the other hand, many websites today are designed mainly to look stunning, often at the expense of usability; are the web designers wrong, right, or what? The proposal here is that neither Dumas and Redish's emphasis on the economic nor web designers' emphasis on the aesthetic is right or wrong in itself, because these are just two among many aspects. The shalom principle (which states that things work well when every aspect is upheld and given its due) applies to HCI, and absolutization of any aspect will jeopardise usability.

6.2 Winograd and Flores: Direct Engagement

Soon after it was published, Winograd and Flores' seminal work Understanding Computers and Cognition [1986] made a profound impression on this author. It is worth reading: it is very easy to read even though written at a philosophical level, and it says thought-provoking things in its beginning and end, though the middle of the book rather loses its way, in my opinion. The book did not so much create a new way of looking at computers in him as undergird and express what he had already felt and believed for over ten years. Moreover, though Polanyi's 'tacit dimension' [1967] was his mainstay at the time, Winograd and Flores also helped him understand the difference between distal and proximal user interfaces (discussed earlier) before he discovered Dooyeweerd's philosophy and the aspects. 
This text is based on some in the author's book 'Philosophical Frameworks for Understanding Information Systems', and has a philosophical feel. It is offered here so that students can delve deeper if they wish to. It continues in Appendix 2, which discusses W+F at a philosophical level, but most of it can hopefully be understood without a knowledge of philosophy.

Winograd and Flores (W+F) questioned the prevailing 'rationalistic' approach to computers, especially as found in AI, and suggested an approach based on Heidegger's existentialism, phenomenology, hermeneutics and language theory, all of which are types of philosophy. W+F, using Heidegger, challenged the way computers were understood in terms of the Cartesian subject-object relationship, as objects distal from, and operated upon by, humans. In place of this they offered the ideas of 'thrownness' and 'breakdowns', based on Heidegger's notion of being-in-the-world.

W+F's second challenge was to the assumption that cognition is the manipulation of knowledge of an objective world, and that we can hope to construct machines that exhibit intelligent behaviour (as AI hoped to do). Instead, using Maturana's notion of autopoiesis, they argued that cognition is an emergent property of biological evolution, that interpretation arises from cognition, and that computers themselves can never be made truly intelligent.

Their third challenge was to the assumptions that language is constituted in symbols with literal meanings, that such symbols can be assembled into a knowledge base, and that they are used within organisations as a means of transmitting information. Instead, in accord with Searle's speech act theory, the listener actively generates meaning, especially as a result of social interaction, and language is action, responsible for creating social structures, not just being used within them - this is now called the 'Language Action Perspective'. 
It is impossible, they argued, for computers to use language in the way humans do (even though they might process natural language). For more on this, and how Dooyeweerd's aspects fit into it, see my 2008 article with the late Heinz Klein, 'New Research Directions for Data and Knowledge Engineering: A Philosophy of Language Approach' (Data & Knowledge Engineering, 67(2008), pp. 260-285). W+F suggested 'A new foundation for design' of computer systems. The aim of AI, KBS and HCI should be redirected, away from attempts to make computers 'intelligent' or to support 'rationalistic problem-solving', towards building useful systems that are "aids in coping with the complex conversational structures generated within an organization" [p.12]. They continue, "The challenge posed here for design is not simply to create tools that accurately reflect existing domains, but to provide for the creation of new domains." This, they hope, will open the way to social progress and "an openness to new ways of being" [p.13]. They outline the design of a Coordinator system to support cooperative work. W+F's work is still avidly discussed, and even inspirational, 20 years later [Weigand, 2006]. It deserves to be, because it provides a framework for understanding three of the areas of research and practice (HUC, the nature of computers, and ISD) and touches on that of technological ecology. It is seen as a flagship of the Language-Action Perspective, which focuses on computer use in organisations and especially the use of language in changing them. 7. A Philosophical Look at HCI These aspects may be seen from either the user's or the computer's point of view, as shown in Fig. 1. Figure 1. Aspects of HCI from user's and computer's point of view 7.2 The Central (Qualifying) Aspect of HCI Which is the most important aspect of HCI? Answer: they all are, so that is not a very useful question. It is better to ask ... What is the main purpose of interacting with a computer?
Which is the central aspect of HCI? We want an answer that is valid whatever the application; in this way, the answer does not depend on ERM (what the information is about) or HLC (how our lives are affected by it). Answer: The main purpose of interacting with a computer is usually to gain and give information which the user can understand. So the central aspect of HCI must be the lingual. (NOTE: Identifying the central aspect is done by thinking about what a thing's main purpose or meaningfulness is. Dooyeweerd called it the qualifying aspect. Identifying the central, or qualifying, aspect simplifies a complex picture. The qualifying aspect is the one that is most important in giving a thing its meaning, its destiny in life, and by which we should judge whether it is good or bad at being that type of thing. For example, a law court is qualified by the juridical aspect, a business is qualified by the economic aspect, a pen is qualified by the lingual aspect, and so on. {*** If you want to find out more about qualifying aspects, see 'Philosophical Frameworks for Understanding Information Systems' [Basden, 2008], pp. 86, 132 ff. ***}) The qualifying aspect of HCI is the lingual for almost all information systems. This is because, regardless of application, the main things we experience are symbols on the screen (or heard from speakers) that signify something, and actions we make that signify what we want the computer to do. Note that this does not just mean text, but can be any channel; see below. The meaning is, of course, that which is represented in the computer, and if the HCI is of high quality then the user engages with this represented meaning. As will be discussed in greater depth in Chapter VI, the lingual aspect is the main link between HCI and ERM (Engaging with Represented Meaning). The lingual aspect 'reaches out' to all aspects of the meaningful content, to represent all types of meaning.
So, in most cases of computer use, the lingual aspect is the most important aspect of HCI. Our functioning in all the other aspects of HCI is mainly to serve the lingual functioning so that it is effective in expressing and interpreting meaning. (Very occasionally the lingual is not the most important, but such cases are rare. One example would be computer-controlled disco lighting; here the human's interaction with the computer is primarily psychic, and contains no symbolic meaning. But we will ignore such specialised applications here.) 7.3 Other Aspects Serve the Lingual The lingual aspect of HCI (or indeed of anything) cannot work well without all the other aspects, especially those that are its nearest neighbours. The pre-lingual aspects serve the lingual as follows: » The importance of the formative aspect in HCI lies in how it helps structure the information presented to the user. Think what it would be like if the information on screen were in random places, with no structure. » The importance of the analytic aspect in HCI lies in ensuring clarity in the information presented. Think of what it would be like if the information on screen were unclear. » The importance of the psychic aspect lies in ensuring that the user can see or hear what is presented. Think of what it would be like if text had the same colour as its background! The post-lingual aspects serve the lingual as follows: » The importance of the social aspect of HCI lies not in the social intercourse that occurs when driving the computer (such as children gathering round a games player, which is HLC), but in whether the user understands the cultural connotations of, or assumptions behind, what is shown on the screen (or heard through the speakers), and in the standardisation of things like user interface style.
» The importance of the economic aspect of HCI lies not in the cost of the building that Elsie calculated (which is ERM) but in such things as the effect of limited screen area: only a certain amount of information is visible. » The aesthetic aspect of HCI concerns how the harmony and artistic style of the UI helps users properly understand what the UI is presenting. » The juridical aspect concerns whether the UI does justice to the represented meaning, and so on. Dooyeweerd called this inter-aspect dependency, and it goes in two directions: foundational and anticipatory. The lingual aspect thus depends foundationally on the aspects earlier than it, especially the formative, the analytic and the psychic, and it anticipates the later aspects, especially the social, economic, aesthetic and juridical. Most of the aspects of HCI serve the lingual function of understanding what is presented via the UI and responding. 7.4 What is Good and Bad in HCI A norm is what is good, to be aimed for. Usability or ease of use, for example, is usually good and a thing to aim for. But what exactly is ease of use, and how can we evaluate it or design for it? It is now acknowledged to cover many factors, which can be understood multi-aspectually. Table 1 lists several normative factors under each aspect. Table 1. Aspects of usability These are some of the things by which we could judge the UI or HCI. But within each aspect you will find more factors if you need them. And you could add the ethical and pistic aspects if you wish. Sometimes there might seem to be conflict between such aspectual norms. For example, the juridical norm of appropriateness can make it difficult to standardise the style of UI [Basden, Brown, Tetlow and Hibberd, 1996]. One way to resolve this is to take into account the qualifying aspect of HCI. If, as suggested earlier, this is the lingual aspect, then its norms of conveying information, understandability and truth-telling should always be honoured.
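The idea behind Table 1 - judging a UI by normative factors in every aspect - can be recorded as a simple checklist. The Python sketch below is purely illustrative: the fifteen aspect names come from this chapter, but the 1-5 rating scale, the function names and the example scores are my own assumptions, not anything from the book's table. It flags weak aspects rather than averaging scores, reflecting the shalom principle that every aspect must be given its due.

```python
# Illustrative sketch (not from the book): a multi-aspectual
# usability checklist. The aspect names follow the chapter; the
# 1-5 rating scale and example scores are invented.

ASPECTS = [
    "quantitative", "spatial", "kinematic", "physical", "organic",
    "psychic", "analytic", "formative", "lingual", "social",
    "economic", "aesthetic", "juridical", "ethical", "pistic",
]

def weak_aspects(scores, threshold=3):
    """Return aspects rated below `threshold` (or not rated at all).

    Flagging rather than averaging reflects the shalom principle:
    a superb aesthetic score cannot compensate for a poor
    formative one."""
    return [a for a in ASPECTS if scores.get(a, 0) < threshold]

# A visually stunning but badly structured shopping site:
ratings = {a: 4 for a in ASPECTS}
ratings["aesthetic"] = 5   # looks great
ratings["formative"] = 1   # so badly structured you cannot find anything
print(weak_aspects(ratings))   # only the formative aspect is flagged
```

Note the design choice: because absolutization of any one aspect jeopardises usability, no amount of excellence elsewhere removes a flag on a neglected aspect.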
However, the lingual norms should not themselves be absolutized, because HCI only gains its meaning by referring beyond itself to ERM and HLC. Note: Sometimes it is appropriate to break the rules, especially in computer games or other fun software. For example, the rule that all important symbols should be clearly seen (psychic, analytic aspects) is reversed in games, where the best weapons or equipment are hidden and difficult to see. 7.5 Aspects as Checklist: Guidelines for UI While it is appropriate on occasion to focus attention on one aspect (usually the qualifying) we should always do so in a way that gives all the other aspects their due. If we over-emphasise one aspect we begin to ignore other aspects, and the result is that the success or fruitfulness of our activity is jeopardised. Thus, for example, a web page that has superb graphics but is otherwise devoid of useful content will fall into disuse. Web pages are user interfaces, and we can see the normativity of many of the aspects recognised in the more mature published web design guidelines. Table 2 shows the 'Research-Based Web Design and Usability Guidelines' of the National Cancer Institute [2005] and the main aspects of each guideline (aspects indicated by the first letter of their name, from Q = Quantitative to P = Pistic). Many have two aspects, sometimes because they cover two things (e.g. "set goals" (formative) and "state goals" (lingual)) and sometimes because the main idea is of two aspects (e.g. sharing is both lingual and ethical). We do not differentiate between qualifying and founding aspects here, but could do so if a more precise analysis were needed. Table 2. Aspects of Web Design Guidelines {*** Think about, and discuss, the following: » Which aspects have most entries? » Why do you think this is? » Which aspects have least? » Why do you think this is? » Why are the formative and spatial aspects so important in web accessibility?
***} We can use aspectual analysis as a basis for critique. The first thing that strikes us is how many aspects are represented here. This is, of course, what one would expect from a good, mature set of guidelines like the NCI guidelines. Second, we might look for imbalance among the aspects. The spatial and formative aspects appear more often than most other aspects; we can ask ourselves whether this is appropriate. Perhaps more significant are some gaps, at least in this 2005 version, some of which are quite surprising: » The faith aspect of vision of who we are is completely absent, yet one might expect some mention of the designers' vision for the website. (It is possible that "Set goals" implies some pistic vision for the site.) » The ethical aspect of self-giving is present only in sharing design ideas. Guidelines on how to give the reader more than is actually due to them, and thus create a site that feels generous, would be useful. » The juridical aspect is almost absent, represented only tangentially in the concept of providing 'useful' or meaningful content. The juridical aspect would be relevant in terms of giving both the topic and the readers their due. » Perhaps most surprising is the almost complete absence of the social aspect - the two inclusions are rather tangential. Since websites are read by people from any and every cultural group, with varying background knowledge, expectations and world views, we might expect a whole set of guidelines on appropriate use of cultural connotations, humour, idiom, and on respecting cultural sensitivities. » The kinematic aspect is almost entirely absent. Animation can be used to show movement, but have the designers of these guidelines overlooked this, treating animation as a mere sensitive or aesthetic decoration?
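The gap-spotting done above can be mechanised: tally how often each aspect is cited across a set of annotated guidelines, and any aspect with zero (or very few) entries is a candidate gap. The sketch below uses invented guideline titles and annotations - it does not reproduce the actual NCI 2005 list or its aspect codes - so treat it as a pattern for this kind of aspectual audit, not as data.

```python
from collections import Counter

# Hypothetical sketch of the tallying behind an aspectual critique.
# The guidelines and their aspect annotations below are invented
# examples, not the actual NCI 2005 list.

guidelines = [
    ("Set and state goals", ["formative", "lingual"]),
    ("Use consistent page layout", ["spatial", "formative"]),
    ("Share design ideas", ["lingual", "ethical"]),
    ("Minimise download time", ["economic"]),
    ("Align items on the page", ["spatial"]),
]

ALL_ASPECTS = [
    "quantitative", "spatial", "kinematic", "physical", "organic",
    "psychic", "analytic", "formative", "lingual", "social",
    "economic", "aesthetic", "juridical", "ethical", "pistic",
]

def aspect_tally(annotated):
    """Count how often each aspect is cited across all guidelines."""
    counts = Counter()
    for _title, aspects in annotated:
        counts.update(aspects)
    return counts

def gaps(annotated, all_aspects):
    """Aspects never cited - candidates for missing guidelines."""
    counts = aspect_tally(annotated)
    return [a for a in all_aspects if counts[a] == 0]

print(aspect_tally(guidelines))
print(gaps(guidelines, ALL_ASPECTS))
# With this invented sample, aspects such as the social, juridical
# and kinematic never appear - exactly the kind of gap the critique
# above looks for.
```

In a real audit one would also weigh qualifying against founding aspects, as the chapter notes; this simple count only exposes which aspects a guideline set neglects entirely.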
This aspectual analysis of these guidelines is not meant primarily as a criticism of the guidelines, which are excellent when compared with many others that are available, but rather to show how aspectual analysis can be useful as an evaluation tool, and how it might be used to suggest future improvements. Copyright (c) Andrew Basden, 16 September 2008, 18 October 2008. 3 September 2009, 22 September 2009, 25 November 2009, 20 September 2010.