TEN QUESTIONS ABOUT HUMAN ERROR:
A new view of human factors and system safety
Contenidos
Reconocimientos
Prefacio
Introducción de la serie
Nota del autor
1 ¿Fue Falla Mecánica o Error Humano?
2 ¿Por qué Fallan los Sistemas Seguros?
3 ¿Por qué son más Peligrosos los Doctores que los Propietarios de Armas?
4 ¿No Existen los Errores?
5 Si Ud. Pierde la Conciencia Situacional, ¿Qué la Reemplaza?
6 ¿Por qué los Operadores se vuelven Complacientes?
7 ¿Por qué no siguen ellos los Procedimientos?
8 ¿Podemos Automatizar los Errores Humanos Fuera del Sistema?
9 ¿Va a ser Seguro el Sistema?
10 ¿Debemos Hacer a la Gente Responsable por sus Errores?
Referencias
Índice de Autores
Índice Temático
Reconocimientos.
Tal como los errores, las ideas vienen de algún lado. Las ideas en este libro fueron desarrolladas en un
período de años en que las discusiones con las siguientes personas fueron particularmente constructivas:
David Woods, Erik Hollnagel, Nancy Leveson, James Nyce, John Flach, Gary Klein, Diane Vaughan, y
Charles Billings. Jens Rasmussen ha estado siempre, en cierto modo, por delante en el juego: algunas de las preguntas sobre el error humano ya fueron abordadas por él hace décadas. Erik Hollnagel contribuyó decisivamente a moldear las ideas del capítulo 6, y Jim Nyce ha tenido una influencia significativa en el capítulo 9.
Quiero agradecer también a mis estudiantes, particularmente a Arthur Dijkstra y Margareta Lutzhoft, por sus comentarios a los borradores previos y sus útiles sugerencias. Margareta merece especial gratitud por su ayuda en decodificar el caso estudiado en el capítulo 5, y Arthur, por su habilidad para señalar la "ansiedad cartesiana" allí donde yo no la reconocí.
Agradecimiento especial al editor de la serie, Barry Kantowitz, y al editor Bill Webber, por su confianza en el
proyecto. El trabajo para este libro fue apoyado por una subvención del Swedish Flight Safety Directorate.
Prefacio
Los factores humanos en el transporte siempre han estado relacionados con el error humano. De hecho, como campo de investigación científica, el área debe su origen a las investigaciones sobre el error del piloto y a la subsiguiente insatisfacción de los investigadores con esa etiqueta.
En 1947, Paul Fitts y Richard Jones, construyendo sobre el trabajo pionero de Alphonse Chapanis, demostraron cómo las características de las cabinas de los aviones de la II Guerra Mundial influían sistemáticamente en la forma en que los pilotos cometían errores. Por ejemplo, los pilotos confundían las palancas de flaps y de tren de aterrizaje porque éstas a menudo se veían y se sentían igual, y estaban ubicadas próximas una de otra (switches idénticos, o palancas muy similares). En el incidente típico, un piloto levantaba el tren de aterrizaje en vez de los flaps luego de un aterrizaje, con las previsibles consecuencias para hélices, motores y estructura. Como arreglo inmediato en tiempos de guerra, se adhirió una pequeña rueda de goma al control del tren de aterrizaje y una pieza con forma de cuña al control del flap. Esto básicamente solucionó el problema, y el arreglo de diseño con el tiempo llegó a ser un requisito de certificación.
Los pilotos también podían confundir los controles de acelerador, mezcla y hélice, ya que sus ubicaciones cambiaban entre diferentes cabinas. Tales errores no eran degradaciones sorprendentes y aleatorias del desempeño humano. Más bien, eran acciones y valoraciones que cobraban sentido una vez que los investigadores comprendían las características del mundo en que las personas trabajaban, una vez que habían analizado la situación que rodeaba al operador.
Los errores humanos están sistemáticamente conectados a características de las herramientas y tareas de las personas. Puede ser difícil predecir cuándo o qué tan a menudo ocurrirán los errores (a pesar de que las técnicas de fiabilidad humana ciertamente lo han intentado). Con un examen crítico del sistema en que las personas trabajan, sin embargo, no es tan difícil anticipar dónde ocurrirán los errores. Los factores humanos han funcionado sobre esta premisa desde siempre: la noción de diseñar sistemas resistentes y tolerantes al error se basa en ella.
Los factores humanos fueron precedidos por una edad de hielo mental: el comportamientismo, en que cualquier estudio de la mente era visto como ilegítimo y no científico. El comportamientismo fue en sí una psicología de protesta, acuñada en agudo contraste con la introspección experimental de Wundt que lo precedió. Si el comportamientismo fue una psicología de protesta, entonces los factores humanos fueron una psicología pragmática. La Segunda Guerra Mundial trajo un ritmo tan furioso de desarrollo tecnológico que el comportamientismo se quedó corto, sin lugar a dudas. Surgieron problemas prácticos en la vigilancia y la toma de decisiones del operador que resultaron totalmente inmunes al repertorio de exhortaciones motivacionales del comportamientismo de Watson. Hasta ese punto, la psicología había asumido ampliamente que el mundo era fijo, y que los humanos tenían que adaptarse a sus demandas a través de la selección y el entrenamiento. Los factores humanos mostraron que el mundo no era fijo: cambios en el ambiente podían llevar fácilmente a mejoras en el desempeño no alcanzables mediante intervenciones comportamientistas. En el comportamientismo, el desempeño tenía que adaptarse a las características del mundo. En factores humanos, las características del mundo se adaptaban a los límites y capacidades del desempeño humano.
Como psicología de lo pragmático, los factores humanos adoptaron la visión cartesiano-newtoniana de la ciencia y del método científico (tal como lo habían hecho Wundt y Watson). Descartes y Newton fueron actores dominantes de la revolución científica del siglo XVII.
Esta transformación total del pensamiento instaló una creencia en la absoluta certeza del conocimiento científico, especialmente en la cultura occidental. La aspiración de la ciencia era alcanzar el control derivando leyes de la naturaleza generales e idealmente matemáticas (tal como nosotros intentamos hacer para el desempeño de la persona y del sistema). Una herencia de esto puede verse todavía en los factores humanos, particularmente en el predominio de los experimentos, en la inclinación nomotética más que ideográfica de su investigación y en una fuerte fe en el realismo de los hechos observados. También puede reconocerse en las estrategias reductivas con que los factores humanos y la seguridad operacional de sistemas lidian con la complejidad. La solución de problemas cartesiano-newtoniana es analítica: consiste en descomponer los pensamientos y problemas en piezas y en disponerlas en algún orden lógico. El fenómeno necesita ser descompuesto en partes más básicas, y su totalidad puede ser explicada exhaustivamente haciendo referencia a sus componentes constituyentes y a sus interacciones. En factores humanos y seguridad operacional de sistemas, la mente se entiende como una construcción tipo caja, con un intercambio mecánico de representaciones internas; el trabajo se separa en pasos procedimentales mediante análisis jerárquicos de tareas; las organizaciones no son orgánicas ni dinámicas, sino que están constituidas por capas, compartimientos y vínculos estáticos; y la seguridad operacional es una propiedad estructural que puede ser entendida en términos de sus mecanismos de orden más bajo (sistemas de reporte, tasas de error y auditorías, la posición de la función de administración de seguridad operacional en el organigrama, y sistemas de calidad).
Estas visiones siguen con nosotros hoy. Dominan el pensamiento en factores humanos y seguridad operacional de sistemas. El problema es que las extensiones lineales de estas mismas nociones no pueden llevarnos hacia el futuro. Las otrora pragmáticas ideas de factores humanos y seguridad de sistemas se están quedando atrás respecto de los problemas prácticos que han comenzado a surgir en el mundo de hoy. Podríamos estar ante una repetición de los cambios que vinieron con los desarrollos tecnológicos de la II Guerra Mundial, donde el comportamientismo se quedó corto. Esta vez podría tocarles el turno a los factores humanos y a la seguridad operacional de sistemas. Los desarrollos contemporáneos, sin embargo, no son solo técnicos. Son sociotécnicos: comprender qué hace a los sistemas seguros o frágiles requiere más que conocimiento sobre la interfaz hombre-máquina. Como David Meister señaló recientemente (y él lleva mucho tiempo en esto), los factores humanos no han progresado mucho desde 1950. "Hemos tenido 50 años de investigación", se pregunta retóricamente, "pero ¿cuánto más sabemos de lo que ya sabíamos al principio?" (Meister, 2003, p. 5). No es que los enfoques adoptados por los factores humanos y la seguridad operacional de sistemas ya no sean útiles, sino que su utilidad sólo puede apreciarse realmente cuando vemos sus límites. Este libro no es sino un capítulo en una transformación más amplia que ha comenzado a identificar las restricciones profundamente enraizadas y los nuevos puntos de apalancamiento en nuestras visiones de factores humanos y seguridad operacional de sistemas.
Las 10 preguntas acerca del error humano no son solo preguntas sobre el error humano como fenómeno, si es que acaso lo son (y si es que el error humano es, para empezar, algo en sí mismo). En realidad son preguntas acerca de los factores humanos y la seguridad operacional de sistemas como disciplinas, y del lugar en que se encuentran hoy. Al formular estas preguntas acerca del error, y al esbozar sus respuestas, este libro intenta mostrar dónde está limitado nuestro pensamiento actual; dónde nuestro vocabulario, nuestros modelos y nuestras ideas están limitando el progreso. En cada capítulo, el libro intenta entregar indicios de nuevas ideas y modelos que tal vez puedan lidiar mejor con la complejidad de los problemas que hoy nos enfrentan.
Uno de esos problemas es que sistemas aparentemente seguros pueden derivar hacia la falla. La deriva hacia los márgenes de la seguridad operacional ocurre bajo presiones de escasez y competencia. Está relacionada con la opacidad de los sistemas sociotécnicos grandes y complejos, y con los patrones de información en que los integrantes basan sus decisiones y compromisos. La deriva hacia la falla está asociada con los procesos organizacionales normales de adaptación. Las fallas organizacionales en sistemas seguros no están precedidas por fallas, por el quiebre o la falta de calidad de componentes aislados. De hecho, la falla organizacional en sistemas seguros está precedida por trabajo normal, por personas normales haciendo trabajo normal en organizaciones aparentemente normales. Esto desafía seriamente la definición de un incidente, y puede minar el valor de reportar incidentes como herramienta para aprender más allá de cierto nivel de seguridad operacional. El límite entre el trabajo normal y el incidente es claramente elástico y está sujeto a revisión incremental. Con cada pequeño paso fuera de las normas previas, el éxito pasado puede ser tomado como garantía de seguridad operacional futura.
El incrementalismo acerca, muesca a muesca, el sistema completo a la línea de derrumbe, pero sin indicaciones empíricas poderosas de que está encaminado en esa dirección.
Los modelos corrientes de factores humanos y seguridad operacional de sistemas no pueden lidiar con la deriva hacia la falla. Requieren fallas como prerrequisito de las fallas. Aún están orientados hacia la búsqueda de fallas (por ejemplo, errores humanos, hoyos en las capas de defensa, problemas latentes, deficiencias organizacionales y patógenos residentes), y toman como canónicos estándares de trabajo y estructura dictados desde afuera, en lugar de los relatos internos (sobre qué cuenta como falla y qué como trabajo normal). Los procesos de creación de sentido, de construcción de racionalidad local por quienes de verdad realizan los miles de pequeños y grandes compromisos que transportan un sistema a lo largo de su curso de deriva, quedan fuera del léxico actual de factores humanos. Los modelos corrientes típicamente ven a las organizaciones como máquinas newtoniano-cartesianas, con componentes y nexos entre ellos. Los contratiempos son modelados como una secuencia de eventos (acciones y reacciones) entre un disparador y un resultado. Tales modelos no pueden pronunciarse acerca de la incubación de fallas latentes, ni sobre el gradual e incremental aflojamiento o pérdida del control.
Los procesos de erosión de las restricciones, de deterioro de la seguridad operacional, de deriva hacia los márgenes, no pueden ser capturados, porque los enfoques estructurales son metáforas estáticas para formas resultantes, no modelos dinámicos orientados hacia procesos de formación.
Newton y Descartes, con sus particulares posturas sobre las ciencias naturales, tienen un firme asidero en los factores humanos, en la seguridad operacional de sistemas y también en otras áreas. El paradigma de procesamiento de información, por ejemplo, tan útil para explicar los tempranos problemas de transferencia de información entre operadores de radar y de radio en la II Guerra Mundial, ha prácticamente colonizado la investigación de factores humanos. Aún es una fuerza dominante, reforzado por los austeros experimentos de laboratorio que parecen confirmar su utilidad y validez. El paradigma mecaniza la mente, la parte en componentes separados (por ejemplo, memoria de trabajo, memoria de corto plazo y memoria de largo plazo) con nexos entre ellos. Newton habría amado su mecánica. A Descartes también le habría gustado: una separación clara entre mente y mundo solucionaba (o más bien eludía) una serie de problemas asociados con las transacciones entre ambos.
Un modelo mecánico como el de procesamiento de información tiene, por supuesto, un atractivo especial para la ingeniería y otros consumidores de los resultados de la investigación de factores humanos. El pragmatismo dicta tender puentes entre la práctica y la ciencia, y tener un modelo de la cognición similar a un aparato técnico familiar para la gente aplicada es una forma poderosa de lograr justamente eso. Pero no existe razón empírica para restringir nuestra comprensión de actitudes, memorias o heurísticos a disposiciones codificadas mentalmente, a ciertos contenidos de conciencia con determinadas fechas de vencimiento. De hecho, tal modelo restringe severamente nuestra habilidad para comprender cómo las personas utilizan el habla y la acción para construir un orden perceptual y social; cómo, a través del discurso y la acción, las personas crean los ambientes que, a su vez, determinan la acción posterior y las valoraciones posibles, y que restringen lo que, en consecuencia, será visto como discurso aceptable o decisiones racionales.
No podemos comenzar a entender la deriva hacia la falla sin comprender cómo grupos de personas, a través de la valoración y la acción, ensamblan versiones del mundo en las que ellos valoran y actúan.
El procesamiento de la información cabe dentro de una perspectiva metateórica mayor y dominante, que toma al individuo como su foco central (Heft, 2001). Esta visión también es una herencia de la Revolución Científica, que ha popularizado crecientemente la idea humanista de un "individuo autocontenido". Para la mayor parte de la psicología, esto ha significado que todos los procesos dignos de estudio tienen lugar dentro de los límites del cuerpo (o de la mente), algo epitomizado por el enfoque mentalista del procesamiento de información. En su incapacidad para tratar significativamente la deriva hacia la falla, que interconecta factores individuales, institucionales, sociales y técnicos, los factores humanos y la seguridad operacional de sistemas están pagando actualmente por su exclusión teórica de los procesos sociales y transaccionales entre los individuos y el mundo. El componencialismo y la fragmentación de la investigación de factores humanos siguen siendo un obstáculo al progreso en este sentido. Un estiramiento de la unidad de análisis (como el realizado en las ideas de la ingeniería de sistemas cognitivos y la cognición distribuida), y un llamado a poner la acción en el centro de la comprensión de las valoraciones y el pensamiento, han sido formas de lidiar con los nuevos desarrollos prácticos para los que los factores humanos y la seguridad de sistemas no estaban preparados.
El énfasis individualista del protestantismo y la Ilustración también impregna las ideas sobre control y culpa. ¿Debemos culpar a las personas por sus errores? Los sistemas sociotécnicos han crecido en complejidad y tamaño, moviendo a algunos a decir que no tiene sentido esperar o exigir de los integrantes (ingenieros, administradores, operadores) que estén a la altura de algún ideal de agente moral reflexivo. Las presiones de escasez y competencia han logrado convertirse insidiosamente en mandatos organizacionales e individuales, los que a su vez restringen severamente la racionalidad y las opciones (y por ende la autonomía) de todos los actores en el interior. Sin embargo, los antihéroes solitarios siguen teniendo roles protagónicos en nuestras historias de falla. El individualismo aún es crucial para la propia identidad en la modernidad. La idea de que se necesita el trabajo de un equipo, de una organización entera o de toda una industria para quebrar un sistema (como lo ilustran los casos de deriva hacia la falla) no calza bien con nuestras preconcepciones culturales heredadas. Incluso antes de llegar a episodios complejos de acción y responsabilidad, podemos reconocer la prominencia de la deconstrucción y el componencialismo newtoniano-cartesianos en mucha investigación de factores humanos. Por ejemplo: las nociones empiristas de una percepción de elementos que sólo gradualmente se convierten en significado, a través de etapas de procesamiento mental, son nociones teóricas legítimas hoy.
El empirismo fue otrora una fuerza en la historia de la psicología. Aunque eclipsados por el paradigma de procesamiento de información, sus principios centrales han reaparecido, por ejemplo, en las teorías de conciencia situacional. Al adoptar un modelo cultural de ese tipo desde una comunidad aplicada y someterlo a un escrutinio científico putativo, los factores humanos encuentran, por supuesto, su ideal pragmático. Los modelos culturales abarcan los problemas de los factores humanos como disciplina aplicada. Pocas teorías pueden cubrir el abismo entre investigadores y practicantes mejor que aquellas que adoptan y diseccionan el vernáculo de los practicantes para su estudio científico. Pero los modelos culturales vienen con un precio epistemológico. La investigación que pretende indagar un fenómeno (digamos, conciencia situacional dividida, o complacencia), pero que no define ese fenómeno (porque, como modelo cultural, se supone que todos saben lo que significa), no puede ser falsada frente a la realidad empírica. Ello deja a tal investigación de factores humanos sin el mayor mecanismo de control científico desde Karl Popper.
Conectado al procesamiento de información, y al enfoque experimental de muchos problemas de factores humanos, hay un sesgo cuantitativo, defendido por primera vez en la psicología por Wilhelm Wundt en su laboratorio de Leipzig. A pesar de que Wundt rápidamente tuvo que admitir que una cronometría de la mente era una meta demasiado audaz para la investigación, los proyectos experimentales de investigación en factores humanos aún pueden reflejar versiones pálidas de su ambición. Contar, medir, categorizar y analizar estadísticamente son las herramientas que gobiernan el oficio, mientras que las investigaciones cualitativas son a menudo desechadas por subjetivas y no científicas. Los factores humanos tienen una orientación realista: creen que los hechos empíricos son aspectos estables y objetivos de una realidad que existe con independencia del observador o de su teoría. Nada de esto hace menos reales los hechos generados mediante experimentos para aquellos que los observan, los publican o leen acerca de ellos. Sin embargo, siguiendo a Thomas Kuhn (1962), esta realidad debe ser vista por lo que es: un acuerdo negociado implícitamente entre investigadores de pensamiento similar, más que un común denominador accesible a todos.
No hay un árbitro final aquí. Bien puede ser que un enfoque experimental y componencial disfrute de un privilegio epistemológico. Pero ello también significa que no hay un imperativo automático para que sólo ese enfoque cuente como investigación legítima, como se ve a veces en la corriente principal de factores humanos. Las formas de obtener acceso a la realidad empírica son infinitamente negociables, y su aceptación es función de qué tan bien se conforman a la visión de mundo de aquellos a quienes el investigador apela. La persistente supremacía cuantitativista (particularmente en los factores humanos norteamericanos) está lastrada por este tipo de autoridad consensuada (debe ser bueno porque todos lo están haciendo). Tal histéresis metodológica podría tener que ver más con el miedo primario a ser tildado de "no científico" (miedo compartido por Wundt y Watson) que con un retorno estable de incrementos significativos de conocimiento generados por la investigación.
El cambio tecnológico dio impulso al pensamiento en factores humanos y seguridad de sistemas. Las demandas prácticas planteadas por los cambios tecnológicos imbuyeron a los factores humanos y a la seguridad de sistemas del espíritu pragmático que hasta hoy tienen. Pero lo pragmático deja de ser pragmático si no encaja con las demandas creadas por aquello que está sucediendo ahora a nuestro alrededor. El ritmo del cambio sociotecnológico no parece que vaya a desacelerarse pronto. Si creemos que la II Guerra Mundial generó una gran cantidad de cambios interesantes, dando a luz a los factores humanos como disciplina, entonces podríamos estar viviendo hoy en tiempos incluso más excitantes.
Si seguimos haciendo lo que hemos venido haciendo en factores humanos y seguridad de sistemas, simplemente porque nos ha funcionado en el pasado, podríamos llegar a ser uno de esos sistemas que derivan hacia la falla. Lo pragmático requiere que nosotros también nos adaptemos, para arreglárnoslas mejor con la complejidad del mundo que hoy nos enfrenta. Nuestros éxitos pasados no son garantía de logros futuros.
Prólogo de la serie.
Barry H. Kantowitz
Battelle Human Factors Transportation Center
El rubro del transporte es importante, por razones tanto prácticas como teóricas. Todos nosotros somos
usuarios de sistemas de transporte como operadores, pasajeros y consumidores. Desde un punto de vista
científico, el rubro del transporte ofrece una oportunidad de crear y probar modelos sofisticados de
comportamiento y cognición humanos. Esta serie cubre los aspectos práctico y teórico de los factores
humanos en el transporte, con un énfasis en su interacción.
La serie está pensada como un foro para investigadores e ingenieros interesados en cómo funcionan las personas dentro de los sistemas de transporte. Todos los modos de transporte son relevantes, y todos los esfuerzos en factores humanos y ergonomía que tienen implicancias explícitas para los sistemas de transporte caen dentro del ámbito de la serie. Los esfuerzos analíticos son importantes para relacionar teoría y datos. El nivel de análisis puede ser tan pequeño como una sola persona o tan amplio como el espectro internacional. Los datos empíricos pueden provenir de un amplio rango de metodologías, incluyendo investigación de laboratorio, estudios en simuladores, pistas de prueba, pruebas operacionales, trabajo de campo, revisiones de diseño o peritajes. Este amplio espectro está pensado para maximizar la utilidad de la serie para lectores con trasfondos distintos.
Espero que la serie sea útil para profesionales en las disciplinas de factores humanos, ergonomía, ingeniería de transportes, psicología experimental, ciencia cognitiva, sociología e ingeniería de seguridad operacional. Está pensada para ser apreciada por especialistas en transporte de la industria, el gobierno y la academia, así como por el investigador que busca un banco de pruebas para nuevas ideas acerca de la interfaz entre las personas y los sistemas complejos.
Este libro, si bien se enfoca en el error humano, ofrece una visión de sistemas particularmente bienvenida en los factores humanos del transporte. Una meta mayor de esta serie de libros es relacionar la teoría y la práctica de los factores humanos. El autor merece reconocimiento por formular preguntas que no sólo relacionan teoría y práctica, sino que fuerzan al lector a evaluar las clases de teoría que se aplican a los factores humanos. Los enfoques tradicionales de procesamiento de información, derivados del modelo de canal limitado que formó la base original del trabajo teórico en factores humanos, son sometidos a escrutinio. Enfoques más nuevos, tales como la conciencia situacional, que surgieron de las deficiencias del modelo de teoría de la información, son criticados por tratarse sólo de modelos culturales carentes de rigor científico.
Espero que este libro engendre un vigoroso debate sobre qué clases de teoría sirven mejor a la ciencia de los factores humanos. Si bien las diez preguntas ofrecidas aquí forman una base para el debate, existen más de diez respuestas posibles.
Los libros posteriores de esta serie continuarán buscando estas respuestas mediante la entrega de perspectivas prácticas y teóricas sobre los factores humanos en el transporte.
Nota del Autor.
Sidney Dekker es profesor de Factores Humanos en la Universidad de Lund, Suecia. Recibió un M.A. en psicología organizacional de la University of Nijmegen y un M.A. en psicología experimental de la Leiden University, ambas en los Países Bajos. Obtuvo su Ph.D. en Ingeniería de Sistemas Cognitivos en la Ohio State University.
Ha trabajado previamente para la Public Transport Corporation en Melbourne, Australia; la Massey University School of Aviation, Nueva Zelanda; y British Aerospace. Sus especialidades e intereses de investigación son el error humano, la investigación de accidentes, los estudios de campo, el diseño representativo y la automatización. Ha tenido alguna experiencia como piloto, con entrenamiento en material DC-9 y Airbus A340. Sus libros previos incluyen The Field Guide to Human Error Investigations (2002).
Capítulo 1.
¿Fue Falla Mecánica o Error Humano?
Estos son tiempos excitantes y competitivos para los factores humanos y la seguridad operacional de sistemas. Y existen indicios de que no estamos del todo bien equipados para ellos. Hay un reconocimiento creciente de que los accidentes (un accidente de avión comercial, el desastre de un transbordador espacial) están intrincadamente ligados al funcionamiento de las organizaciones e instituciones circundantes. La operación de aviones comerciales, transbordadores espaciales o ferris de pasajeros engendra vastas redes de organizaciones de apoyo, de mejoramiento y avance, de control y regulación. Las tecnologías complejas no pueden existir sin estas organizaciones e instituciones – transportadores, reguladores, agencias de gobierno, fabricantes, subcontratistas, instalaciones de mantenimiento, grupos de entrenamiento – que, en principio, están diseñadas para proteger y dar seguridad a su operación. Su mandato real es no tener accidentes. Desde el accidente nuclear de Three Mile Island, en 1979, sin embargo, las personas se percatan cada vez más de que las mismas organizaciones destinadas a mantener una tecnología segura y estable (operadores humanos, reguladores, la administración, el mantenimiento) están en realidad entre los mayores contribuyentes al quiebre. Las fallas sociotecnológicas son imposibles sin tales contribuciones.
A pesar de este reconocimiento creciente, los factores humanos y la seguridad operacional de sistemas dependen de un vocabulario basado en una concepción particular de las ciencias naturales, derivada de sus raíces en la ingeniería y en la psicología experimental. Este vocabulario, con su uso sutil de metáforas, imágenes e ideas, está cada vez más reñido con las demandas interpretativas que plantean los accidentes organizacionales modernos. El vocabulario expresa una visión mundial tal vez apropiada para las fallas técnicas, pero incapaz de abrazar y penetrar las áreas relevantes de las fallas sociotécnicas – esas fallas que incorporan los efectos interconectados de la tecnología y de la complejidad social organizada que rodea su uso. Lo que significa: la mayoría de las fallas de hoy.
Cualquier lenguaje, y la visión mundial que lo acompaña, imponen limitaciones a nuestro entendimiento de la falla. Sin embargo, estas limitaciones se están volviendo cada vez más evidentes y apremiantes. Con el crecimiento en tamaño y complejidad de los sistemas, la naturaleza de los accidentes está cambiando (accidentes de sistemas, fallas sociotécnicas).
La escasez y la competencia por recursos significan que los sistemas empujan cada vez más sus operaciones hacia los bordes de su envolvente de seguridad. Tienen que hacerlo para permanecer exitosos en sus ambientes dinámicos. Los retornos comerciales de operar en los límites son mayores, pero las diferencias entre tener y no tener un accidente se estrechan caóticamente dentro de los márgenes disponibles. Los sistemas abiertos son arrastrados continuamente hacia los límites de su seguridad operacional, y los procesos que impulsan tal migración no son sencillos de reconocer o controlar, como tampoco lo es la ubicación exacta de los márgenes. Los sistemas grandes y complejos parecen capaces de adquirir una histéresis, una oscura voluntad propia, con la que derivan hacia una mayor resiliencia o hacia los bordes de la falla. Al mismo tiempo, el veloz avance de los cambios tecnológicos crea nuevos tipos de peligros, especialmente los que vienen con una mayor dependencia de la tecnología computacional. Tanto los sistemas sociales como los de ingeniería (y su interrelación) dependen de un volumen cada vez mayor de tecnología de la información. A pesar de que nuestra velocidad computacional y el acceso a la información pudieran parecer, en principio, una ventaja para la seguridad operacional, nuestra habilidad para dar sentido a la información no mantiene el paso con nuestra habilidad para recolectarla y generarla. Al conocer más, puede que en realidad conozcamos mucho menos. Administrar la seguridad operacional sobre la base de números (incidentes, conteos de errores, amenazas a la seguridad operacional), como si la seguridad operacional fuera sólo otro indicador de un modelo de negocios de Harvard, puede crear una falsa impresión de racionalidad y control administrativo. Puede ignorar variables de orden más alto capaces de develar la verdadera naturaleza y dirección de la deriva del sistema. Y podría venir, además, al costo de una comprensión más profunda del funcionamiento sociotécnico real.
DECONSTRUCCIÓN, DUALISMO Y ESTRUCTURALISMO.
¿Qué es, entonces, este lenguaje, y cuál es la obsoleta visión mundial técnica que representa? Las características que lo definen son la deconstrucción, el dualismo y el estructuralismo. Deconstrucción significa que el funcionamiento de un sistema puede ser comprendido exhaustivamente estudiando la disposición y la interacción de sus partes constituyentes. Los científicos y los ingenieros típicamente miran el mundo de esta forma. Las investigaciones de accidentes también deconstruyen. Para determinar la falla mecánica, o para localizar las partes dañadas, los investigadores de accidentes hablan de "ingeniería inversa". Recuperan partes de los restos y las reconstruyen nuevamente en un todo, a menudo literalmente. Pensemos en el Boeing 747 del vuelo TWA800 que explotó en el aire luego del despegue desde el aeropuerto Kennedy de Nueva York, en 1996. Fue recuperado desde el fondo del Océano Atlántico y laboriosamente rearmado en un hangar. Con el rompecabezas lo más completo posible, las partes dañadas deberían eventualmente quedar expuestas, permitiendo a los investigadores identificar la fuente de la explosión. Pero el todo sigue desafiando al sentido, sigue siendo un rompecabezas, sólo cuando el funcionamiento (o no funcionamiento) de sus partes no logra explicarlo. La parte que causó la explosión, que la inició, nunca fue identificada en verdad. Esto es lo que hace escalofriante la investigación del TWA800. A pesar de ser una de las reconstrucciones más caras de la historia, las partes reconstruidas se negaron a dar cuenta del comportamiento del todo. En un caso así, una comprensión atemorizante e incierta recorre a los organismos de investigación y a la industria: un todo falló sin una parte fallada. Un accidente ocurrió sin una causa; no hay causa – nada que arreglar, nada que reparar – y podría suceder de nuevo mañana, u hoy.
La segunda característica definitoria es el dualismo. Dualismo significa que existe una separación nítida entre la causa humana y la material – entre el error humano y la falla mecánica. Para ser un buen dualista usted, por supuesto, tiene que deconstruir: tiene que desconectar las contribuciones humanas de las contribuciones mecánicas. Las reglas de la Organización de Aviación Civil Internacional, que gobiernan a los investigadores de accidentes aéreos, lo determinan expresamente. Fuerzan a los investigadores de accidentes a separar las contribuciones humanas de las mecánicas. Apartados específicos de los reportes de accidentes están reservados para rastrear los componentes humanos potencialmente dañados. Los investigadores exploran el historial de las 24 y 72 horas previas de los humanos que más tarde se verían involucrados en un accidente. ¿Hubo alcohol? ¿Hubo estrés? ¿Hubo fatiga? ¿Hubo falta de pericia o de experiencia? ¿Hubo problemas previos en los registros de entrenamiento u operacionales de estas personas? ¿Cuántas horas de vuelo tenía verdaderamente el piloto? ¿Hubo otras distracciones o problemas? Este requisito investigativo refleja una interpretación primitiva de los factores humanos, una tradición aeromédica en que el error humano queda reducido a la noción de "estar en forma para el servicio". Esta noción ha sido sobrepasada hace tiempo por la evolución de los factores humanos hacia el estudio de personas normales realizando trabajos normales en lugares de trabajo normales (más que de individuos deficientes mental o fisiológicamente), pero el modelo aeromédico sobreextendido se conserva como una suerte de práctica conformista: positivista, dualista y deconstructiva. En el paradigma de estar en forma para el servicio, las fuentes del error humano deben buscarse en las horas, días o años previos al accidente, cuando el componente humano estaba torcido, debilitado y listo para el quiebre. Encuentre la parte del humano que estaba perdida o deficiente, la "parte desajustada", y la parte humana acarreará la carga interpretativa del accidente. Indague en la historia reciente, encuentre las piezas deficientes y arme el rompecabezas: deconstrucción, reconstrucción y dualismo.
La tercera característica definitoria de la visión mundial técnica que aún gobierna nuestro entendimiento del éxito y la falla en sistemas complejos es el estructuralismo. El lenguaje que utilizamos para describir el funcionamiento interno de los sistemas, sus éxitos y sus fallas, es un lenguaje de estructuras. Hablamos de capas de defensa, de agujeros en esas capas. Identificamos los "extremos romos" y los "extremos agudos" de las organizaciones e intentamos capturar cómo uno tiene efectos sobre el otro. Incluso la cultura de seguridad es tratada como una estructura edificada con bloques de construcción. Cuánta cultura de seguridad tenga una organización depende de las partes y componentes que tenga para el reporte de incidentes (esto es mensurable), de hasta qué punto es justa con los operadores que cometen errores (esto es más difícil de medir, pero todavía posible), y de qué relación existe entre sus funciones de seguridad y otras estructuras institucionales. Una realidad social profundamente compleja queda así reducida a un número limitado de componentes mensurables. Por ejemplo: ¿tiene el departamento de seguridad una vía directa a la administración más alta? ¿Cómo se compara su tasa de reportes con la de otras compañías?
Nuestro lenguaje de fallas es también un lenguaje de mecánica. Describimos trayectorias de accidentes, buscamos causas, efectos e interacciones. Buscamos fallas iniciadoras, o eventos gatilladores, y seguimos el colapso en dominó del sistema que viene después.
Esta visión mundial ve a los sistemas sociotécnicos como máquinas con partes en una disposición particular (extremos agudos vs. romos, capas de defensa), con interacciones particulares (trayectorias, efectos dominó, gatillos, iniciadores) y una mezcla de variables independientes e intervinientes (cultura de la culpa vs. cultura de seguridad). Esta es la visión mundial heredada de Descartes y Newton, la visión mundial que ha impulsado exitosamente el desarrollo tecnológico desde la revolución científica, hace medio milenio.
Esta visión mundial, y el lenguaje que produce, está basada en nociones particulares de las ciencias naturales, y ejerce una influencia sutil pero muy poderosa en nuestra comprensión del éxito y la falla sociotécnicos hoy. Así como ocurre con mucha de la ciencia y el pensamiento occidentales, perdura y dirige la orientación de los factores humanos y la seguridad de sistemas.
Incluso el lenguaje, si se utiliza irreflexivamente, se vuelve fácilmente una prisión. El lenguaje expresa, pero también determina, qué podemos ver y cómo lo vemos.
El lenguaje condiciona cómo construimos la realidad. Si nuestras metáforas nos alientan a modelar los accidentes como cadenas de eventos, entonces comenzaremos nuestra investigación buscando eventos que encajen en esa cadena. ¿Pero qué eventos deben ir adentro? ¿Dónde debemos comenzar? Como señaló Nancy Leveson (2002), la elección de cuáles eventos incluir es arbitraria, así como la extensión, el punto de partida y el nivel de detalle de la cadena de eventos. ¿Qué, preguntó ella, justifica asumir que los eventos iniciales son mutuamente excluyentes, excepto que ello simplifica las matemáticas del modelo de la falla? Estos aspectos de la tecnología y de su operación plantean preguntas sobre la pertinencia del modelo dualista, deconstructivo y estructuralista que domina los factores humanos y la seguridad de sistemas. En su lugar, podríamos buscar una verdadera visión de sistemas, que no sólo apunte a las deficiencias estructurales detrás de los errores humanos individuales (puede hacerlo si es necesario), sino que aprecie la adaptabilidad orgánica y ecológica de los sistemas sociotécnicos complejos.
Buscando fallas para explicar fallas.
Nuestras creencias y credos más arraigados a menudo yacen encerrados en las preguntas más simples. La pregunta de si fue error humano o falla mecánica es una de ellas. ¿Fue el accidente causado por falla mecánica o por error humano? Es una pregunta existencial tras un accidente.
Más aún, parece una pregunta muy simple e inocente. Para muchos es algo natural de preguntar: si has tenido un accidente, tiene sentido averiguar qué falló. La pregunta, sin embargo, encierra una comprensión particular de cómo ocurren los accidentes, y arriesga confinar nuestro análisis causal a esa comprensión. Nos encierra en un repertorio interpretativo fijo. Escapar de este repertorio puede ser difícil. Fija las preguntas que hacemos, entrega las pistas que perseguimos y las claves que examinamos, y determina las conclusiones que finalmente sacaremos. ¿Qué componentes estaban dañados? ¿Fue algo de la máquina o algo humano? ¿Por cuánto tiempo había estado el componente doblado o de alguna otra forma deficiente? ¿Por qué se quebró finalmente? ¿Cuáles fueron los factores latentes que conspiraron en su contra? ¿Qué defensas se habían erosionado?
Estos son los tipos de preguntas que dominan hoy las investigaciones en factores humanos y seguridad de sistemas. Organizamos los reportes de accidentes, y nuestro discurso sobre los accidentes, alrededor de la búsqueda de respuestas a ellas. Las investigaciones sacan a la luz componentes mecánicos dañados (un tornillo de trim desgastado en el estabilizador de un MD-80 de Alaska Airlines, losetas de protección térmica perforadas en el transbordador espacial Columbia), componentes humanos de bajo desempeño (por ejemplo, quiebres de CRM, un piloto con un historial de entrenamiento accidentado) y grietas en las organizaciones responsables de operar el sistema (por ejemplo, cadenas de decisión organizacional débiles). Buscar fallas – humanas, mecánicas u organizacionales – para explicar las fallas es tan de sentido común que la mayoría de los investigadores nunca se detiene a pensar si éstas son en realidad las pistas correctas que perseguir.
Que la falla es causada por fallas es algo prerracional: ya no lo consideramos conscientemente como una pregunta en las decisiones que tomamos acerca de dónde mirar y qué concluir.
He aquí un ejemplo. Un bimotor Douglas DC-9-82 aterrizó en un aeropuerto regional de las tierras altas del sur de Suecia en el verano de 1999. Chubascos de lluvia habían pasado por el área más temprano ese día, y la pista estaba aún húmeda. Durante la aproximación, la aeronave recibió un ligero viento de cola y, después del toque a tierra, la tripulación tuvo problemas para disminuir la velocidad. A pesar de los esfuerzos de la tripulación por frenar, el jet recorrió toda la pista y terminó en un campo a unos pocos cientos de pies más allá de su final. Los 119 pasajeros y la tripulación a bordo resultaron ilesos.
Luego de que la aeronave se detuviera, uno de los pilotos salió a chequear los frenos. Estaban fríos. No había ocurrido ninguna acción de frenado. ¿Cómo pudo haber ocurrido esto? Los investigadores no encontraron fallas mecánicas en la aeronave. Los sistemas de freno estaban bien.
En vez de ello, a medida que la secuencia de eventos fue rebobinada en el tiempo, los investigadores se percataron de que la tripulación no había armado los ground spoilers de la aeronave antes del aterrizaje. Los ground spoilers ayudan a un jet a frenar durante la carrera de aterrizaje, pero requieren ser armados antes de poder hacer su trabajo. Armarlos es tarea de los pilotos: es un ítem de la lista de chequeo before-landing y parte de los procedimientos en que ambos miembros de la tripulación están involucrados. En este caso, los pilotos olvidaron armar los spoilers. "Error de piloto", concluyó la investigación.
O en realidad, lo llamaron "quiebre de CRM (Crew Resource Management)" (Statens haverikommission, 2000, p. 12), una forma más moderna y más eufemística de decir "error de piloto". Los pilotos no coordinaron lo que debían hacer; por alguna razón fallaron en comunicar la configuración requerida de su aeronave. Además, después del aterrizaje, uno de los miembros de la tripulación no dijo "¡Spoilers!", como lo dicta el procedimiento. Esto pudo o debió alertar a la tripulación sobre la situación, pero no ocurrió. Los errores humanos habían sido encontrados. La investigación estaba concluida.
"Error humano" es nuestra elección por defecto cuando no encontramos fallas mecánicas. Es una elección forzada, inevitable, que calza bastante bien en una ecuación donde el error humano es el inverso del monto de falla mecánica. La ecuación 1 muestra cómo determinamos la proporción de responsabilidad causal:
Error humano = f (1 – falla mecánica) (1)
Si no existe falla mecánica, entonces sabemos qué empezar a buscar en su lugar.
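Sólo para ilustrar el razonamiento que la ecuación 1 caricaturiza —no se trata de un método propuesto por el autor—, un boceto mínimo en Python, con nombres hipotéticos:

def reparto_causal(falla_mecanica: float) -> dict:
    """Caricatura de la ecuación 1: toda la responsabilidad que no se atribuye
    a la máquina se carga, por defecto, al 'error humano'.
    falla_mecanica: fracción atribuida a la máquina, entre 0.0 y 1.0."""
    error_humano = 1.0 - falla_mecanica
    return {"falla_mecanica": falla_mecanica, "error_humano": error_humano}

# En el caso del MD-80 no se encontró falla mecánica alguna:
print(reparto_causal(0.0))  # {'falla_mecanica': 0.0, 'error_humano': 1.0}

El punto no es la aritmética, sino que la atribución al humano queda fijada de antemano por lo que no se encuentra en la máquina.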
En este caso, no hubo falla mecánica. La ecuación 1 se reduce a 1 menos 0. La contribución humana fue 1. Fue error humano, un quiebre de CRM. Los investigadores encontraron que los dos pilotos a bordo del MD-80 eran ambos capitanes, y no un capitán y un copiloto, como es usual. Fue una simple coincidencia de programación, no del todo inusual, la que los asignó a volar juntos esa aeronave esa mañana. Con dos capitanes en un mismo barco, las responsabilidades corren el riesgo de dividirse de manera inestable e incoherente.
La división de responsabilidades conduce fácilmente a su abdicación. Si es función del copiloto verificar que los spoilers estén armados, y no hay copiloto, el riesgo es obvio. La tripulación estaba, en algún sentido, "desajustada" o, al menos, propensa al desmoronamiento. Y así fue (hubo un "quiebre de CRM"). ¿Pero qué explica esto? Estos son procesos que requieren, ellos mismos, una explicación, y pueden ser, de todas formas, pistas frías. Tal vez hay una realidad mucho más profunda acechando bajo las acciones superficiales particulares de un accidente como éste, una realidad en la que las causas humanas y mecánicas están interconectadas de forma mucho más profunda de lo que nuestros enfoques formulaicos de investigación nos permiten comprender. Para vislumbrar mejor esta realidad, primero tenemos que volvernos hacia el dualismo.
Es el dualismo el que descansa en el corazón de la elección entre error humano y falla mecánica. Echemos un breve vistazo a su pasado y confrontémoslo con el encuentro empírico inestable e incierto de un caso de spoilers desarmados.
La miseria del dualismo.
La urgencia de separar la causa humana de la causa mecánica es algo que debe haber desconcertado incluso a los pioneros de los factores humanos. Pensemos en el enredo de las cabinas de la II Guerra Mundial, que tenían switches de control idénticos para una diversidad de funciones. ¿Evitaron la confusión típica entre flaps y tren una pieza con forma de cuña en la palanca de flaps y un pomo con forma de rueda en la palanca del tren? En ambos casos, el sentido común y la experiencia dicen "sí". Al cambiar algo en el mundo, los ingenieros de factores humanos (suponiendo que ya existieran entonces) cambiaron algo en el humano. Al intervenir el hardware con que las personas trabajaban, cambiaron el potencial de las acciones correctas e incorrectas, pero sólo el potencial. Porque incluso con palancas de control de formas funcionales, algunos pilotos, en algunos casos, todavía las confundían. Al mismo tiempo, los pilotos no siempre confundían los switches idénticos. De manera similar, no todas las tripulaciones compuestas por dos capitanes olvidan armar los spoilers antes del aterrizaje. El error humano, en otras palabras, queda suspendido, inestable, en algún lugar entre las interfaces humana y mecánica. El error no es completamente humano, ni completamente mecánico. Al mismo tiempo, las "fallas" mecánicas (proveer switches idénticos ubicados próximos uno del otro) tienen que expresarse a sí mismas en la acción humana. Así que, si ocurre una confusión entre flaps y tren, ¿cuál es la causa? ¿Error humano o falla mecánica? Se necesitan ambos para tener éxito; se necesitan ambos para fallar. Dónde termina uno y comienza el otro ya no está claro.
Una idea del trabajo temprano en factores humanos era que el componente mecánico y la acción humana están interconectados en formas que se resisten al desenredo dualista, deconstructivo y eficiente que aún hoy prefieren los investigadores (y sus consumidores).
DUALISMO Y REVOLUCIÓN CIENTÍFICA.
La elección entre causa humana y causa material no es un simple producto de la investigación de accidentes ni de la ingeniería de factores humanos recientes. La elección se encuentra firmemente arraigada en la visión mundial newtoniano-cartesiana que gobierna mucho de nuestro pensamiento hoy en día, particularmente en profesiones dominadas por la tecnología, como la ingeniería de factores humanos y la investigación de accidentes.
Isaac Newton y René Descartes fueron dos de las figuras cumbre de la Revolución Científica ocurrida entre 1500 y 1700 d.C., que produjo un cambio dramático en la visión mundial, así como cambios profundos en el conocimiento y en las ideas sobre cómo adquirirlo y ponerlo a prueba. Descartes propuso una distinción tajante entre lo que llamó res cogitans, el dominio de la mente, y res extensa, el dominio de la materia. Aunque Descartes admitió alguna interacción entre los dos, insistió en que los fenómenos mentales y físicos no pueden entenderse haciendo referencia el uno al otro. Los problemas que ocurren en cualquiera de los dominios requieren enfoques completamente separados y conceptos diferentes para resolverlos. La noción de mundos mental y material separados llegó a ser conocida como dualismo, y sus implicancias pueden reconocerse en mucho de lo que pensamos y hacemos hoy en día. De acuerdo con Descartes, la mente está fuera del orden físico de la materia y de ninguna forma se deriva de él. La elección entre error humano y falla mecánica es una elección dualista de ese tipo:
De acuerdo con la lógica cartesiana, el error humano no puede derivar de cosas materiales. Como veremos, esta lógica no se sustenta bien; de hecho, si se mira más de cerca, todo el campo de los factores humanos está construido sobre el quiebre de esta afirmación.
Separar el cuerpo del alma, y subordinar el cuerpo al alma, no sólo mantuvo a Descartes fuera de problemas con la Iglesia. Su dualismo, su división entre mente y materia, introdujo un importante problema filosófico con el potencial de frenar el progreso científico, tecnológico y social: ¿cuál es el nexo entre mente y materia, entre el alma y el mundo material? ¿Cómo podríamos los humanos tomar el control y rehacer un mundo físico que estuviera indivisiblemente aleado con —o que incluso fuera sinónimo de— un alma irreductible y eterna? Una de las mayores aspiraciones durante la Revolución Científica de los siglos XVI y XVII fue ver y comprender (y llegar a tener la capacidad de manipular) el mundo material como una máquina controlable, predecible, programable. Esto requería que fuera visto como nada más que una máquina: sin vida, sin espíritu, sin alma, sin eternidad, sin inmaterialidad, sin impredictibilidad. La res extensa de Descartes, o mundo material, respondía justamente a esa inquietud. La res extensa fue descrita como algo que funciona como una máquina, que sigue reglas mecánicas y que admite explicaciones en términos de la disposición y el movimiento de sus partes constituyentes. El progreso científico se hizo más fácil a causa de lo que excluyó. Lo que la Revolución Científica requería, lo proveyó la escisión de Descartes. La naturaleza se volvió una máquina perfecta, gobernada por leyes matemáticas que quedaban cada vez más al alcance del entendimiento y el control humanos, y lejos de las cosas que los seres humanos no pueden controlar.
Newton, por supuesto, es el padre de muchas de las leyes que aún gobiernan nuestro entendimiento del universo hoy en día. Su tercera ley del movimiento, por ejemplo, está en la base de nuestras presunciones sobre causa y efecto, y sobre las causas de los accidentes: a cada acción corresponde una reacción igual y opuesta. En otras palabras, para cada causa existe un efecto equivalente o, más bien, para cada efecto tiene que haber una causa equivalente. Una ley así, aunque aplicable a la liberación y transferencia de energía en sistemas mecánicos, queda fuera de foco al aplicarse a las fallas sociotécnicas, en las que las pequeñas banalidades y sutilezas del trabajo normal hecho por gente normal en organizaciones normales pueden degenerar lentamente en desastres enormes, en liberaciones de energía desproporcionadamente altas. La equivalencia causa-consecuencia dictada por la tercera ley del movimiento de Newton es bastante inapropiada como modelo de los accidentes organizacionales.
Adquirir control sobre el mundo material era de crítica importancia para las personas hace quinientos años. El terreno fértil que inspiró las ideas de Descartes y Newton puede entenderse en el contexto de su tiempo. Europa estaba emergiendo de la Edad Media, tiempos de temor y fe, donde las vidas eran segadas tempranamente por guerras, enfermedades y epidemias. No deberíamos subestimar la ansiedad y la aprensión acerca de la capacidad humana de oponer sus esfuerzos a estas fuerzas míticas. Luego de la Peste, a la población de la Inglaterra natal de Newton, por ejemplo, le tomó hasta 1650 recuperar el nivel de 1300. La gente estaba a merced de fuerzas apenas controlables y poco comprendidas, como las enfermedades. En el milenio precedente, la piedad, la oración y la penitencia estaban entre los principales mecanismos mediante los cuales la gente podía alcanzar alguna clase de dominio sobre el mal y el desastre.
El crecimiento del conocimiento producido por la Revolución Científica lentamente comenzó a ofrecer una alternativa, con un éxito mensurable empíricamente.
La Revolución Científica entregó nuevos medios para controlar el mundo natural.
Los telescopios y los microscopios dieron a la gente nuevas formas de estudiar componentes que hasta entonces habían sido demasiado pequeños, o habían estado demasiado distantes, para verse a simple vista, abriendo de pronto una visión completamente nueva del universo y revelando, por primera vez, causas de fenómenos hasta entonces mal comprendidos. La naturaleza dejó de ser un monolito atemorizante e inexpugnable, y las personas dejaron de estar simplemente a merced de sus caprichos victimizadores. Al estudiarla de nuevas formas, con nuevos instrumentos, la naturaleza podía ser descompuesta, partida en trozos más pequeños, medida y, a través de todo eso, comprendida mejor y eventualmente controlada.
Los avances en las matemáticas (geometría, álgebra, cálculo) generaron modelos que podían dar cuenta de fenómenos recién descubiertos en, por ejemplo, la medicina y la astronomía, y predecirlos. Al descubrir algunos de los cimientos del universo y de la vida, y al desarrollar matemáticas que imitaban su funcionamiento, la Revolución Científica reintrodujo un sentido de predictibilidad y control que había permanecido dormido durante la Edad Media. Los seres humanos podían alcanzar el dominio y la preeminencia sobre las vicisitudes e imprevisibilidades de la naturaleza. La ruta hacia tal progreso vendría de medir, desarmar (lo que hoy se conoce variadamente como reducir, descomponer o deconstruir) y modelar matemáticamente el mundo a nuestro alrededor, para luego reconstruirlo en nuestros términos.
La mensurabilidad y el control son temas que animaron la Revolución Científica y que resuenan fuertemente hoy en día. Incluso las nociones de dualismo (los mundos material y mental están separados) y de deconstrucción (los todos pueden ser explicados por la disposición y la interacción de sus partes constituyentes de nivel más bajo) han sobrevivido largamente a sus iniciadores. La influencia de Descartes se juzga tan grande en parte porque escribió en su lengua nativa, más que en latín, lo que presumiblemente amplió el acceso y la exposición popular a sus ideas. La mecanización de la naturaleza propagada por su dualismo, y los enormes avances matemáticos de Newton y otros, condujeron a siglos de progreso científico sin precedentes, crecimiento económico y éxito de la ingeniería. Como señalara Fritjof Capra (1982), la NASA no habría podido poner un hombre en la Luna sin René Descartes.
La herencia, sin embargo, es definitivamente una bendición a medias. Los factores humanos y la seguridad de sistemas están atrapados en un lenguaje, en metáforas e imágenes, que enfatizan la estructura, los componentes, las mecánicas, las partes y las interacciones, la causa y el efecto. Si bien éstos nos dan una orientación inicial para construir sistemas seguros y para entender qué salió mal cuando resultan no serlo, hay límites a la utilidad de este vocabulario heredado. Regresemos a ese día de verano de 1999 y a la salida de pista del MD-80. En buena tradición newtoniano-cartesiana, podemos comenzar abriendo un poco más el avión, separando los diversos componentes y procedimientos para ver cómo interactúan, segundo a segundo. Inicialmente nos encontraremos con un éxito empírico rotundo – como de hecho les ocurrió a menudo a Descartes y Newton. Pero cuando queremos recrear el todo a partir de las partes que encontramos, salta a la vista una realidad más problemática: ya no todo calza. La exacta, matemáticamente placentera separación entre causa humana y mecánica, entre lo social y lo estructural, se derrumba. El todo ya no se ve como una función lineal de la suma de sus partes. Como explicara Scott Snook (2000), los dos pasos clásicos occidentales de reducción analítica (el todo en partes) y síntesis inductiva (las partes de vuelta en el todo) parecen funcionar, pero el simple hecho de juntar las partes que encontramos no captura la rica complejidad oculta dentro y alrededor del incidente.
Lo que se necesita es una integración orgánica, holística. Tal vez sea necesaria una nueva forma de análisis y
síntesis, sensible a la situación total de la actividad sociotécnica organizada. Pero primero examinemos la
historia analítica, componencial.
SPOILERS, PROCEDIMIENTOS Y SISTEMAS HIDRÁULICOS
Los spoilers son esos flaps que se levantan al flujo de aire en la parte superior de las alas, luego que la
aeronave ha tocado tierra. No solo contribuyen a frenar la aeronave al obstruir la corriente de aire, sino que
además, causan que el ala pierda la capacidad de crear sustentación, forzando el peso de la aeronave en las
ruedas. La extensión de los ground spoilers acciona además el sistema de frenado automático en las ruedas.
Mientras más peso llevan las ruedas, más efectivo se vuelve su frenado. Antes de aterrizar, los pilotos
seleccionan el ajuste que desean en el sistema de frenado de ruedas automático (mínimo, medio o máximo),
dependiendo del largo y condiciones de la pista.
Luego del aterrizaje, el sistema automático de frenado de ruedas disminuirá la velocidad de la aeronave sin
que el piloto tenga que hacer algo, y sin dejar que las ruedas deslicen o pierdan tracción. Como tercer
mecanismo para disminuir la velocidad, la mayoría de los aviones jet tiene reversores de impulso, que
direccionan el flujo saliente de los motores jet en contra de la corriente de aire, en vez de hacerlo salir hacia
atrás.
En este caso, no salieron los spoilers, y como consecuencia, no se accionó el sistema de frenado automático
de ruedas. Al correr por la pista, los pilotos verificaron el ajuste del sistema de frenado automático en múltiples
oportunidades, para asegurarse de que se encontraba armado, e incluso cambiaron su ajuste a máximo al ver
acercarse el final de la pista. Pero nunca se enganchó. El único mecanismo remanente para disminuir la
velocidad de la aeronave era el empuje reverso. Los reversores, sin embargo, son más efectivos a altas
velocidades. Para el momento en que los pilotos se percataron que no iban a lograrlo antes del final de la
pista, la velocidad era ya bastante baja (ellos terminaron saliendo al campo a 10-20 nudos) y los reversores
no tenían entonces un efecto inmediato. A medida que el jet salía por el borde de la pista, el capitán cerraba
los reversores y desplazaba la aeronave algo a la derecha para evitar obstáculos.
¿Cómo se arman los spoilers? En el pedestal central, entre los dos pilotos, hay una cantidad de palancas.
Algunas son para los motores y reversores de impulso, una es para los flaps, y una para los spoilers. Para
armar los ground spoilers, uno de los pilotos debe elevar la palanca. La palanca sube aproximadamente una
pulgada y permanece allí, armada hasta el toque a tierra. Cuando el sistema detecta que la aeronave está en
tierra (lo que hace en parte mediante switches en el tren de aterrizaje), la palanca regresa automáticamente y
los spoilers salen. Asaf Degani, quien estudió tales problemas procedimentales en forma extensa, ha calificado
el episodio del spoiler no como uno de error humano, sino como uno de temporización (timing) (p. ej., Degani, Heymann
& Shafto, 1999). En esta aeronave, como en muchas otras, los spoilers no deben ser armados antes que se
haya seleccionado el tren de aterrizaje abajo y se encuentre completamente en posición. Esto tiene que ver
con los switches que pueden indicar cuando la aeronave se encuentra en tierra. Estos son switches que se
comprimen a medida que el peso de la aeronave se asienta en las ruedas, pero no sólo en esas
circunstancias. Existe un riesgo en este tipo de aeronave, que el switch en el tren de nariz se comprima
incluso mientras el tren de aterrizaje está saliendo de su asentamiento. Ello puede ocurrir debido a que el tren
de nariz se despliega en la corriente de aire de impacto.
A medida que el tren de aterrizaje está saliendo y la aeronave se desliza en el aire a 180 nudos, la pura fuerza
del viento puede comprimir el tren de nariz, activar el switch y, con ello, arriesgar la extensión de los ground
spoilers (si se encontrasen armados). No es una buena idea: La aeronave tendría problemas para volar
con los ground spoilers fuera. De ahí el requerimiento: El tren de aterrizaje necesita haber completado todo
su recorrido hacia fuera, apuntando abajo. Sólo cuando ya no exista riesgo de compresión aerodinámica del switch,
los spoilers pueden ser armados. Este es el orden de los procedimientos before-landing:
Gear down and locked.
Spoilers armed.
Flaps FULL.
En una aproximación típica, los pilotos seleccionan abajo la palanca del tren de aterrizaje cuando el llamado
glide slope se encuentra vivo: cuando la aeronave ha entrado en el rango de la señal electrónica que la guiará
hacia abajo a la pista. Una vez que el tren de aterrizaje se encuentra abajo, los spoilers deben ser armados.
Entonces, una vez que la aeronave captura ese glide slope (es decir, está exactamente en la señal
electrónica) y comienza a descender en la aproximación a la pista, los flaps necesitan ser ajustados a FULL
(típicamente 40º). Los flaps son otros aparatos que se extienden desde el ala, cambiando su forma y tamaño.
Ellos permiten a la aeronave volar más lento para un aterrizaje. Esto condiciona los procedimientos al
contexto. Ahora se ve así:
Gear down and locked (cuando el glide slope esté vivo).
Spoilers armed (cuando el tren esté abajo y asegurado).
Flaps FULL (cuando el glide slope esté capturado).
¿Pero cuánto toma pasar desde ―glide slope vivo‖ a ―glide slope capturado‖? En una típica aproximación
(dada la velocidad) esto toma alrededor de 15 segundos.
En un simulador, donde tiene lugar el entrenamiento, esto no crea problema. El ciclo completo (desde la
palanca del tren abajo hasta la indicación ―gear down and locked‖ en la cabina), toma alrededor de 10
segundos.
Eso deja 5 segundos para armar los spoilers, antes que la tripulación necesite seleccionar flaps FULL (el ítem
siguiente en los procedimientos). En el simulador, entonces, las cosas se ven como esto:
En t = 0 Gear down and locked (cuando el glide slope esté vivo).
En t + 10 Spoilers armed (cuando el tren esté abajo y asegurado).
En t + 15 Flaps FULL (cuando el glide slope esté capturado).
Pero en una aeronave real, el sistema hidráulico (que, entre otras cosas, extiende el tren de aterrizaje) no es
tan efectivo como en un simulador. El simulador, desde luego, solo simula los sistemas hidráulicos de la
aeronave, modelados según cómo se comporta la aeronave cuando tiene cero horas voladas, cuando está
reluciente de nueva, recién salida de fábrica. En una aeronave más vieja, puede tomarle al tren hasta medio minuto
realizar el ciclo y quedar asegurado. Ello hace que los procedimientos se vean algo así:
En t = 0 Gear down and locked (cuando el glide slope esté vivo).
En t + 30 Spoilers armed (cuando el tren esté abajo y asegurado).
¡Pero! en t + 15 Flaps FULL (cuando el glide slope esté capturado).
En efecto, entonces, el ítem "flaps" de los procedimientos se presenta antes que el ítem "spoilers". Una vez que el
ítem "flaps" está completo y la aeronave desciende hacia la pista, es fácil continuar con los procedimientos
desde allí, con los ítems siguientes. Los spoilers nunca se arman. Su armado ha caído por las grietas de
una distorsión de tiempos.
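A modo de ilustración únicamente, el siguiente bosquejo en Python (hipotético, con los tiempos citados en el texto: captura del glide slope a los ~15 segundos y ciclo del tren de 10 o 30 segundos) muestra cómo la misma secuencia de procedimientos deja el armado de los spoilers sin ejecutar cuando el ciclo del tren excede el momento en que vence el ítem de los flaps:

```python
# Bosquejo mínimo e hipotético (no es el método del libro): compara la secuencia
# before-landing en el simulador (tren abajo y asegurado en ~10 s) con una
# aeronave envejecida (~30 s), usando los tiempos citados en el texto.

def secuencia_before_landing(ciclo_tren_s, captura_glide_slope_s=15):
    """Devuelve los ítems efectivamente completados, con su instante aproximado."""
    t_tren_asegurado = ciclo_tren_s      # "gear down and locked"
    t_flaps = captura_glide_slope_s      # "flaps FULL" al capturar el glide slope
    eventos = []
    if t_tren_asegurado <= t_flaps:
        # Caso simulador: queda tiempo para armar los spoilers antes de los flaps.
        eventos.append((t_tren_asegurado, "Gear down and locked"))
        eventos.append((t_tren_asegurado, "Spoilers armed"))
        eventos.append((t_flaps, "Flaps FULL"))
    else:
        # Caso aeronave envejecida: el ítem "flaps" vence antes de que el tren esté
        # asegurado; la tripulación continúa con los ítems siguientes y el armado
        # de los spoilers cae por la grieta de tiempos.
        eventos.append((t_flaps, "Flaps FULL"))
        eventos.append((t_tren_asegurado, "Gear down and locked"))
    return eventos

for ciclo in (10, 30):
    print(f"Ciclo del tren = {ciclo} s -> {secuencia_before_landing(ciclo)}")
```

Con un ciclo de 10 segundos los tres ítems se completan en orden; con uno de 30 segundos, el ítem de los spoilers simplemente no aparece en la secuencia resultante.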
Una exclusiva declaración de error humano (o quiebre de CRM), se vuelve más difícil de sostener frente a
este trasfondo. ¿Qué tanto error humano hubo, en verdad? Permanezcamos dualistas por ahora y volvamos a
visitar la Ecuación 1. Ahora apliquemos una definición más liberal de falla mecánica. El tren de nariz de la
aeronave real, ajustado con un switch de compresión, está diseñado de forma tal que se pueda desplegar en
el viento mientras se vuela. Esto introduce una vulnerabilidad mecánica sistemática que solamente es
tolerada mediante la temporización de los procedimientos (un mecanismo con agujeros conocidos como defensa contra la falla): primero el
tren, luego los spoilers. En otras palabras, "gear down and locked" es un prerrequisito mecánico para el
armado de los spoilers, pero el ciclo completo del tren puede tomar más tiempo del previsto en los
procedimientos y en las pausas de eventos que dirigen su aplicación. El sistema hidráulico de los viejos jets no
presuriza tan bien: Puede tomar hasta 30 segundos para un tren de aterrizaje realizar el ciclo hacia fuera. El
simulador de vuelo, en contraste, realiza el mismo trabajo dentro de 10 segundos, dejando una sutil pero
sustantiva incongruencia. Una secuencia de trabajo es introducida y practicada durante el entrenamiento, mientras que una
sutilmente diferente es necesaria para las operaciones reales. Más aún, esta aeronave tiene un sistema
que advierte si los spoilers no están armados en el despegue, pero no tiene un sistema para advertir que los
spoilers no están armados en la aproximación. Entonces ahí está el arreglo mecánico en el cockpit. La
palanca de spoiler armado luce diferente de la de spoiler no armado sólo por una pulgada y un pequeño
cuadrado rojo en el fondo. Desde la posición del piloto en el asiento derecho (quien necesita confirmar su
armado), este parche rojo se oscurece detrás de las palancas de potencia mientras estas se encuentran en la
posición típica de aproximación. Con tanta contribución mecánica alrededor (diseño del tren de aterrizaje,
sistema hidráulico erosionado, diferencias entre el simulador y la aeronave real, distribución de las palancas
del cockpit, falta de un sistema de advertencia de los spoilers durante la aproximación, pausas en los
procedimientos) y una contribución de planificación estocástica (dos capitanes en este vuelo), una falla
mecánica de mucho mayor magnitud podría ser añadida a la ecuación para rebalancear la contribución
humana.
Pero eso todavía es dualista. Al reensamblar las partes que encontramos entre procedimientos, pausas,
erosión mecánica, intercambios de diseño, podemos comenzar a preguntar donde realmente terminan las
contribuciones mecánicas, y donde comienzan las contribuciones humanas. La frontera ya no está tan clara.
La carga impuesta por un viento de 180 nudos en la rueda de nariz se transfiere a un débil procedimiento:
primero el tren, luego los spoilers. La rueda de nariz, desplegándose al viento y equipada con un switch de
compresión, es incapaz de acarrear esa carga y garantizar que los spoilers no se extenderán, por lo que en su
lugar, un procedimiento tiene que llevar la carga. La palanca del spoiler está ubicada en una forma que hace
difícil su verificación, y un sistema de advertencia para spoilers no armados no se encuentra instalado.
Nuevamente, el error está suspendido, inestable, entre la intención humana y el hardware de ingeniería –
pertenece a ambos y a ninguno únicamente. Y entonces está esto: El desgaste gradual de un sistema
hidráulico no es algo que haya sido tomado en cuenta durante la certificación del jet. Un MD-80 con un
sistema hidráulico anémico que toma más de medio minuto para llevar todo el tren fuera, abajo y asegurado,
violando el requerimiento de diseño original por un factor de tres, aún se considera aeronavegable.
El sistema hidráulico desgastado no puede ser considerado una falla mecánica. No deja al jet en tierra. Ni
tampoco lo hace la palanca del spoiler de difícil verificación, ni la falta de un sistema de advertencia durante la
aproximación. El jet fue certificado como aeronavegable con o sin todo ello. Que no haya falla mecánica, en
otras palabras, no es porque no existan asuntos mecánicos.
No existe falla mecánica porque los sistemas sociales, compuestos por los fabricantes, reguladores y operadores
prospectivos – informados indudablemente por preocupaciones prácticas y expresados a través de un juicio de
ingeniería situado, con incertidumbre sobre el desgaste futuro – decidieron que ahí no podía haber ninguna (al
menos no relacionada con los asuntos ahora identificados en la corrida de pista de un MD-80). ¿Dónde termina la falla
mecánica y comienza el error humano? Al excavar sólo lo suficientemente profundo, la pregunta se vuelve
imposible de responder.
RES EXTENSA Y RES COGITANS, ANTIGUO Y NUEVO
Separar res extensa de res cogitans, como hizo Descartes, es artificial. No es el resultado de procesos o
condiciones naturales, sino más bien la imposición de una visión de mundo. Esta visión de mundo, inicialmente
responsable de acelerar el progreso científico, está comenzando a estorbar seriamente nuestro
entendimiento. En los accidentes modernos, las causas mecánicas y humanas se difuminan. La
disyunción entre los mundos material y mental, y el requerimiento de describirlos en forma diferente y
separada, están debilitando nuestros esfuerzos para comprender el éxito y la falla sociotécnicos.
La distinción entre las visiones nueva y antigua del error humano, que también fue hecha antes en "Field
Guide to Human Error Investigations" (Dekker, 2002), en realidad atropella esas
sutilezas. Revisemos cómo la investigación del incidente de la corrida sobre pista encontró "quiebres en
CRM‖ como un factor causal. Este es el pensamiento de la visión antigua. Alguien, en este caso un piloto, o
más bien una tripulación de dos pilotos, olvidó armar los spoilers. Este fue un error humano, una omisión. Si
ellos no hubiesen olvidado armar los spoilers, el accidente no habría ocurrido, fin de la historia. Pero tal
análisis de la falla no indaga por debajo de las variables superficiales inmediatamente visibles de una secuencia de
eventos. Como Perrow (1984) señaló, sólo juzga dónde la gente debió hacer "zig" en vez de hacer "zag". La
vieja visión del error humano es sorprendentemente común. En la visión antigua, el error – por cualquiera de
sus nombres (ej.: complacencia, omisión, quiebre de CRM) – es aceptado como una explicación satisfactoria.
Esto es lo que la nueva visión del error humano trata de evitar. Ella ve al error humano como una
consecuencia, como el resultado de fallas y problemas más profundos dentro de los sistemas en que las
personas trabajan. Se resiste a ver al error humano como la causa. Por sobre juzgar a las personas por no
hacer lo que debieron hacer, la nueva visión presenta herramientas para explicar por qué las personas
hicieron lo que hicieron. El error humano se vuelve un punto de partida, no una conclusión. En el caso
―spoiler‖, el error es un resultado de intercambios de diseño, erosión mecánica, vulnerabilidades de los
procedimientos y estocástica operacional. Por cierto, el compromiso de la nueva visión es resistir las
ordenadas y condensadas versiones en las que la elección humana o una parte mecánica fallada, guió a la
estructura completa en el camino a la perdición. La distinción entre las visiones antigua y nueva es importante
y necesaria. Sin embargo, incluso en la nueva visión el error es todavía un efecto, y los efectos son el
lenguaje de Newton. La nueva visión implícitamente concuerda con la existencia, la realidad del error. Se ve al
error como algo que está allá afuera, en el mundo, y causado por algo más, también allá fuera, en el mundo.
Como muestran los capítulos siguientes, tal (ingenua) posición realista es tal vez insostenible.
Volvamos a cómo el universo Newtoniano-Cartesiano consta de ―todos‖ que pueden ser explicados y
controlados al quebrarlos en partes constituyentes y sus interconexiones (por ejemplo, humanos y máquinas,
bordes suaves y bordes agudos, culturas de seguridad y culturas de culpa).
Los sistemas están hechos de componentes, y de lazos de tipo mecánico entre esos componentes. Esto
descansa en la fuente de la elección entre causa humana y causa material (¿es error humano o falla
mecánica?). Es Newtoniano en que se busque una causa para cualquier efecto observado, y Cartesiano en su
dualismo. De hecho, ello expresa tanto el dualismo de Descartes (ya sea mental o material: Usted no puede
mezclar los dos) y la noción de descomposición, donde las propiedades e interacciones de orden bajo
determinan completamente a todos los fenómenos. Ellos son suficientes; usted no necesita más. El analizar
qué bloques constituyentes van hacia el problema, y como ellos se suman, es necesario y suficiente para
comprender por qué ocurren los problemas. La ecuación 1 es un reflejo de la suficiencia explicatoria asumida
por las propiedades de orden bajo. Agregue las contribuciones individuales, y se desenrollará la respuesta a
por qué ocurrió el problema. Una corrida de una aeronave sobre la pista hoy puede ser entendida al partir las
contribuciones en causas humanas y mecánicas, analizando las propiedades e interacciones de cada una de
ellas y entonces reensamblándolas de vuelta en un ―todo‖.
―Error humano‖ aparece como la respuesta.
Si no existen contribuciones materiales, se espera que la contribución humana acarree la carga explicatoria
completa. Mientras se alcance progreso utilizando esta visión de mundo, no hay razón para
cuestionarla. En varios rincones de la ciencia, incluyendo los factores humanos, muchas personas aún no ven
razón para hacerlo. De hecho, no hay razón para que los modelos estructuralistas no puedan ser impuestos
en el desordenado interior de los sistemas sociotécnicos. Que estos sistemas, sin embargo, revelen
propiedades tipo máquinas (componentes e interconexiones, capas y agujeros) cuando los abrimos ―post
mortem‖ no significa que ellos sean máquinas, o que ellos, en vida, hayan crecido y se hayan comportado
como máquinas. Como Leveson (2002) señaló, la reducción analítica asume que la separación de un todo en
partes constituyentes es practicable, que los subsistemas operan independientemente, y que los resultados
del análisis no se distorsionan al separar el todo en partes. Esto, a su vez, implica que los componentes no
están sujetos a loops de retroalimentación ni a otras interacciones no lineales, y que son esencialmente los
mismos al ser examinados en forma aislada o al formar parte del todo. Más aún, esto asume que los
principios que gobiernan el ensamble de los componentes en el todo son directos y sencillos; que las
interacciones entre componentes son lo suficientemente simples como para ser consideradas por separado del
comportamiento del todo. ¿Son válidas estas presunciones cuando tratamos de comprender los accidentes
sistémicos? Los próximos capítulos nos dan que pensar. Tomemos por ejemplo el reto de la deriva hacia la
falla y la naturaleza evasiva de los accidentes que ocurren sobre un nivel de seguridad de 10⁻⁷. Esos
accidentes no ocurren sólo a causa de fallas de sus componentes y, sin embargo, nuestros modelos
mecánicos del funcionamiento organizacional o humano nunca pueden capturar los procesos orgánicos y
relacionales que gradualmente empujan a un sistema sociotécnico hacia el borde del quiebre. El observar las
fallas de los componentes, tales como los "errores humanos" – que es lo que buscan muchos métodos populares de
categorización –, puede ser engañoso respecto de lo que parece decirnos sobre la seguridad y el riesgo en
sistemas complejos. Hay un consenso creciente en que nuestros esfuerzos y modelos vigentes serán
incapaces de romper la asíntota, el estancamiento, de nuestro progreso en la seguridad más allá de 10⁻⁷.
¿Es la visión de sistemas sociotécnicos estructuralista y mecánica, donde vemos componentes y lazos y sus
fallas, apropiada aún para hacer un real progreso?
Capítulo 2
¿Por qué fallan los sistemas seguros?
Los accidentes en realidad no ocurren muy a menudo. La mayoría de los sistemas de transporte en el mundo
desarrollado son seguros, o incluso ultra seguros. La probabilidad de un accidente fatal es menor a 10⁻⁷, lo que
significa una posibilidad de muerte, pérdida seria de propiedad, o devastación económica o medioambiental
de uno en 10.000.000 (Amalberti, 2001). Al mismo tiempo, esto aparece como una frontera mágica. Ningún
sistema de transporte ha encontrado una forma de ser aún más seguro.
El progreso en la seguridad más allá de 10⁻⁷ es evasivo. Como ha señalado Rene Amalberti, las extensiones
lineales de los esfuerzos de seguridad vigentes (reporte de incidentes, gestión de seguridad y
calidad, verificación de eficiencia, estandarización y procedimentalización, más reglas y regulaciones) parecen
ser de poca utilidad para quebrar la asíntota, incluso si son necesarias para mantener el nivel de seguridad
de 10⁻⁷.
Aún más intrigante, los accidentes que ocurren en esta frontera parecen ser de un tipo difícil de predecir
utilizando la lógica que gobierna el pensamiento de seguridad hasta 10⁻⁷. Es aquí que las limitaciones de un
vocabulario estructuralista se vuelven más aparentes. Los modelos de accidente que se apoyan
ampliamente en fallas, agujeros, violaciones y deficiencias pueden tener dificultades para acomodar
accidentes que parecen surgir de lo que a todos les parece gente normal, realizando trabajo
normal en organizaciones normales. Sin embargo, el misterio es que en las horas, días, o incluso años
previos a un accidente más allá de 10⁻⁷, puede haber poco en materia de fallas reportables o de notorias deficiencias
organizacionales. Los reguladores, así como también el personal interno, típicamente no ven personas
violando reglas, ni tampoco descubren otras fallas que pudieran dar causa para finalizar o reconsiderar
seriamente las operaciones. Si sólo fuera así de simple. Y probablemente lo es, hasta llegar a 10⁻⁷. Pero cuando las fallas,
las fallas serias, ya no están precedidas por fallas serias, la predicción de accidentes se vuelve mucho más difícil.
Y modelarlos con la ayuda de nociones mecánicas, estructuralistas puede ser de poca ayuda.
El mayor riesgo residual en los sistemas sociotécnicos seguros de hoy en día es la deriva hacia la falla. La
deriva hacia la falla es como un movimiento lento e incremental de las operaciones de sistemas hacia el borde
de sus envoltorios de seguridad. Las presiones de escasez y competencia típicamente alimentan esta deriva.
La tecnología incierta y el conocimiento incompleto sobre dónde están realmente los límites hacen que las
personas no detengan la deriva, o que ni siquiera la perciban. El accidente del Alaska Airlines 261 del año 2000 es
altamente instructivo en este sentido. El MD-80 se estrelló en el océano frente a la costa de California luego que se
rompiera el sistema de trim en su cola. En la superficie, el accidente parece ajustarse a la simple categoría
que ha venido a dominar las estadísticas recientes de accidentes: fallas mecánicas como resultado de
mantenimiento pobre: Un componente particular falló a causa de que las personas no lo mantuvieron
adecuadamente. De hecho, hubo una falla catastrófica de un componente particular. Una falla mecánica, en
otras palabras. El quiebre volvió incontrolable a la aeronave inmediatamente, y la envió en picada hacia el
Pacífico. Pero tales accidentes no ocurren sólo porque alguien súbitamente erra o algo súbitamente se
quiebra: Se supone que hay demasiada protección construida contra los efectos de fallas particulares.
¿Qué tal si estas estructuras protectoras contribuyeran en sí mismas a la deriva, de formas inadvertidas,
desapercibidas y difíciles de detectar? ¿Qué tal si la complejidad social organizada que rodea a la operación
tecnológica – todos los comités de mantenimiento, grupos de trabajo, intervenciones regulatorias, aprobaciones
e inputs del fabricante que se supone deben proteger al sistema del quiebre – en realidad hubiera contribuido a
fijar su curso hacia el borde del envoltorio?
Desde "Man-Made Disasters", de Barry Turner (1978), sabemos explícitamente que los accidentes se
incuban en sistemas complejos y bien protegidos. El potencial para un accidente se acumula en el tiempo,
pero esta acumulación, este firme deslizamiento hacia el desastre, generalmente pasa desapercibida para aquellos
en el interior, e incluso para aquellos en el exterior. Así que el Alaska 261 no es sólo una falla mecánica,
incluso si eso es lo que mucha gente pudiera querer ver como resultado final (y causa próxima del
accidente). El Alaska 261 es un poco de tecnología incierta, algo de adaptaciones graduales y algo de
deriva hacia la falla. Es acerca de las influencias mutuas e inseparables entre el mundo mecánico y social, y
deja completamente expuesto lo inadecuado de nuestros modelos vigentes en factores humanos y seguridad
de sistemas.
PERNOS Y JUICIOS DE MANTENIMIENTO.
En el Alaska 261, la deriva hacia el accidente que ocurrió el 2000, había comenzado décadas antes. Se
remonta hasta los primeros vuelos del Douglas DC-9, que precedió al tipo MD-80. Como (casi) todas las
aeronaves, este tipo tiene un estabilizador horizontal (o
plano de cola, una pequeña ala) en la parte posterior, que contribuye a dirigir la sustentación creada por las
alas. Este pequeño plano de cola es el que mantiene arriba la nariz de una aeronave: Sin él, no es posible el
vuelo controlado (ver Fig. 2.1). El plano de cola en sí mismo puede inclinarse hacia arriba o abajo para llevar
la nariz hacia arriba o hacia abajo (y, consecuentemente, hacer que el avión vaya hacia arriba o hacia abajo).
En la mayoría de las aeronaves, el sistema de compensación puede ser dirigido por el piloto automático y por
inputs de la tripulación. El plano de cola está engoznado en la parte posterior, mientras que el extremo
delantero se arquea hacia arriba o hacia abajo (también tiene superficies de control en la parte posterior que
están conectadas a la columna de control en el cockpit, pero ellas no están tratadas aquí).
Fig. 2.1. Ubicación del estabilizador horizontal en un avión tipo MD-80.
La presión en el extremo frontal del estabilizador horizontal arriba o abajo, es ejercida mediante un perno
giratorio y una tuerca.
La estructura completa trabaja un poco como una gata de automóvil utilizada para levantar un vehículo: Por
ejemplo, al cambiar un neumático. Usted gira la manivela y el perno rota, acercando las tuercas de la parte superior,
por así decirlo, y levantando el auto (ver Fig. 2.2).
En el sistema del trim del MD-80, el extremo frontal del estabilizador horizontal está conectado a una tuerca
que dirige un perno vertical hacia arriba y hacia abajo. Un motor de trim eléctrico rota el perno, el que a su vez
sube o baja la tuerca. La tuerca entonces, empuja al estabilizador horizontal completo hacia arriba o hacia
abajo. La lubricación adecuada es crítica para el funcionamiento de un montaje de perno y tuerca. Sin grasa
suficiente, el roce constante destruirá el hilo en la tuerca o bien en el perno (en este caso, el perno está hecho
deliberadamente de un material más duro, dañándose primero el hilo de la tuerca). El hilo, en realidad, lleva la
carga completa que se impone sobre el plano horizontal de cola durante el vuelo. Es una carga de alrededor de 5.000 libras,
similar al peso de una van familiar completa colgando del hilo de un montaje de perno y tuerca. Si el hilo se
desprendiera en un MD-80, la tuerca no podría sujetar los hilos del perno. Las fuerzas aerodinámicas, entonces,
forzarían al plano horizontal de cola (y a la tuerca) hasta su tope, fuera del rango normal, volviendo a la
aeronave incontrolable en el eje de pitch, que es esencialmente lo que le ocurrió al Alaska 261.
Incluso ese tope falló a causa de la presión. Una suerte de tubo de torque corre a través del perno
para generar redundancia (en vez de tener dos pernos, como en el modelo DC-8, que lo precedía).
Pero incluso el tubo de torque falló en el Alaska 261.
FIG. 2.2. Funcionamiento simplificado del mecanismo de trim de la cola de un avión tipo MD-80. El estabilizador
horizontal está engoznado en la parte posterior y conectado al perno mediante la tuerca. El
estabilizador es desplazado arriba y abajo por la rotación del perno.
Desde luego, se suponía que nada de esto debía ocurrir. En el primer lanzamiento de la aeronave, a
mediados de la década de 1960, Douglas recomendó que los operadores lubricaran el ensamble del perno cada
300 a 350 horas de vuelo. Para el uso comercial típico, eso podía significar dejar el avión en tierra para tal
mantenimiento cada unas pocas semanas. Inmediatamente, los sistemas organizacionales y sociotécnicos
alrededor de la operación de la tecnología comenzaron a adaptarse, y fijaron el sistema en su curso hacia la
deriva. Por medio de una variedad de cambios y desarrollos en la guía de mantenimiento para las aeronaves
de series DC-9/MD-80, se extendió el intervalo en la lubricación. Como vemos más tarde, estas extensiones
difícilmente fueron producto de las solas recomendaciones del fabricante, si es que lo fueron. Una red mucho
más compleja y de constante evolución, de comités con representantes de reguladores, fabricantes,
subcontratistas y operadores fue el corazón de un desarrollo fragmentado y discontinuo de niveles de
mantenimiento, documentación y especificaciones.
[Fig. 2.2 – rótulos: rango de recorrido de trim horizontal (exagerado); dirección del vuelo; estabilizador horizontal; elevador; bisagras; tuerca; perno; parada inferior.]
La racionalidad de las decisiones sobre los intervalos de mantenimiento fue producida en forma relativamente
local, en relación con información emergente e incompleta sobre lo que era, con toda su apariencia básica, una
tecnología incierta. Si bien cada decisión fue localmente racional, teniendo sentido para
quienes las tomaron en su tiempo y lugar, el cuadro global se volvió uno de deriva hacia el
desastre, una deriva significativa.
Comenzando con un intervalo de lubricación de 300 horas, el intervalo al momento del accidente del Alaska
261 se había movido hasta 2.550 horas, casi un orden de magnitud mayor. Como es típico en la deriva hacia
la falla, esta distancia no fue salvada de una sola vez. El desliz fue incremental: paso a paso, decisión por
decisión. En 1985, la lubricación del perno pasó a realizarse cada 700 horas, es decir, cada dos de los
llamados B checks de mantenimiento (que ocurrían cada 350 horas de vuelo). En 1987, el intervalo del B check fue
incrementado a 500 horas de vuelo, llevando los intervalos de lubricación a 1.000 horas. En 1988, fueron
eliminados todos los B checks, y las tareas a ser cumplidas se distribuyeron entre los checks A y C. La
lubricación de la estructura del perno pasó a realizarse cada ocho A checks de 125 horas: aún cada 1.000
horas de vuelo. Pero, en 1991, los intervalos de A check fueron extendidos a 150 horas, dejando la lubricación
para cada 1.200 horas. Tres años más tarde, el intervalo del A check fue extendido nuevamente, esta vez
para 200 horas. La lubricación ocurriría ahora cada 1.600 horas de vuelo. En 1996, la tarea de lubricación de
la estructura del perno fue removida del A check y trasladada a una especie de carta de tarea que
especificaba la lubricación cada 8 meses. Ya no estaba acompañada por un límite de horas de vuelo. Para
Alaska Airlines, 8 meses se tradujeron a alrededor de 2.550 horas de vuelo.
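Las cifras anteriores pueden reconstruirse con aritmética simple: cada intervalo efectivo de lubricación resulta del intervalo del check correspondiente multiplicado por la frecuencia con que la tarea quedó asignada a ese check; la conversión de 8 meses a ~2.550 horas corresponde a la utilización citada para Alaska Airlines. El siguiente bosquejo en Python (sólo ilustrativo, con los valores tomados del texto) resume la deriva:

```python
# Bosquejo ilustrativo (valores tomados del texto): el intervalo efectivo de
# lubricación del perno resulta del intervalo del check multiplicado por la
# frecuencia con que se asignó la tarea a ese check.
historial = [
    ("años 60", "recomendación original de Douglas", 300),        # 300-350 h
    ("1985",    "cada dos B checks de 350 h",        2 * 350),    # 700 h
    ("1987",    "B check extendido a 500 h",         2 * 500),    # 1.000 h
    ("1988",    "cada ocho A checks de 125 h",       8 * 125),    # 1.000 h
    ("1991",    "A check extendido a 150 h",         8 * 150),    # 1.200 h
    ("1994",    "A check extendido a 200 h",         8 * 200),    # 1.600 h
    ("1996",    "cada 8 meses (uso de Alaska)",      2550),       # ~2.550 h
]

base = historial[0][2]
for anio, cambio, horas in historial:
    print(f"{anio:>7}: {cambio:<36} {horas:>5} h  (x{horas / base:.1f} sobre la base)")
# El perno recuperado del fondo del océano sugería más de 5.000 horas sin una
# lubricación efectiva: más allá incluso del último intervalo aprobado.
```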
Fig. 2.3. Deriva hacia la falla durante décadas: El intervalo de lubricación del perno se fue
extendiendo gradualmente (casi en un factor de 10), hasta el accidente del Alaska Airlines 261.
Sin embargo, el perno recuperado del fondo oceánico no revelaba ninguna evidencia de haber tenido una
adecuada lubricación en el intervalo previo. Pudo haber habido más de 5.000 horas desde que recibiera una
capa de grasa fresca (ver Fig. 2.3.).
Con tanta lubricación como había sido recomendada originalmente, Douglas pensó que no había razón para
preocuparse por un daño en el hilo. Así que, antes de 1967, el fabricante no proveyó ni recomendó chequeo alguno del
hilo de la estructura del perno. Se suponía que el sistema de trim debía acumular 30.000 horas de vuelo antes
que necesitara reemplazo. Pero la experiencia operacional reveló un cuadro diferente.
[Fig. 2.3 – ejes: horas de vuelo (300, 700, 1.000, 1.200, 1.600, 2.550, 5.000) versus años (1966, 1985, 1988, 1991, 1994, 1996, 2000); marcas: "1 año después de la certificación" e "intervalo posible para el accidente de la aeronave".]
Luego que el DC-9 volara solo un año, Douglas recibió reportes de desgastes en el hilo de la estructura del
perno, significativamente excesivos respecto de lo que había sido previsto. En respuesta, el fabricante
recomendó que los operadores realizaran un ―end-play check‖ en la estructura del perno, en cada C check, o
cada 3.600 horas de vuelo. El end-play check utiliza una fijación limitante que pone presión a la estructura del
perno, simulando la carga aerodinámica durante el vuelo normal. La cantidad de juego entre perno y tuerca,
medida en milésimas de pulgada, puede ser leída entonces por un instrumento. EL juego es una medida
directa de la cantidad de desgaste del hilo.
Desde 1985 en adelante, los ―end-play checks‖ en Alaska estuvieron sujetos al mismo tipo de deriva que los
intervalos de lubricación. En 1985, los end-play checks fueron programados para cada C check por medio:
los C checks consistentemente se realizaban alrededor de las 2.500 horas, bastante antes
de las 3.600 horas de vuelo recomendadas, dejando el avión en tierra innecesariamente.
Al programar un end-play check cada C check por medio, sin embargo, el intervalo fue extendido a 5.000
horas.
En 1988, los intervalos entre C checks fueron extendidos a 13 meses, sin ser acompañados por un límite de
horas de vuelo. Los end-play checks fueron ahora efectuados cada 26 meses, o alrededor de cada 6.400
horas. En 1996, los intervalos entre C checks fueron extendidos nuevamente, esta vez a 15 meses. Esto
estiró las horas de vuelo entre end-play tests para alrededor de 9.550 horas. El último end-play check del
avión del accidente fue realizado en la instalación de mantenimiento de la aerolínea en Oakland, California, en
1997. Para entonces, el juego entre el perno y la tuerca fue encontrado exactamente en el límite de
0,040 pulgadas. Esto introdujo una considerable incertidumbre. Con el juego en el límite permisible, ¿qué
hacer? ¿Liberar la aeronave y realizar los cambios en la próxima oportunidad, o reemplazar las partes ahora?
Las reglas no estaban claras. La llamada AOL 9-48A decía que "las estructuras de pernos podrían
permanecer en servicio en tanto la medida de end-play se mantuviera dentro de las tolerancias (entre 0,003 y
0,040 pulgadas)" (National Transportation Safety Board, o NTSB, 2002, p. 29). Todavía estaba en 0,040
pulgadas, así que técnicamente la aeronave podía permanecer en servicio. ¿O no? ¿Qué tan rápido
se produciría el desgaste del hilo desde allí en adelante? Luego de seis días, numerosos cambios de turno
y otro end-play check más favorable, la aeronave fue liberada. No se reemplazaron partes: Ni siquiera las había
en stock en Oakland. La aeronave "partió a las 03:00 horas local. Hasta ahora, todo bien", señalaba el plan
de traspaso del turno de madrugada (NTSB, 2002, p. 53). Tres años más tarde, el sistema de trim se fracturó y la
aeronave desapareció en el océano no muy lejos. Entre las 2.500 y las 9.550 horas hubo más deriva hacia la
falla (ver Fig. 2.4).
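La regla de tolerancia citada (AOL 9-48A) puede expresarse como un chequeo simple. El siguiente bosquejo en Python (hipotético, sólo con los valores mencionados en el texto) muestra por qué una medición exactamente en 0,040 pulgadas deja la decisión en terreno ambiguo: formalmente dentro de tolerancia, pero sin margen alguno frente al desgaste futuro:

```python
# Bosquejo hipotético de la regla de tolerancia citada (AOL 9-48A): la estructura
# del perno puede permanecer en servicio mientras el juego medido (end-play)
# esté entre 0,003 y 0,040 pulgadas.
TOLERANCIA = (0.003, 0.040)  # pulgadas

def dentro_de_tolerancia(juego_pulgadas, tolerancia=TOLERANCIA):
    minimo, maximo = tolerancia
    return minimo <= juego_pulgadas <= maximo

for medicion in (0.020, 0.040, 0.041):
    margen = TOLERANCIA[1] - medicion
    print(f"end-play = {medicion:.3f} in -> en servicio: "
          f"{dentro_de_tolerancia(medicion)}, margen restante: {margen:.3f} in")
# Con 0,040 in la regla dice "en servicio", pero el margen restante es cero: la
# decisión de liberar la aeronave descansa en un juicio sobre cuán rápido seguirá
# desgastándose el hilo, no en la regla misma.
```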
Nuevamente, cada extensión tenía sentido local, y fue únicamente un incremento menor respecto de la norma
establecida previamente. No se violaron normas, no se quebrantaron leyes. Incluso el regulador estuvo de
acuerdo con los cambios en los intervalos de end-play check. Era gente normal realizando trabajo normal
alrededor de una tecnología notablemente estable y normal.
Las figuras de deriva hacia la falla son fáciles de dibujar en retrospectiva. También son fascinantes de
observar. Sin embargo, las realidades que representan no tuvieron un poder de persuasión similar para aquellos en
el interior del sistema en ese momento. ¿Por qué habría de ser notoria esta paulatina degeneración numérica,
estos intervalos de chequeo y servicio que se iban estirando? Como una indicación, nunca se requirió a los técnicos de
mantenimiento del MD-80 registrar o hacer un seguimiento del end-play en los sistemas de trim que ellos
midieron. Ni siquiera el fabricante había expresado interés en ver estos números o la lenta y paulatina
degeneración que ellos pudieran haber revelado. Si allí hubo deriva, en otras palabras, ninguna
memoria organizacional o institucional podía saberlo.
Los cuadros de deriva lo revelan. Pero no arrojan luz acerca del porqué. De hecho, la más grande
controversia desde Turner (1978) ha sido elucubrar por qué el desliz hacia el desastre, tan fácil de ver y
diagramar en retrospectiva, no lo notan aquellos que se lo autoinfligen. Juzgar que hubo una falta de
previsión, después del hecho, es fácil: Todo lo que se necesita es diagramar los números y percatarse del
desliz hacia el desastre. Parado en medio de las ruinas, es fácil maravillarse de qué tan desorientada o
desinformada debió haber estado la gente. Pero ¿por qué las condiciones conducentes a un accidente nunca
fueron reconocidas ni atendidas por aquellos en el interior del sistema – aquellos cuyo trabajo era que no
ocurrieran tales accidentes? Mirar hacia delante no es mirar hacia atrás. La retrospectiva produce una profunda
revisión de la comprensión que se tenía en el presente: Convierte el otrora vago y poco probable futuro en un
pasado inmediato y cierto.
FIG. 2.4. Más deriva hacia la falla: el intervalo de end-play check (que mide el daño al hilo de la
estructura perno-tuerca) fue estirado desde 3.600 a 9.550 horas de vuelo.
El futuro, dijo David Woods (1993), se ve implausible antes de un accidente ("No, eso no nos va a ocurrir"). Pero luego de un accidente, el pasado
parece increíble (―¡Cómo no pudimos percatarnos que esto nos iba a ocurrir!‖). Lo que ahora parece
extraordinario, fue ordinario una vez. Las decisiones, intercambios, preferencias y prioridades que se ven tan
fuera de lo ordinario e inmorales luego de un accidente, fueron una vez normales y de sentido común para
aquellos que contribuyeron a su incubación.
BANALIDAD, CONFLICTO E INCREMENTALISMO.
La investigación sociológica (ej. Perrow, 1984; Vaughan, 1996; Weick, 1995), así como también el trabajo
presente en factores humanos (Rasmussen & Svedung, 2000) y la investigación en seguridad de sistemas
(Leveson, 2002), han comenzado a dibujar los contornos de respuestas sobre el porqué de la deriva. A pesar de
ser diferentes en trasfondo, pedigrí y muchos detalles sustantivos, estos trabajos convergen en puntos comunes
importantes acerca de la deriva hacia la falla. El primero es que los accidentes, y la deriva que los precede,
están asociados a gente normal realizando trabajo normal en organizaciones normales – no a individuos
malintencionados entregados a la desviación inmoral.
Podemos llamar a esto la tesis de banalidad de los accidentes.
Segundo, la mayoría de los trabajos tienen en su corazón un modelo de conflicto: Las organizaciones que
incorporan trabajo crítico de seguridad esencialmente están tratando de reconciliar metas irreconciliables
(permanecer seguro y permanecer en los negocios). Tercero, la deriva hacia la falla es incremental. Los
accidentes no ocurren súbitamente, ni se encuentran precedidos por decisiones monumentalmente malas o
inmensos pasos lejos de las normas imperantes.
La tesis de la banalidad de los accidentes dice que el potencial para tener un accidente crece como un producto
normal de realizar negocios normales bajo presiones normales de escasez de recursos y competencia.
[Fig. 2.4 – ejes: horas de vuelo (3.600, 5.000, 6.400, 9.550) versus años (1966, 1985, 1988, 1996, 2000); marcas: "1 año después de la certificación" e "intervalo al momento del accidente".]
Ningún sistema es inmune a las presiones de escasez y competencia; bueno, casi ninguno. El único sistema
que alguna vez se ha aproximado a trabajar en un universo de recursos ilimitados fue la NASA, durante los
primeros años del Apollo (un hombre tenía que ser puesto en la Luna, cualquiera fuera el costo). Ahí había
abundancia de dinero y de talento altamente motivado. Pero incluso aquí hubo tecnología incierta, faltas y
fallas que no eran infrecuentes, y las restricciones no tardaron en imponerse con fuerza. Los recursos
humanos y el talento comenzaron a drenarse hacia fuera. De hecho, incluso las empresas no comerciales
conocen la escasez de recursos: Agencias de gobierno como la NASA o reguladores de seguridad pueden
carecer de financiamiento adecuado, personal o capacidad para hacer lo que necesitan hacer. Con respecto
al accidente del Alaska 261, por ejemplo, un nuevo programa regulatorio de inspección, llamado el "Air
Transportation Oversight System (ATOS)", fue puesto en uso en 1998 (2 años antes del accidente). Redujo drásticamente la
cantidad de tiempo que los inspectores tenían para las actividades reales de supervigilancia. Un memo de
1999 de un supervisor administrativo del regulador en Seattle ofrecía una mirada al interior:
No somos capaces de satisfacer apropiadamente las demandas de carga de trabajo. Alaska Airlines ha
expresado continua preocupación por nuestra incapacidad de servirle de manera oportuna. Algunas
aprobaciones del programa han sido demoradas o cumplidas de forma apresurada a la "undécima
hora", y anticipamos que este problema se intensificará con el tiempo. Además, muchas investigaciones
administrativas… han sido demoradas como resultado de recortes de recursos. (Si el regulador) continúa
operando con el limitado número de inspectores de aeronavegabilidad existente… la disminución de la supervigilancia
es inminente y el riesgo de incidentes o accidentes en Alaska Airlines se intensifica. (NTSB,
2002, p. 175)
Adaptándose a la presión de recursos, las aprobaciones se demoraban o se apresuraban, y la supervigilancia
se reducía. Sin embargo, hacer negocios bajo presiones de escasez de recursos es normal: La escasez y la
competencia son parte integral incluso del trabajo de inspección. Pocos reguladores en cualquier
parte podrán alguna vez afirmar que tienen el tiempo adecuado y los recursos de personal para cumplir con
sus mandatos. Sin embargo, el hecho que la presión de recursos es normal, no significa que no existen
consecuencias. Desde luego, la presión encuentra vías de escape. Los supervisores escriben memos, por
ejemplo. Se pelean batallas sobre los recursos. Se hacen intercambios. La presión se expresa a sí misma en
las discusiones políticas sobre recursos y primacía, en preferencias gerenciales para ciertas actividades e
inversiones sobre otras, y en casi todos los intercambios (trade-offs) de ingeniería y de operaciones entre resistencia y costo,
entre eficiencia y diligencia. De hecho, el trabajo exitoso bajo presiones y restricciones de recursos es una
fuente de orgullo profesional: Construir algo que es fuerte y ligero, por ejemplo, marca al experto en ingeniería
aeronáutica.
Concebir y dar a luz un sistema que tenga bajos costos de desarrollo y bajos costos operacionales
(típicamente uno es inverso al otro) es el sueño de la mayoría de los inversionistas y de muchos
administradores. Ser capaz de crear un programa que putativamente permita mejores inspecciones con
menos inspectores puede ganar elogios del servicio civil y oportunidades de ascenso, mientras que los
efectos colaterales negativos del programa se sienten primariamente en una lejana oficina de campo.
Sin embargo, el motor más grande de la deriva se esconde en alguna parte de este conflicto, en esta tensión
entre operar con seguridad y operar del todo, entre construir con seguridad y construir del todo. Esta tensión
entrega la energía detrás del lento y constante desacople de la práctica respecto de normas establecidas previamente
o de restricciones de diseño. Este desacople puede eventualmente convertirse en deriva hacia la falla. A medida
que se pone en uso un sistema, este aprende, y a medida que aprende, se adapta:
La experiencia genera información que permite a las personas afinar su trabajo: El afinamiento compensa los
problemas y peligros del descubrimiento, remueve la redundancia, elimina los gastos innecesarios y expande
las capacidades.
La experiencia a menudo capacita a las personas para operar un sistema sociotécnico por un costo mucho
menor o para obtener salidas mucho mayores que las asumidas por el diseño inicial. (Starbuck & Milliken,
1988, p. 333)
Esta deriva del ajuste fino hacia los márgenes de seguridad operacional es un testimonio de los límites del
vocabulario de seguridad de sistemas estructuralista en boga hoy en día. Pensamos en las culturas de
seguridad como culturas de aprendizaje: culturas que están orientadas hacia el aprendizaje de eventos e
incidentes. Pero las culturas de aprendizaje no son ni únicas (ya que cada sistema abierto en un ambiente
dinámico necesariamente aprende y se adapta), ni tampoco necesariamente positivas: Starbuck y Milliken
vislumbraron cómo una organización puede hacer un uso "seguro" de la seguridad mientras alcanza
ganancias en otras áreas. La deriva hacia la falla no podría ocurrir sin aprendizaje. Siguiendo esta lógica, los
sistemas que son malos en el aprendizaje y malos en la adaptación bien pueden tener una menor tendencia a la
deriva hacia la falla.
Un ingrediente crítico de este aprendizaje es la aparente insensibilidad a la evidencia acumulada que, desde la
posición retrospectiva externa, podría haber mostrado cuán malos eran en realidad los juicios y las decisiones.
Así es, claro, como se ve desde la posición retrospectiva externa: La mirada retrospectiva externa ve una falla de
previsión. Desde el interior, sin embargo, lo anormal es bastante normal, y hacer intercambios en dirección a una
mayor eficiencia no es para nada inusual. Sin embargo, al hacer estas concesiones, hay un desbalance de
retroalimentación. La información sobre si una decisión es costo-efectiva o eficiente puede ser
relativamente fácil de obtener. Una hora de arribo adelantada es calculable y tiene beneficios tangibles e
inmediatos. Cuánto se toma o se tomó prestado de la seguridad para alcanzar esa meta, sin embargo, es
mucho más difícil de cuantificar y comparar. Si estuvo seguido por un aterrizaje seguro, aparentemente debe
haber sido una decisión segura. De manera similar, extender el intervalo de lubricación ahorra
inmediatamente una cantidad mesurable de tiempo y dinero, mientras se toma prestado del futuro
aparentemente libre de problemas de la estructura del perno. Cada éxito empírico consecutivo (la llegada
temprana sigue siendo un aterrizaje seguro; la estructura del perno sigue siendo operacional) parece confirmar
que el ajuste fino está funcionando bien: El sistema puede operar igualmente seguro y, sin embargo, más
eficientemente. Como Weick (1993) señaló, no obstante, la seguridad en esos casos puede no haber sido en
absoluto el resultado de las decisiones que fueron o no fueron tomadas, sino más bien una variación estocástica
subyacente, que pende de una serie de otros factores, muchos de ellos difícilmente dentro del control de
aquellos que realizan el proceso de ajuste fino.
El éxito empírico, en otras palabras, no es prueba de seguridad. El éxito pasado no garantiza la seguridad
futura. El tomar prestado más y más de la seguridad puede estar bien por un tiempo, pero nunca se sabe
cuándo va a golpear. Esto llevó a Langewiesche (1998) a decir que la ley de Murphy estaba equivocada: Todo
lo que puede salir mal normalmente sale bien, y entonces sacamos la conclusión equivocada.
La naturaleza de esta dinámica, este ajuste fino, esta adaptación, es incremental.
Las decisiones organizacionales que son vistas como ―malas decisiones‖ luego del accidente (incluso aquellas
que se veían como ideas perfectamente buenas para entonces), son rara vez pasos grandes, riesgosos o de
gran magnitud. Más bien, existe una sucesión de decisiones crecientemente malas, una larga y continua
progresión de pasos incrementales pequeños, que insospechadamente llevan a una organización hacia el
desastre. Cada paso lejos de la norma original que se encuentra con el éxito empírico (y no con sacrificios obvios de
seguridad) es utilizado como la próxima base desde la cual alejarse sólo ese poquito más, de nuevo. Es
este incrementalismo el que hace tan difícil distinguir entre lo anormal y lo normal. Si la diferencia entre
lo que "debió haberse hecho" (o lo que se hizo exitosamente ayer) y lo que se hace exitosamente hoy es
diminuta, entonces no vale la pena comentar ni reportar esta suave desviación respecto de una norma establecida
previamente. El incrementalismo es normalización continuada: permite la normalización y la racionaliza.
Deriva hacia la falla y reporte de incidentes.
¿No podría el reporte de incidentes revelar una deriva hacia la falla? Este parece ser un rol natural del
reporte de incidentes, pero no es así de simple. La normalización que acompaña la deriva hacia la falla (un
end-play check cada 9.550 horas es "normal", incluso aprobado por el regulador, sin importar que el intervalo
original era de 2.500 horas) desafía severamente la habilidad de quienes están dentro de la organización para
definir incidentes.
¿Qué es un incidente? Antes de 1985, no realizar un end-play check cada 2.500 horas podría haber sido
considerado un incidente y, suponiendo que la organización tuviera un medio para reportarlo, podría incluso
haber sido reportado como tal. Pero hacia 1996, la misma desviación era normal, incluso regulada. Hacia 1996,
la misma falla ya no era un incidente. Y hubo mucho más. ¿Por qué reportar que la lubricación de la
estructura del perno a menudo tiene que ser hecha en la noche, en la oscuridad, fuera del hangar, de pie en el
pequeño canasto de un camión elevador, a gran altura del suelo, incluso en la lluvia? ¿Por qué reportar que
ud., como un mecánico de mantenimiento tenía que torpemente hacerse camino mediante dos pequeños
paneles de acceso que difícilmente dejaban lugar para una mano humana – dejando espacio solo para que
los ojos vieran lo que estaba ocurriendo dentro y qué tenía que ser lubricado – si es algo que se tiene que
hacer todo el tiempo? En mantenimiento, esto es trabajo normal, es el tipo de actividad requerida para obtener
que el trabajo se haga. El mecánico responsable de la última lubricación del avión accidentado le dijo a los
investigadores que él había tenido que usar una linterna de cabeza operada por baterías durante las tareas de
lubricación nocturnas, de forma tal de tener las manos libres y poder ver algo, al menos.
normales, no vale la pena reportarlas. No califican como incidentes. ¿Por qué reportar que los end-play
checks son efectuados con una fijación restrictiva (la única en toda la aerolínea, fabricada en casa, en ningún
caso cerca de las especificaciones del fabricante), si eso es lo que se usa cada vez que se hace un end-play
check?
¿Por qué reportar que los end-play checks, ya sea en la aeronave o en la mesa de trabajo, generan medidas
que varían ampliamente, si eso es lo que ellos hacen todo el tiempo, y si es de lo que a menudo se trata el
trabajo de mantenimiento? Es normal, no es un incidente. Incluso si la aerolínea hubiera tenido una cultura de
reporte, si hubiera tenido una cultura de aprendizaje, si hubiera tenido una cultura justa de forma tal que las
personas se sintieran seguras de enviar sus reportes sin temor a represalias, estos no habrían sido incidentes
que llegaran al sistema. Esta es la tesis de la banalidad de los accidentes. Estos no son incidentes. En
sistemas 10⁻⁷, los incidentes no preceden a los accidentes. Lo hace el trabajo normal. En estos sistemas:
Los accidentes son de diferente naturaleza de aquellos que ocurren en los sistemas seguros: en este caso los
accidentes usualmente ocurren en ausencia de cualquier quiebre serio o incluso de cualquier error serio. Ellos
resultan de una combinación de factores, ninguno de los cuales puede por sí solo causar un accidente, o
incluso un accidente serio; por lo tanto, estas combinaciones permanecen difíciles de detectar y recuperar
utilizando la lógica de análisis de seguridad tradicional. Por la misma razón, reportar se vuelve menos
relevante en la previsión de desastres mayores (Amalberti, 2001, p. 112).
Incluso si fuéramos a dirigir una fuerza analítica mayor sobre nuestras bases de datos de reportes de
incidentes, esto aún podría no rendir ningún valor predictivo para los accidentes más allá de 10⁻⁷,
simplemente debido a que los datos no están allí. Las bases de datos no contienen, en un formato visible, los
ingredientes de los accidentes que ocurren más allá de 10⁻⁷. Aprender de los incidentes para prevenir los
accidentes más allá de 10⁻⁷ bien podría ser imposible. Los incidentes tratan de fallas y errores
independientes, percibidos y perceptibles por las personas en el interior. Pero estos errores y fallas
independientes ya no hacen aparición en los accidentes que ocurren más allá de 10⁻⁷. La falla de ver adecuadamente la
parte a ser lubricada (la parte crítica para la seguridad, el punto único, no redundante), la falla en la
realización adecuada de un end-play check – nada de esto aparece en los reportes de incidentes.
Pero se estima "causal" o "contribuyente" en el reporte del accidente. La etiología de los accidentes en
sistemas 10⁻⁷, entonces, bien puede ser fundamentalmente diferente de la de incidentes, esta vez escondida
en los riesgos residuales de hacer negocios normales bajo presiones normales de escasez y competencia.
Esto significa que la llamada hipótesis de causa común (que sostiene que los accidentes e incidentes
tienen causas comunes y que los incidentes son cualitativamente idénticos a los accidentes excepto por un
pequeño paso) es probablemente errónea en 10⁻⁷ y más allá:
… Los reportes de accidentes como Bhopal, Flixborough, Zeebrugge y Chernobyl demostraron que ellos no habían
sido causados por una coincidencia de fallas independientes y errores humanos. Ellos fueron el efecto de una
migración sistemática del comportamiento organizacional hacia el accidente, bajo la influencia de la presión hacia
la efectividad de costos en un ambiente agresivo y competitivo (Rasmussen y Svedung, 2000, p. 14).
A pesar de lo anterior, los errores y fallas independientes siguen siendo el principal producto de cualquier
investigación de accidentes hoy en día. El reporte de la NTSB de 2002, siguiendo la lógica Newtoniano-
Cartesiana, habla de deficiencias en el programa de mantenimiento de Alaska Airlines, de defectos y
descuidos regulatorios, de responsabilidades no completas, de falencias y fallas y quiebres. Por supuesto, en
retrospectiva ellos bien pueden ser sólo eso. Y encontrar falencias y fallas está bien ya que le da al sistema
algo que reparar. ¿Pero por qué nadie en ese momento vio estas supuestamente tan aparentes fallas y
falencias por lo que (en retrospectiva) eran? Aquí es donde el vocabulario estructuralista de los factores
humanos y seguridad de sistemas tradicionales es más limitado, y limitante.
Los agujeros encontrados en las capas de defensa (el regulador, el fabricante, el operador, la instalación de
mantenimiento y finalmente el técnico), son fáciles de descubrir una vez que los restos están dispersos
delante de los pies de uno. En efecto, una crítica común a los modelos estructuralistas es que ellos son
buenos para identificar deficiencias o fallas latentes, post-mortem. Sin embargo estas deficiencias y fallas no
son vistas como tal, no son fáciles de observar como tal por aquellos en el interior (¡o incluso por aquellos
relativamente en el exterior, como el regulador!) antes que ocurra el accidente. De hecho, los modelos
estructuralistas pueden capturar muy bien las deficiencias que resultan de la deriva: Ellos identifican
acertadamente las fallas latentes o los patógenos residentes en las organizaciones y pueden ubicar agujeros
en las capas de defensa. Pero la formación de las fallas latentes, si así se las quiere llamar, no está modelada. El
proceso de erosión, de desgaste de las normas de seguridad, de deriva hacia los márgenes, no puede ser
capturado adecuadamente por los enfoques estructuralistas, porque estos son inherentemente metáforas de
las formas resultantes, no modelos orientados a los procesos de formación. Los modelos estructuralistas son
estáticos. Aunque los modelos estructuralistas de la década de 1990 son llamados a menudo "modelos de
sistemas" o "modelos sistémicos", están muy lejos de aquello que en realidad se considera
pensamiento de sistemas (por ejemplo, Capra, 1982). La parte de "sistemas" de los modelos estructuralistas
se ha limitado hasta ahora a identificar y entregar un vocabulario para las estructuras superiores
(los bordes suaves) detrás de la producción de errores en el borde agudo.
La parte de "sistemas" de estos modelos es un recordatorio de que existe un contexto, de que no podemos
comprender los errores sin ir al trasfondo organizacional desde el que surgen. Todo esto es necesario, desde
luego, mientras los errores sigan siendo tomados muy a menudo como la conclusión legítima de una
investigación (basta mirar el caso del spoiler con "quiebre en CRM" como causa). Pero recordarle a la gente el
contexto no es sustituto de comenzar a explicar las dinámicas, los procesos incrementales sutiles que
conducen y normalizan el comportamiento eventualmente observado. Esto requiere una perspectiva diferente
para mirar el desordenado interior de las organizaciones, y un lenguaje diferente en el cual expresar las
observaciones. Requiere que los factores humanos y la seguridad de sistemas busquen vías para moverse hacia un
verdadero pensamiento de sistemas, donde los accidentes sean vistos como un atributo emergente de procesos
transaccionales, ecológicos y orgánicos, más que como el mero punto final de una trayectoria a través de agujeros
en barreras de defensa. Los enfoques estructuralistas, y las reparaciones de las cosas hacia las que nos apuntan, no
pueden contribuir mucho a realizar mayor progreso en seguridad:
Debemos ser extremadamente sensibles a las limitaciones de los remedios conocidos. Si bien la buena
gestión y el buen diseño organizacional pueden reducir los accidentes en determinados sistemas, nunca
podrán prevenirlos… Los mecanismos causales en este caso sugieren que las fallas en los sistemas técnicos
podrían ser incluso más difíciles de evitar de lo que los más pesimistas entre nosotros pudieron haber creído.
El efecto de fuerzas sociales no reconocidas e invisibles sobre la información, el conocimiento y – finalmente – la acción, es muy difícil de identificar y controlar (Vaughan, 1996, p. 416).
Sin embargo, el poder explicativo retrospectivo de los modelos estructuralistas los hace los instrumentos preferidos de quienes están a cargo de gestionar la seguridad. De hecho, la idea de una banalidad de los accidentes no siempre ha encontrado aceptación fuera de los círculos académicos. Por un lado, asusta. Hace que el potencial para una falla sea rutinario, o inexorablemente inevitable (Vaughan, 1996). Esto puede hacer
a los modelos de accidente prácticamente inutilizables y administrativamente desmoralizantes.
Si el potencial para una falla está en cualquier parte, en cualquier cosa que hacemos, entonces ¿por qué
tratar de evitarlo? Si un accidente no tiene causas en el sentido tradicional, entonces ¿para qué tratar de
arreglar cualquier cosa? Tales preguntas son de hecho nihilistas, fatalistas. No es sorprendente, entonces,
que la resistencia contra esta posible visión del mundo permanezca al acecho tras tales respuestas, tomando muchas formas. Las preocupaciones pragmáticas están orientadas hacia el control, hacia dar con las partes rotas, los chicos malos, los infractores, los mecánicos incompetentes. ¿Por qué este técnico no realizó la última lubricación del perno del avión del accidente como debió haberlo hecho? Las preocupaciones pragmáticas tratan de encontrar las falencias, identificar las áreas débiles y los puntos problemáticos, y repararlos antes de que causen problemas reales. Pero esos asuntos pragmáticos no encuentran un oído receptivo ni un léxico constructivo en las reflexiones sobre la deriva hacia la falla, porque la deriva hacia la falla es difícil de señalar, ciertamente, desde el interior.
HACIA EL PENSAMIENTO DE LOS SISTEMAS.
Si queremos comprender las fallas más allá de 10⁻⁷, tenemos que dejar de buscar fallas. Ya no son fallas las que crean estas fallas: es trabajo normal. En cierto modo, la banalidad de los accidentes hace que su estudio parezca filosóficamente filisteo. Desplaza el objeto de examen lejos de los lados oscuros del mal gobierno corporativo, de la falta de ética y de la humanidad, y hacia las decisiones ordinarias de personas normales, ordinarias,
bajo la influencia de presiones normales, ordinarias. El estudio de los accidentes es encontrado dramático o
fascinante sólo a causa del resultado potencial, no a causa de los procesos que lo incuban (los que en sí
mismos pueden ser fascinantes, desde luego).
Habiendo estudiado en extenso el desastre del Transbordador Espacial Challenger, Diane Vaughan (1996)
estuvo forzada a concluir que este tipo de accidente no es causado por una serie de fallas componentes,
incluso si el resultado son las fallas componentes. En efecto, junto con otros sociólogos, ella apuntó a un
origen no claro de la equivocación, a errores y quiebres como subproductos normales de los procesos de
trabajo de una organización:
La equivocación, el accidente y el desastre, están organizados socialmente y son producidos
sistemáticamente por estructuras sociales. No hay acciones extraordinarias de individuos que expliquen lo
ocurrido: no hay equivocaciones gerenciales intencionales, no hay violaciones a las reglas, no hay
conspiración. Estas son equivocaciones imbuidas en la banalidad de la vida organizacional y facilitada por los
ambientes de escasez y competencia, tecnología incierta, incrementalismo, patrones de información,
rutinización y estructuras organizacionales, (p. xiv).
Si queremos comprender, y llegar a ser capaces de prevenir, la falla más allá de 10⁻⁷, esto es lo que debemos
mirar. Olvidemos lo hecho mal. Olvidemos las violaciones a las reglas. Olvidemos los errores. La seguridad, y
su carencia, es una propiedad emergente.
Lo que debemos estudiar, en cambio, son los patrones de información, las incertidumbres de operar tecnología compleja, los sistemas sociotécnicos circundantes (siempre en evolución e imperfectos) que hacen posible la operación, la influencia de la escasez y la competencia sobre esos sistemas, y cómo ellos ponen en movimiento un incrementalismo (en sí mismo una expresión del aprendizaje o la adaptación organizacionales bajo esas presiones). Para comprender la seguridad, una organización necesita capturar las dinámicas en la banalidad de su vida organizacional y comenzar a ver cómo el colectivo, de manera emergente, se mueve hacia los límites del desempeño seguro.
Sistemas como Relaciones Dinámicas.
El capturar y describir los procesos mediante los que las organizaciones derivan hacia la falla requiere
pensamiento de sistemas. El pensamiento de sistemas trata de relaciones e integración. Ve a un sistema sociotécnico no como una estructura que consiste en departamentos constituyentes, bordes suaves y bordes agudos, deficiencias y fallas, sino como una red compleja de relaciones y transacciones dinámicas, en evolución. En vez de bloques de construcción, el enfoque de sistemas enfatiza los principios de organización. El entendimiento del todo es bastante distinto del entendimiento de un ensamble de componentes separados. En vez de enlaces mecánicos entre componentes (con causa y efecto), ve transacciones – interacciones simultáneas y mutuamente interdependientes. Tales propiedades emergentes son
destruidas cuando un sistema es disectado y estudiado como un lote de componentes aislados (un
administrador, departamento, regulador, fabricante, operador). Las propiedades emergentes no existen en los
niveles más bajos; incluso ellas no pueden ser descritas significativamente con lenguajes apropiados para
esos niveles inferiores.
Tomemos los procesos extensos y múltiples mediante los que la guía de mantenimiento se produjo para el
DC-9, y luego para la aeronave serie MD-80. Componentes separados (tales como regulador, fabricante,
operador) son difíciles de distinguir, y el comportamiento interesante, la clase de comportamiento que
contribuye a la deriva hacia la falla, surge sólo como resultado de relaciones y transacciones complejas. A
primera vista, la creación de la guía de mantenimiento parece ser un problema resuelto. Usted construye un
producto, usted consigue que el regulador lo certifique como seguro de utilizar, y entonces, usted le dice al
usuario como mantenerlo en orden para que permanezca seguro. Incluso el segundo paso (obtener que se
certifique como seguro) no está por ningún lado cerca de un problema resuelto, y está profundamente
entrelazado con el tercero. Más sobre ello luego: Primero la guía de mantenimiento. El Alaska 261 revela una
gran brecha entre la producción de un sistema y su operación. Indicios de la brecha aparecieron en
observaciones del hilo del perno, cuyo desgaste fue superior al esperado por el fabricante. No mucho después de la
certificación del DC-9, la gente comenzó el trabajo para intentar puentear la brecha. Reuniendo gente
procedente de toda la industria, se instaló un ―Maintenance Guidance Steering Group (MSG)‖, con el fin de
desarrollar documentación guía para el mantenimiento de aviones de transporte grandes (NTSB, 2002),
particularmente el Boeing 747. Utilizando esta experiencia, otro MSG desarrolló un nuevo documento guía, en
1970, llamado MSG-2 (NTSB, 2002), que tenía la intención de presentar un medio para desarrollar un
programa de mantenimiento aceptable para el regulador, el operador y el fabricante.
Las muchas discusiones, negociaciones y colaboraciones interorganizacionales en torno al desarrollo de un ―programa de mantenimiento aceptable‖ mostraron que el cómo mantener una pieza de tecnología compleja, una vez certificada, no era en absoluto un problema resuelto. De hecho, era más bien una propiedad emergente: La tecnología demostró ser menos predecible de lo que se había apreciado en el tablero de dibujo (por ejemplo, las tasas de desgaste del hilo del perno del DC-9 eran más altas que lo previsto), y no fue sino hasta que aterrizó en el piso de la práctica que las deficiencias se volvieron aparentes, si uno sabía dónde mirar.
En 1980, mediante esfuerzos combinados del regulador, grupos de comercio e industria y fabricantes de
ambas aeronaves y motores en tanto en Estados Unidos, como en Europa, se produjo un tercer documento
guía, llamado MSG-3 (NTSB, 2002).
Este documento tenía que clarificar las confusiones previas, por ejemplo, entre mantenimiento ―hard-time‖,
mantenimiento ―on-condition‖, mantenimiento ―condition-monitoring‖ y mantenimiento ―overhaul‖. Las
revisiones al MSG-3 fueron hechas en 1988 y 1993. Los documentos guía MSG y sus revisiones fueron
aceptados por los reguladores, y utilizados por las llamadas ―Maintenance Review Boards (MRB)‖, que
convinieron en desarrollar guías para modelos de aeronaves específicos.
Una Maintenance Review Board, o MRB, no escribe la guía ella misma, sin embargo; esto es hecho por
comités de dirección de la industria, a menudo encabezados por un regulador. Estos comités, a su vez, dirigen varios grupos de trabajo.
Mediante todo esto se obtuvo la producción de documentos llamados de planificación de mantenimiento en la aeronave (OAMP), así como también de tarjetas de trabajo genéricas que delineaban tareas de mantenimiento específicas. Tanto el intervalo de lubricación como el end-play check para los pernos de trim del MD-80 fueron productos en constante cambio de estas redes envolventes de relaciones entre fabricantes, reguladores, grupos de comercio y operadores, quienes trabajaban a partir de una experiencia operacional continuamente renovada y de una base de conocimiento perpetuamente incompleta sobre la aún incierta tecnología (recordemos que los resultados de los end-play checks no eran registrados ni seguidos). Entonces, ¿cuáles
son las reglas? ¿Cuáles deben ser los estándares?
La introducción de una nueva pieza de tecnología es seguida por negociación, por descubrimiento, por la
creación de nuevas relaciones y racionalidades. ―Los sistemas técnicos se convierten en modelos de sí mismos‖, dijo Weingart (1991); ―la observación de su funcionamiento, y especialmente de su malfuncionamiento, a escala real, es requerida como base para el desarrollo técnico posterior‖ (p. 8).
Las reglas y los estándares no existen como señaladores originarios e inequívocos frente a una marea de datos operacionales (y si existen, rápidamente demuestran ser inútiles o anticuados). Más bien, las reglas y los
estándares son los productos constantemente actualizados de los procesos de conciliación, de dar y tomar,
de detección y racionalización de nuevos datos. Como dijo Brian Wynne (1988):
Bajo una imagen pública de comportamiento de seguimiento de reglas y la creencia asociada a que los
accidentes se deben a la desviación de estas reglas claras, los expertos operan con niveles mucho mayores de ambigüedad, y necesitan hacer juicios expertos en situaciones mucho menos claramente estructuradas. El punto clave es que sus juicios normalmente no son del tipo: ¿cómo diseñamos, operamos y mantenemos el sistema de acuerdo con ―las‖ reglas? Las prácticas no siguen a las reglas; más bien, las reglas siguen a las prácticas en evolución (p. 153).
El establecer los diversos equipos, grupos de trabajo y comités, fue una forma de puentear la brecha entre la
construcción y el mantenimiento de un sistema, entre producirlo y operarlo. Pero adaptación puede significar
deriva. Y deriva puede significar quiebre.
MODELANDO SISTEMAS SOCIOTÉCNICOS VIVOS.
¿Qué clase de modelo de seguridad podría capturar tal adaptación, y predecir un colapso eventual? Los
modelos estructuralistas son limitados. Desde luego, podríamos afirmar que la extensión del intervalo de
lubricación y el poco fiable end-play check, fueron deficiencias estructurales. ¿Eran agujeros en las capas de
defensa? Absolutamente. Pero tales metáforas no nos ayudan a buscar el dónde ocurrió el agujero, o por qué.
Hay algo orgánico sobre las MSGs, algo ecológico, que se pierde cuando los modelamos como una barrera
de defensa con un agujero en ella; cuando los vemos como una mera deficiencia, o una falla latente. Cuando
vemos sistemas como internamente plásticos, flexibles, orgánicos, su funcionamiento es controlado por
relaciones dinámicas y adaptación ecológica, más que por estructuras mecánicas rígidas. Además exhiben
auto organización (de año en año, el montaje de los MSGs fue diferente) en respuesta a cambios
medioambientales, y auto trascendencia: la habilidad para elevarse más allá de los límites conocidos y
aprender, desarrollarse e incluso, mejorar.
Lo que se necesita no es otra descripción estructural más del resultado final de la deficiencia organizacional. En vez de ello, lo necesario es una descripción más funcional de procesos vivientes que coevolucionan con respecto a un conjunto de condiciones medioambientales, y que mantienen una relación dinámica y recíproca con esas condiciones (ver Heft, 2001). Tales descripciones necesitan capturar lo que sucede dentro de una organización, con la agrupación de conocimiento y la creación de racionalidad dentro de los grupos de trabajo, una vez que la tecnología se encuentra en servicio. Una descripción funcional podría cubrir la organización orgánica de los grupos y comités de dirección de mantenimiento, cuya estructura, enfoque, definición del problema y entendimiento coevolucionaron con las anomalías emergentes y con el conocimiento creciente sobre una tecnología incierta.
Un modelo sensible a la creación de deficiencias, no sólo a su presencia eventual, da vida a un sistema
sociotécnico. Debe ser un modelo de procesos, no sólo uno de estructura. Al extender una genealogía de
investigación cibernética y de ingeniería de sistemas, Nancy Leveson (2002) propuso que modelos de control
pueden completar parte de esta tarea. Los modelos de control usan las ideas de jerarquías y restricciones
para representar las interacciones emergentes de un sistema complejo. En su conceptualización, un sistema
sociotécnico consta de diferentes niveles, donde cada nivel superordinado impone restricciones en (o controla
lo que está sucediendo en) los niveles subordinados. Los modelos de control están en camino a comenzar a
esquematizar las relaciones dinámicas entre los diferentes niveles dentro de un sistema – un ingrediente
crítico de moverse hacia el verdadero pensamiento de sistemas (donde las relaciones y transacciones
dinámicas son dominantes, no la estructura y los componentes). El comportamiento emergente está asociado
con los límites o restricciones en los grados de libertad de un nivel en particular.
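A modo de ilustración (y sólo como un bosquejo hipotético, no tomado de Leveson, 2002), el siguiente esquema en Python representa dos niveles de un modelo de control: el nivel superior impone una restricción al inferior y depende de la retroalimentación, posiblemente incompleta, que éste le devuelve; todos los nombres y valores son supuestos ilustrativos.

```python
# Bosquejo hipotético: una jerarquía de control con restricciones hacia abajo
# y retroalimentación hacia arriba. Los nombres y valores son ilustrativos.

class NivelControlado:
    """Nivel subordinado: realiza el trabajo bajo una restricción impuesta."""
    def __init__(self, nombre):
        self.nombre = nombre
        self.eventos_reportados = []          # retroalimentación (posiblemente incompleta)

    def operar(self, restriccion_horas):
        # Aquí iría el trabajo real; sólo registramos la restricción vigente.
        return {"nivel": self.nombre, "restriccion": restriccion_horas,
                "eventos": list(self.eventos_reportados)}

class NivelControlador:
    """Nivel superordinado: impone restricciones y mantiene un modelo del proceso."""
    def __init__(self, restriccion_inicial):
        self.restriccion = restriccion_inicial
        self.modelo_del_proceso = "sin novedades"   # modelo interno del controlador

    def actualizar(self, reporte):
        # El modelo interno sólo es tan bueno como la retroalimentación recibida.
        if not reporte["eventos"]:
            self.modelo_del_proceso = "sin novedades"
        else:
            self.modelo_del_proceso = f"{len(reporte['eventos'])} eventos reportados"
        return self.restriccion

regulador = NivelControlador(restriccion_inicial=300)   # p. ej., horas entre tareas
mantenimiento = NivelControlado("instalación de mantenimiento")
reporte = mantenimiento.operar(regulador.restriccion)
regulador.actualizar(reporte)
print(regulador.modelo_del_proceso)   # "sin novedades": ausencia de eventos no implica ausencia de riesgo
```

El punto del bosquejo no es el código en sí, sino la relación: si el canal de retroalimentación es pobre, el modelo interno del controlador se desacopla del proceso que se supone controla.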
La división en niveles jerárquicos es un artefacto analítico necesario para ver cómo el comportamiento del
sistema puede emerger desde esas interacciones y relaciones. En un modelo de control, los niveles
resultantes son desde luego un producto del analista que esquematizó el modelo encima del sistema
sociotécnico. Más que reflejos de alguna realidad exterior, los patrones son construcciones de una mente
humana buscando respuestas a preguntas particulares. Por ejemplo, un MSG en particular probablemente no
vería como está superordinado a algún nivel e imponiendo restricciones en el, o subordinado a algún otro y,
por ende, sujeto a sus restricciones. De hecho, una representación jerárquica unidimensional (con sólo arriba
y abajo a lo largo de una dirección) probablemente sobre simplifique la red dinámica de relaciones
circundantes (y determinando el funcionamiento de) cualquier grupo combinado evolucionante como un MSG.
Pero todos los modelos son simplificaciones, y la analogía de los niveles puede ser de ayuda para un analista
que tiene cuestiones particulares en mente (por ejemplo, ¿por qué éstas personas, en este nivel, o en este
grupo, tomaron esas decisiones, y por qué ellos ven eso como la única forma racional de ir?).
El control entre los niveles en un sistema sociotécnico es difícilmente alguna vez perfecto.
Para efectivamente controlar, cualquier controlador necesita un buen modelo de lo que se supone que tiene
que controlar, y requiere retroalimentación sobre la efectividad de su control. Pero tales modelos internos de
los controladores fácilmente se vuelven inconsistentes, y dejan de ser compatibles, con el sistema a ser
controlado (Leveson, 2002). Tales errores en el modelo que maneja el controlador son especialmente probables con tecnología emergente, incierta (incluyendo pernos de trim) y con los requerimientos de mantenimiento que la rodean. La retroalimentación sobre la efectividad del control es incompleta y también puede ser poco confiable. Una falta de incidentes relacionados con pernos podría entregar la ilusión de que el control del mantenimiento es efectivo y de que los intervalos pueden ser extendidos, mientras que la rareza del riesgo en realidad depende de factores que quedan bastante fuera del alcance del controlador. En este sentido, la imposición de restricciones en los
grados de libertad es mutua entre niveles y no corre sólo hacia el extremo inferior: Si los niveles subordinados generan retroalimentación imperfecta sobre su funcionamiento, entonces los niveles de orden más alto no tienen los recursos adecuados (grados de libertad) para actuar como podría ser necesario. Por ende, el nivel subordinado impone restricciones al nivel superior al no decirle (o no poder decirle) lo que realmente está sucediendo.
Tal dinámica se ha notado en diversos casos de deriva hacia la falla, incluyendo el desastre del
Transbordador Espacial Challenger (ver Feynman, 1988).
Deriva hacia la falla como erosión de restricciones y pérdida eventual de control.
Los bucles de control anidados pueden dar vida a un modelo de un sistema sociotécnico con más facilidad que una alineación de capas de defensa. Para modelar la deriva, ésta debe tener vida.
la deriva en falla como una erosión gradual de la calidad o del cumplimiento de las restricciones de seguridad
en el comportamiento de los niveles subordinados. La deriva resulta ya sea de restricciones perdidas o
inadecuadas sobre lo que sucede en otros niveles. El modelar un accidente como una secuencia de eventos,
en contraste, sólo está modelando en realidad el producto final de tal erosión y pérdida de control. Si la
seguridad es vista como un problema de control, entonces los eventos (tal como los agujeros en las capas de
defensa), son los resultados de problemas de control, no de causas que dirigen un sistema hacia el desastre.
Una secuencia de eventos, en otras palabras, es a lo más el punto de partida para modelar un accidente, no la conclusión analítica. Son los procesos que generan estas debilidades los que necesitan ser modelados.
Un tipo de erosión de control ocurre a causa de que las restricciones a la ingeniería original (ejemplo,
intervalos de 300 horas) se aflojan como respuesta a la acumulación de experiencia operacional. Una
variedad del ―ajuste fino‖ de Starbuck y Milliken (1988), en otras palabras. Esto no significa que la clase de
adaptación ecológica en el control del sistema sea completamente racional, o que tiene sentido incluso desde
una perspectiva global en la evolución general y sobrevivencia eventual del sistema. No es así. Las
adaptaciones suceden, los ajustes se hacen, y las restricciones se aflojan como respuesta a las
preocupaciones locales con horizontes de tiempo limitados. Todo ello se basa en conocimiento incompleto e incierto. A menudo, ni siquiera está claro para quienes están adentro que las restricciones se han aflojado como resultado de sus propias decisiones, o que ello importe. E incluso cuando está claro, las consecuencias pueden ser difíciles de anticipar, y se juzgan como una pequeña pérdida potencial en relación con las ganancias inmediatas.
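Como bosquejo hipotético de este ―ajuste fino‖, el siguiente fragmento en Python muestra cómo una serie de extensiones locales, cada una pequeña y aparentemente razonable, se acumula; el intervalo inicial de 300 horas proviene del caso descrito, pero el factor de extensión y el número de revisiones son supuestos ilustrativos.

```python
# Bosquejo hipotético del "ajuste fino": cada revisión sin incidentes extiende
# un poco la restricción original. Cada paso parece menor; el efecto acumulado no lo es.
# El valor inicial de 300 horas viene del caso; el factor de extensión es ilustrativo.

intervalo = 300.0          # restricción de ingeniería original (horas)
factor_extension = 1.35    # cada decisión local "razonable" extiende el intervalo un 35%

historial = [intervalo]
for revision in range(6):          # seis decisiones locales, separadas en el tiempo
    intervalo *= factor_extension  # ninguna parece dramática por sí sola
    historial.append(round(intervalo))

print(historial)
# p. ej. [300.0, 405, 547, 738, 996, 1345, 1816]: alrededor de seis veces la restricción
# original, sin que ninguna decisión individual haya parecido un gran salto.
```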
Como señaló Leveson (2002), los expertos hacen su mayor esfuerzo por satisfacer las condiciones locales, y en el atareado flujo diario y la complejidad de las actividades, podrían no estar alerta a ningún efecto colateral potencialmente peligroso de esas decisiones. Sólo con el beneficio de la retrospectiva, o de una supervisión omnisciente (que es utópica), esos efectos colaterales pueden ser ligados al riesgo real. Jensen (1996) lo describe así:
No deberíamos esperar que los expertos intervengan, ni tampoco deberíamos creer que ellos siempre sepan lo que están haciendo. A menudo no tienen idea, habiendo quedado ciegos a la situación en que se encuentran envueltos. En realidad, no es inusual que los ingenieros y científicos que trabajan dentro de los sistemas sean tan especializados que hace ya tiempo se dieron por vencidos en tratar de entender el sistema como un todo, con todos sus aspectos técnicos, políticos, financieros y sociales (p. 368).
El ser miembro del sistema, entonces, puede hacer que el pensamiento de sistemas sea poco menos que imposible.
Perrow (1984) planteó este argumento de manera muy persuasiva, y no sólo respecto de los integrantes del sistema. Un incremento en la complejidad del sistema disminuye su transparencia: Elementos diversos interactúan en una variedad de formas mayor, difícil de prever, detectar o incluso comprender. Las influencias provenientes de fuera de la base de conocimiento técnico (esos ―aspectos políticos, financieros y sociales‖ de Jensen, 1996, p. 368) ejercen una presión sutil pero poderosa sobre las decisiones y los intercambios que la gente hace, y restringen lo que se ve como una decisión o un curso de acción racional en ese momento (Vaughan, 1996). Por ende, incluso si los expertos estuvieran bien educados y motivados, una ―advertencia de un evento incomprensible e inimaginable no puede ser vista, ya que no puede ser creída‖ (Perrow, 1984, p. 23).
¿Cómo pueden los expertos y otros encargados de la toma de decisiones, en el interior de los sistemas
organizacionales tomar conciencia de los indicadores disponibles del desempeño de la seguridad de sistema?
El asegurarse de que los expertos y otros encargados de la toma de decisiones estén bien informados es, en sí misma, una búsqueda vacía. Lo que realmente significa estar bien informado en un nivel organizacional complejo es infinitamente negociable, y criterios claros sobre lo que constituye información suficiente son imposibles de obtener.
Como resultado, el efecto de las creencias y premisas en la toma de decisiones y la creación de racionalidad,
pueden ser considerables. Weick (1995, p. 87) señaló que ―ver lo que uno cree y no ver aquello en lo que uno no cree es esencial para la creación de sentido. Las advertencias de lo increíble pasan sin ser oídas‖. Aquello que no puede ser creído no será visto. Esto confirma el pesimismo previo sobre el valor de los sistemas de reporte más allá de 10⁻⁷. Incluso si los eventos y advertencias relevantes terminaran en el sistema de reporte (lo que es dudoso, porque ni siquiera quienes podrían reportarlas las ven como advertencias), es aún más generoso presumir que el análisis experto posterior de tales bases de datos de incidentes lograría traer esas advertencias a la vista.
La diferencia, entonces, entre la perspicacia experta en el momento y la visión retrospectiva (después de un
accidente), es tremenda. Con la visión retrospectiva, los trabajos internos del sistema pueden volverse
lúcidos: Las interacciones y efectos colaterales son puestos a la vista. Y con la visión retrospectiva, la gente
sabe qué buscar, dónde escarbar por la descomposición, las conexiones perdidas. Detonado por el accidente
del Alaska 261, el regulador lanzó una prueba especial en el sistema de control de mantenimiento en Alaska
Airlines. Se encontró que los procedimientos establecidos en la compañía no fueron seguidos, que la
autoridad y responsabilidad no estaban bien definidas, que el control de los sistemas de aplazamiento de
mantenimiento estaba perdido, y que los programas y departamentos de control de calidad y de
aseguramiento de la calidad eran ineficientes. Además, encontró papeleo incompleto del C-check,
discrepancias de las fechas de expiración de vida útil de las partes, una falta de aprobación de ingeniería de
las modificaciones en las tarjetas de trabajo de mantenimiento e inadecuadas calibraciones de herramientas.
Los manuales de mantenimiento no especificaban procedimientos u objetivos para el entrenamiento en el trabajo de los mecánicos, y las posiciones de administración clave (por ejemplo, la de seguridad operacional) no estaban cubiertas o no existían. En realidad, las restricciones impuestas desde otros niveles organizacionales eran inexistentes, disfuncionales o estaban erosionadas.
Pero ver agujeros y deficiencias en retrospectiva no es una explicación de la generación o continuidad de
existencia de esas deficiencias. No contribuye a prevenir o predecir fallas. En vez de ello, los procesos
mediante los cuales tales decisiones se determinan, y mediante cuales los tomadores de decisiones crean su
racionalidad local, son una clave al entendimiento de cómo los sistemas pueden erosionarse en el interior de
un sistema sociotécnico, complejo. ¿Por qué estas cosas tienen sentido a los tomadores de decisiones en ese
momento? ¿Por qué era todo normal, por qué no era digno de reportar, ni siquiera incluso para el regulador
encargado de supervigilar estos procesos?
Las preguntas quedan en el aire. Hay poca evidencia de que la (por lo demás inmensa) investigación de la NTSB haya indagado en tales procesos interorganizacionales, o en cómo éstos produjeron una conceptualización particular del riesgo. El reporte, como otros, es testimonio de la tradición mecanicista, estructuralista, de las investigaciones de accidentes hasta la fecha, aplicada incluso a las incursiones investigativas en el territorio social-organizacional.
La creación de racionalidad local.
La pregunta es, ¿cómo los integrantes realizan numerosos intercambios, pequeños y más grandes, que juntos
contribuyen a la erosión, a la deriva? ¿Cómo es que estas decisiones aparentemente no dañinas, pueden
mover incrementalmente a un sistema hacia el borde del desastre? Como se indicó anteriormente, un aspecto
crítico de esta dinámica es que las personas en roles de toma de decisiones, en el interior de un sistema sociotécnico, pierden de vista o subestiman los efectos colaterales globales de sus decisiones localmente racionales.
Como ejemplo, el MSG-3 MD-80 MRB (si se pierde ahí no se preocupe, a otras personas debe haberle
sucedido también) consideró el cambio en la tarea de lubricación del perno como parte del paquete mayor del
C-check (NTSB, 2002). La junta de revisión no consultó a los ingenieros de diseño del fabricante, ni tampoco
los puso al tanto de la extensión. El documento OAMP inicial del fabricante para la lubricación del DC-9 y el MD-80, que especificaba un intervalo ya extendido de 600 a 900 horas (a partir de la recomendación de 1964 de 300 horas), tampoco fue considerado en el MSG-3. Desde una perspectiva local, con la presión de los
límites y restricciones de tiempo en el conocimiento disponible, la decisión de extender el intervalo sin el input
experto adecuado debe haber tenido sentido. Las personas consultadas en el momento deben haberse
estimado suficiente y adecuadamente expertos para sentirse lo suficientemente cómodos para continuar. La
creación de racionalidad debe haberse visto como satisfactoria. De otra forma, es difícil creer que el MSG-3
pudiera haber procedido como lo hizo. Pero los efectos colaterales eventuales de estas decisiones menores
no fueron previstos. Desde una perspectiva mayor, a la brecha entre producción y operación, entre hacer y
mantener un producto, una vez más se le permitió ensancharse. Una relación que debió haber sido
instrumental para contribuir a puentear la brecha (consultar a los ingenieros de diseño originales que hicieron la aeronave, para informar a quienes la mantenían), una relación desde la historia hacia el (entonces) presente, fue deshecha. Una transacción quedó incompleta. Si la previsión de los efectos colaterales no tuvo sentido para el MSG-3 MD-80 MRB (y esto bien puede haber sido un resultado banal de la pura complejidad y burocracia de su mandato de trabajo), puede tampoco haber tenido sentido para los participantes subsiguientes del sistema sociotécnico. Estas decisiones son sensatas cuando se evalúan contra el criterio de juicio local, dadas las presiones presupuestarias y de tiempo y los incentivos de corto plazo que moldean el comportamiento. Dado el conocimiento, las metas y el foco atencional de los encargados de la toma de decisiones,
normales, del día a día, donde podemos encontrar las semillas de la falla y el éxito organizacionales. Y es en
estos procesos que debemos volcarnos, para encontrar la influencia para realizar mayores progresos en la
seguridad. Como señalaron Rasmussen y Svedung (2000):
Para planificar una estrategia proactiva de administración del riesgo, tenemos que comprender los mecanismos que generan el comportamiento real de los tomadores de decisiones a todo nivel… un enfoque proactivo de administración del riesgo incluye los siguientes análisis:
Un estudio de las actividades normales de los actores que preparan el escenario de los accidentes durante su trabajo normal, junto con un análisis de las características del trabajo que moldean su comportamiento de toma de decisiones.
Un estudio del ambiente de información presente de estos actores y de la estructura del flujo de información, analizados desde un punto de vista de teoría de control (p. 14).
La reconstrucción o el estudio del ―ambiente de información‖, en el cual las decisiones reales son
conformadas, en que se construye la racionalidad local, puede contribuirnos a penetrar en los procesos de
toma de sentido organizacionales. Estos procesos descansan en la raíz del aprendizaje y adaptación
organizacionales, y por ende, en la fuente de la deriva hacia la falla. Los dos accidentes de transbordadores
espaciales (Challenger en 1986 y Columbia en 2003) son altamente instructivos aquí, principalmente porque la Junta de Investigación del Accidente del Columbia (CAIB), así como los análisis posteriores del desastre del Challenger (por ejemplo, Vaughan, 1996), representan desviaciones significativas (y, hasta la fecha, más bien únicas) respecto de las investigaciones estructuralistas típicas de tales accidentes. Estos análisis toman con seriedad los procesos organizacionales normales que hay detrás de la deriva, aplicando e incluso extendiendo un lenguaje que nos ayude a
capturar algo esencial sobre la creación continua de racionalidad local por los encargados de la toma de
decisión organizacionales.
Un aspecto crítico del ambiente de información en que los ingenieros de NASA tomaron decisiones sobre
seguridad y riesgo fue ―bullets (proyectiles)‖. Richard Feynman, quien participó en la Comisión Presidencial
Rogers original, investigando el desastre del Challenger, ya había tronado contra ellos y la forma en que
habían comprimido los juicios de ingeniería en afirmaciones crípticas: ―Entonces aprendimos acerca de los
―bullets‖- pequeños círculos negros frente a frases que se supone que debían sintetizar cosas. Había uno
después de otro de estos pequeños condenados proyectiles en nuestros libros de briefing y en las
diapositivas‖ (Feynman, 1988, p. 127).
De manera inquietante, los ―bullets‖ aparecieron nuevamente como protagonistas en la investigación del accidente del Columbia en 2003. Con la proliferación, desde el Challenger, de software comercial para hacer presentaciones ―bulletized (amunicionadas)‖, los proyectiles proliferaron también. Esto también puede haber sido el resultado de intercambios localmente racionales (aunque tremendamente irreflexivos) para incrementar la eficiencia: Las presentaciones amunicionadas condensaban la información y las conclusiones, y eran más rápidas de manejar que los papers técnicos. Pero los proyectiles llenaron el ambiente de información de los ingenieros y administradores de la NASA, a costa de otros datos y representaciones. Dominaron el discurso técnico hasta el punto de determinar la toma de decisiones sobre la base de lo que se consideraba información suficiente para el asunto entre manos. Las presentaciones amunicionadas fueron esenciales en la creación de racionalidad local, y en empujar esa racionalidad aún más lejos del riesgo real que se gestaba justo debajo.
Edward Tufte (CAIB, 2003) analizó una diapositiva en particular del Columbia, de una presentación dada a la NASA por un contratista en febrero de 2003. El propósito de la diapositiva era ayudar a la NASA a considerar el daño potencial a las celdas de calor creado por los restos que habían caído desde el tanque principal de combustible (las celdas de calor dañadas desencadenaron la destrucción del Columbia en su viaje de regreso a través de la atmósfera terrestre; ver Fig. 2.5). La diapositiva fue utilizada por el Equipo de Evaluación de Restos en su
presentación para la Sala de Evaluación de la Misión. Fue titulada ―Revisión de los datos de prueba indican
conservadurismo para la penetración de la celda‖, sugiriendo, en otras palabras, que el daño hecho al ala no
era tan malo (CAIB, 2003, p. 191). Pero en realidad, el título no hacía ninguna referencia al daño de la celda.
Más bien, se apuntaba a la elección de modelos de prueba utilizados para predecir el daño. Un título más
apropiado, de acuerdo con Tufte, podría haber sido ―Revisión de datos de prueba indican irrelevancia de dos
modelos‖. La razón era que la pieza de restos que golpeó al Columbia se estimaba 640 veces más grande que los datos utilizados para calibrar el modelo en que los ingenieros basaron sus cálculos del daño (un análisis posterior mostró que el objeto era en realidad 400 veces mayor). Así que los modelos de calibración en realidad no fueron de mucha utilidad: Subestimaron enormemente el impacto real del desperdicio. La diapositiva seguía diciendo que se requeriría ―energía significativa‖ para que el desperdicio del estanque principal penetrara la (supuestamente más dura) envoltura de celdas del ala del transbordador, pero que los resultados mostraban que esto era posible con suficiente masa y velocidad, y que, una vez penetradas las celdas, podría causarse un daño significativo. Como observó Tufte, la palabra vagamente cuantitativa ―significativa‖ o ―significativamente‖ fue utilizada cinco veces en una sola diapositiva, pero su significado cubría todo el rango: desde lo apenas detectable en esas pruebas de calibración irrelevantes, pasando por una diferencia de 640 veces, hasta un daño tan grande que todos los que estaban a bordo podían morir.
FIG. 2.5. Ubicación de los impulsores de cohete sólido (Challenger) y tanque de combustible externo (Columbia), en un transbordador espacial.
La misma palabra, la misma indicación en una diapositiva, repetida cinco
veces, acarreando cinco profundamente (sí, significativamente) distintos significados, sin embargo ninguno de
ellos era realmente explícito, a causa del condensado formato de la diapositiva. Similarmente, el daño a las
celdas protectoras de calor fue oscurecido tras una pequeña palabra, ―eso‖, en una frase que se leía como
―Los resultados de las pruebas muestran que eso es posible con suficiente masa y velocidad‖ (CAIB, 2003, p.
191). La diapositiva diluyó material importante, y el carácter de amenaza para la vida de los datos que contenía se perdió tras proyectiles y afirmaciones abreviadas.
Una década y media antes, Feynman (1988) había descubierto una diapositiva similarmente ambigua acerca del
Challenger. En su caso, los proyectiles declaraban que la erosión del sello en las uniones de campo era ―más crítica‖ para la seguridad de vuelo, pero que el ―análisis de los datos existentes indica que es seguro continuar volando con el diseño existente‖ (p. 137). El accidente probó que no era así. Los impulsores de cohete sólido (SRB o SRM) que ayudan al transbordador espacial a salir de la atmósfera terrestre están segmentados, lo que hace más fácil su transporte terrestre y tiene algunas otras ventajas.
Un problema que se descubrió temprano en la operación del transbordador, sin embargo, fue que los cohetes
sólidos no siempre se sellaban apropiadamente en estos segmentos, y que los gases calientes podrían
escaparse a través de los O-rings de goma en el sello, llamado blow-by. Esto eventualmente llevó a la
explosión del Challenger en 1986. La diapositiva previa al accidente examinada por Feynman había declarado que, si bien la falta de un sello secundario en la unión (del motor del cohete sólido) era ―más crítica‖, aún era ―seguro continuar volando‖. Al mismo tiempo, los esfuerzos debían ser ―acelerados‖ para eliminar la
erosión del sello del SRM (1988, p. 137). Durante el Columbia, así como también, en el Challenger, las
diapositivas no fueron utilizadas sólo para apoyar las decisiones técnicas y operativas que derivaron en los
accidentes.
Incluso durante ambas investigaciones post-accidente, las diapositivas con presentaciones amunicionadas,
fueron ofrecidas como sustitutos para los análisis y datos técnicos, ocasionando que la CAIB (2003), similar a
Feynman años antes, alegara que: ―La junta percibe el uso endémico de diapositivas de briefing de
PowerPoint, en vez de papeles técnicos, como una ilustración de los métodos problemáticos de comunicación
técnica en la NASA‖ (p. 191).
La sobreutilización de proyectiles y diapositivas ilustra el problema de los ambientes de información, y cómo su estudio puede ayudarnos a entender algo sobre la creación de racionalidad local. El ambiente de información en que ocurre la toma de decisiones organizacional se configura como un ―nicho epistémico‖ (Hoven, 2001). Lo que esos tomadores de decisiones pueden saber está generado por otras personas, y se distorsiona durante la transmisión a través de un medio reduccionista y abreviador (este nicho epistémico también tiene implicancias para cómo podemos pensar acerca de la culpa, o la culpabilidad de las decisiones y de quienes las toman; ver cap. 10). Lo restringido e incompleto del nicho en que se encuentran los tomadores de decisiones puede resultar inquietante para los observadores retrospectivos, incluyendo a gente dentro y fuera de la organización. Fue después del accidente del Columbia que el Equipo de Administración de la Misión ―admitió que el análisis utilizado para continuar volando era, en una palabra, ―pésimo‖. Esta admisión – que la justificación para volar se aprobaba de manera rutinaria, como un mero trámite – es, por decir lo menos, inquietante‖ (CAIB, 2003, p. 190).
―Inquietante‖ puede ser, y probablemente lo es – en retrospectiva. Pero desde el interior, la gente en las
organizaciones no gasta una vida profesional tomando decisiones ―inquietantes‖. Más bien, ellos realizan
trabajo generalmente normal. Nuevamente, ¿cómo puede un administrador ver un proceso ―pésimo‖ para
evaluar la seguridad de vuelo como normal, y no como algo que es digno de reportar o reparar? ¿Cómo es
posible que sea normal este proceso? La CAIB (2003), por sí misma, encontró claves a las respuestas en las
presiones de escasez y competencia:
Se supone que el proceso de Preparación para el Vuelo debe estar escudado de la influencia externa, y se considera tanto riguroso como sistemático. Sin embargo, el Programa del Transbordador está inevitablemente influenciado por factores externos, incluyendo, en el caso del STS-107, presiones de calendario. Colectivamente, tales factores dan forma a cómo el Programa establece los calendarios de misión y fija las prioridades financieras, las que afectan la supervisión de la seguridad, los niveles de fuerza de trabajo, el mantenimiento de instalaciones y las cargas de trabajo de los contratistas. Finalmente, las expectativas y presiones externas impactan incluso en la recolección de datos, el análisis de tendencias, el desarrollo de información, y el reporte y la disposición de anomalías. Estas realidades contradicen la creencia optimista de la NASA de que las revisiones previas al vuelo entregan salvaguardias reales contra riesgos inaceptables (2003, p. 191).
Quizá no exista tal cosa como la toma de decisiones ―rigurosa y sistemática‖ basada únicamente en la pericia técnica. Las expectativas y las presiones, las prioridades presupuestarias y los calendarios de misión, las cargas de trabajo de los contratistas y los niveles de fuerza de trabajo, todos impactan en la toma de decisiones técnica. Todos estos factores determinan y restringen lo que será visto como cursos de acción posibles y racionales en el momento. Esto tiñe el nicho epistémico en que se encuentran los tomadores de decisiones con matices y patrones bastante más variados que los de los datos técnicos por sí solos. Pero supongamos que algunos tomadores de decisiones pudieran ver a través de todos estos ropajes dentro de sus nichos epistémicos, y alertar a otros para que lo hicieran. Existen historias de tales denunciantes. Pero incluso si la información de un nicho epistémico (el ambiente de información) hubiera podido ser vista y reconocida desde el interior en el momento, ello aún no garantiza cambio o mejora. El nicho, y la forma en que la gente se configura en él, responden a otras preocupaciones y presiones que están activas en la organización – la eficiencia y la velocidad de los procesos de briefing y toma de decisiones, por ejemplo.
El impacto de esta información imperfecta, incluso si es reconocida, se subestima, porque ver los efectos colaterales, o las conexiones con el riesgo real, rápidamente excede las capacidades de cómputo de los tomadores de decisiones organizacionales y de los mecanismos del momento.
Estudiar los ambientes de información – cómo se crean, se sostienen y se racionalizan, y, a su vez, cómo contribuyen a apoyar y racionalizar decisiones complejas y riesgosas – es un camino hacia el entendimiento de la creación de sentido organizacional. Más se dirá de estos procesos de creación de sentido en otras partes de este libro. Es una forma de hacer lo que los sociólogos llaman la conexión macro-micro. ¿Cómo es que esas presiones globales de producción y escasez encuentran su camino hacia los nichos de decisión locales, y cómo ejercen entonces su influencia, a menudo invisible pero poderosa, sobre lo que la gente cree y prefiere, sobre lo que la gente, entonces y allí, ve como racional o como nada fuera de lo común? Aunque la intención era que las evaluaciones de seguridad de vuelo de la NASA estuvieran escudadas de esas presiones externas, esas presiones se filtraron de todos modos incluso hasta la recolección de datos, el análisis de tendencias y el reporte de anomalías.
Los ambientes de información con que contaban, en consecuencia, los tomadores de decisiones fueron continua e insidiosamente teñidos por las presiones de producción y escasez (¿y en qué organización no lo son?), influenciando de manera prerracional la forma en que la gente veía el mundo. Sin embargo, incluso este ―pésimo‖ proceso fue considerado normal – normal o suficientemente inevitable, en cualquier caso, como para no justificar el gasto de energía y capital político en tratar de cambiarlo. El resultado puede ser la deriva hacia la
falla.
INGENIERÍA DE LA RESILIENCIA EN LAS ORGANIZACIONES.
En todos los sistemas abiertos, la deriva está siendo corregida continuamente dentro de sus envolventes de seguridad. Las presiones de escasez y competencia, la falta de transparencia y el tamaño de los sistemas complejos, los patrones de información que rodean a quienes toman decisiones, y la naturaleza incrementalista de sus decisiones a través del tiempo, pueden causar que los sistemas deriven hacia la falla. La deriva es generada por procesos normales de reconciliación de las presiones diferenciales de una organización (eficiencia, utilización de la capacidad, seguridad) contra un trasfondo de tecnología incierta y conocimiento imperfecto.
La deriva trata del incrementalismo que contribuye a eventos extraordinarios, de la transformación de las presiones de escasez y competencia en mandatos organizacionales, y de la normalización de las señales de peligro, de forma tal que las metas organizacionales y los cálculos y decisiones supuestamente normales queden alineados. En los sistemas seguros, los procesos reales que normalmente garantizan la seguridad y generan
el éxito organizacional, pueden también ser responsables de la ruina de la organización. La misma vida
sociotécnica interconectada, compleja, que rodea a la operación de tecnología exitosa, es en gran medida,
responsable por su falla potencial. Porque estos procesos son normales, porque son parte y parcela de la vida
organizacional funcional, normal, son difíciles de identificar y de desarticular. El rol de estas fuerzas invisibles y no reconocidas puede ser atemorizante. Consecuencias dañinas pueden ocurrir en organizaciones construidas
para prevenirlas. Consecuencias dañinas pueden ocurrir incluso cuando todos siguen las reglas (Vaughan,
1996).
La dirección en que la deriva empuja la operación de la tecnología puede ser difícil de detectar, también, o quizá especialmente, para aquellos en su interior. Puede ser incluso más difícil de detener. Dada la diversidad de fuerzas (presiones políticas, financieras y económicas, incertidumbre tecnológica, conocimiento incompleto, procesos de resolución de problemas fragmentados) tanto en el interior como en el exterior, los sistemas sociotécnicos grandes y complejos que operan algunas de nuestras tecnologías más peligrosas hoy parecen capaces de generar una energía oscura propia y de derivar a voluntad, siendo relativamente inmunes a la inspección externa o al control interno.
Recordemos que, en un vuelo normal, se supone que la estructura del perno de un MD-80 debería soportar
una carga de alrededor de 5.000 libras. Pero en realidad, esta carga fue soportada por un sistema débil, poroso y continuamente cambiante de guía deficiente y procedimientos poco prácticos, delegada al nivel operativo, que rutinariamente, pero sin éxito, intentó cerrar la brecha entre producción y operación, entre fabricar y mantener. Cinco mil libras de carga sobre una colección suelta y variable de procedimientos y prácticas fueron abriéndose camino, lenta e incrementalmente, a través de los hilos de la tuerca. Fue el sistema sociotécnico diseñado para apoyar y proteger la tecnología incierta, no la parte mecánica, lo que tenía que llevar esa carga. Y cedió. El reporte del accidente reconoció que eliminar el riesgo de las fallas catastróficas
simples puede no ser siempre posible a través del diseño (ya que el diseño es una reconciliación entre
restricciones irreconciliables). Concluyó que ―cuando no existen alternativas de diseño practicables, es
necesario un proceso de inspección y mantenimiento sistémico comprensivo‖ (NTSB, 2002, p. 180).
La conclusión, en otras palabras, era que un sistema no redundante (el perno único y el tubo de torque) se hiciera redundante a través de un conglomerado organizacional y regulatorio de verificación de mantenimiento y aeronavegabilidad. El reporte se vio forzado a concluir que el último recurso debía ser una contramedida respecto de la cual él mismo ya había gastado 250 páginas demostrando que no funciona.
La deriva hacia la falla plantea un riesgo sustancial para los sistemas seguros. Reconocer y redirigir la deriva es una competencia que tiene por delante cualquier organización en la frontera de 10⁻⁷. Ningún sistema de transporte en uso hoy en día ha atravesado esta barrera, y el éxito en romper la asíntota del progreso en seguridad probablemente no vendrá de extensiones de los enfoques estructuralistas mecanicistas en boga. La seguridad es una propiedad emergente, y su erosión no tiene que ver con la fractura o la falta de calidad de
componentes particulares. Esto hace que combinar la gestión de calidad con la gestión de seguridad sea contraproducente. Muchas organizaciones tienen la gestión de seguridad y la gestión de calidad envueltas en una sola función o departamento. Sin embargo, la gestión de calidad trata de componentes particulares, de ver cómo se alcanzan especificaciones particulares, de remover o reparar componentes defectuosos. La gestión de seguridad tiene poco que ver con componentes particulares. Se necesita un nivel de entendimiento completamente diferente, un vocabulario completamente diferente, para comprender la seguridad, en contraste con la calidad.
La deriva hacia la falla no trata tanto de quiebres o malfuncionamientos de componentes, como de una organización que no se adapta efectivamente para lidiar con la complejidad de su propia estructura y de su medioambiente. La resiliencia organizacional no es una propiedad, es una capacidad: Una capacidad para reconocer los límites de las operaciones seguras, una capacidad para apartarse de ellos de forma controlada, una capacidad para recuperarse de una pérdida de control si ésta ocurre. Esto significa que los factores humanos y la seguridad de sistemas deben encontrar nuevas formas de construir resiliencia dentro de las organizaciones, de equipar a las organizaciones con una capacidad para reconocer una pérdida de control y recuperarse de ella. ¿Cómo puede una organización monitorear sus propias adaptaciones a las presiones de escasez y competencia (y cómo éstas van restringiendo la racionalidad de quienes toman decisiones), cuando puede ser precisamente entonces que tales inversiones más se necesitan? Prevenir la deriva hacia la falla requiere una clase diferente de aprendizaje y monitoreo organizacional. Significa prestar atención a variables de orden superior, sumando un nuevo nivel de inteligencia y análisis al reporte de incidentes y al conteo de errores que se hacen hoy en día.
Más sobre esto será dicho en los capítulos siguientes.
Capítulo 3
¿Por qué son más peligrosos los Doctores que los Propietarios de Armas?
Existen alrededor de 700.000 médicos en Estados Unidos. El Instituto de Medicina de Estados Unidos estima que entre 44.000 y 98.000 personas mueren cada año como resultado de errores médicos (Kohn, Corrigan y Donaldson, 1999). Esto da una tasa anual de muertes accidentales por doctor de entre 0,063 y 0,14. En otras palabras, hasta uno de cada siete doctores matará a un paciente por equivocación cada año. Tomemos, en contraste, a los propietarios de armas. Existen 80.000.000 de propietarios de armas en Estados Unidos. Sin embargo, sus errores producen ―sólo‖ 1.500 muertes accidentales por armas al año. Esto significa que la tasa de muerte accidental causada por error de los propietarios de armas es de 0,000019 por propietario de arma al año. Sólo 1 de cada 53.000 propietarios matará a alguien por error. Los doctores, entonces, son 7.500 veces más propensos a matar a alguien por error. Mientras que no todo el mundo tiene un arma, casi todos tienen un doctor (o muchos doctores) y están, por ende, severamente expuestos al problema del error humano.
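El cálculo detrás de estas cifras puede reproducirse directamente; el siguiente bosquejo en Python usa sólo los números citados en el texto y muestra cómo se obtienen las tasas, la proporción de 1 en 53.000 y el orden de magnitud de las 7.500 veces.

```python
# Reproducción del cálculo del texto; las cifras provienen de los números citados.
medicos = 700_000
muertes_por_error_medico = (44_000, 98_000)     # rango anual estimado
propietarios_armas = 80_000_000
muertes_accidentales_armas = 1_500              # por año

tasa_medicos = [m / medicos for m in muertes_por_error_medico]
tasa_armas = muertes_accidentales_armas / propietarios_armas

print([round(t, 3) for t in tasa_medicos])   # [0.063, 0.14] muertes por doctor al año
print(round(tasa_armas, 6))                  # ~0.000019, es decir, 1 de cada ~53.000
print(round(max(tasa_medicos) / tasa_armas)) # ~7467, el orden de las 7.500 veces del texto
```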
A medida que las organizaciones y otras partes interesadas (por ejemplo, grupos de la industria y el comercio, reguladores) intentan calibrar la ―salud de seguridad‖ de sus operaciones, contar y tabular errores parece ser una medida significativa. No sólo entrega una estimación numérica inmediata de la probabilidad de muerte
accidental, lesiones, o cualquier otro evento indeseable, además permite la comparación de sistemas y
componentes de ello (este hospital vs. ese hospital, esta aerolínea vs. esa, esta flota de aeronaves o pilotos
vs. esa, estas rutas vs. esas rutas). El mantener un seguimiento de los eventos adversos está pensado para
entregar acceso relativamente simple, fácil y certero, a los trabajos de seguridad internos en un sistema. Más
aún, los eventos adversos pueden ser vistos como la partida – o la razón– para probar más a fondo, con el fin
de buscar las amenazas medioambientales o las condiciones desfavorables que pueden ser cambiadas para
prevenir su recurrencia. Desde luego, también está la pura curiosidad científica de tratar de entender
diferentes tipos de eventos adversos, diferentes tipos de errores. Categorizar, después de todo, ha sido fundamental para la ciencia desde los inicios de la época moderna.
Durante las últimas décadas, los factores humanos en el transporte se han empeñado en cuantificar los problemas de seguridad y en encontrar fuentes potenciales de vulnerabilidad y falla. Han engendrado una cantidad de sistemas de clasificación del error.
Algunos clasifican a los errores de decisión junto con las condiciones que han contribuido a su producción,
algunos tienen una meta específica, por ejemplo, categorizar los problemas de transferencia de información
(por ejemplo, instrucciones, errores durante la observación de los briefings de cambio, fallas de coordinación);
otros tratan de dividir las causas del error en factores cognitivos, sociales y situacionales (físicos, medioambientales, ergonómicos); otros más intentan clasificar las causas del error a lo largo de las líneas de un modelo lineal de procesamiento de la información o de un modelo de toma de decisiones, y algunos aplican la metáfora del queso suizo (por ejemplo, los sistemas tienen muchas capas de defensa, pero todas ellas tienen agujeros) para identificar errores y vulnerabilidades a lo largo de la cadena causal.
Los sistemas de clasificación del error se usan ya sea después de un evento (por ejemplo, durante las investigaciones de incidentes) o para observaciones del desempeño humano en curso.
MIENTRAS MÁS MEDIMOS, MENOS SABEMOS.
Al perseguir la categorización y tabulación de los errores, los factores humanos hacen una cantidad de suposiciones y adoptan ciertas posiciones filosóficas. Poco de esto se explica en la descripción de estos métodos y, sin embargo, acarrea consecuencias para la utilidad y calidad del conteo de errores como medida de la salud de seguridad y como herramienta para dirigir los recursos de mejoramiento. Aquí
hay un ejemplo. En uno de los métodos, se pide al observador que distinga entre ―errores de procedimiento‖ y ―errores de pericia‖ (proficiency errors). Los errores de pericia están relacionados con una falta de habilidades, experiencia o práctica (reciente), mientras que los errores de procedimiento son aquellos que ocurren al llevar a cabo secuencias de acción prescritas o normadas (por ejemplo, listas de verificación). Esto parece directo y sencillo. Sin embargo, como reportó Croft (2001), el siguiente problema confronta al observador: un mismo tipo de error (un piloto que ingresa una altitud errada en el computador de vuelo) puede terminar legítimamente en cualquiera de las dos categorías del método de conteo de errores (un error de procedimiento o un error de pericia). ―Por ejemplo, ingresar la altitud errada en el sistema de administración de vuelo (FMS) es considerado un error de procedimiento… No saber cómo utilizar ciertos equipos automatizados en el computador de vuelo de una aeronave es considerado un error de pericia‖ (p. 77).
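Un bosquejo hipotético (no parte del método de conteo real) ayuda a ver el problema que describe Croft: un mismo evento observado satisface la definición de ambas categorías, de modo que la casilla en que termina depende del observador y no del evento.

```python
# Bosquejo hipotético del problema que describe Croft (2001): un mismo evento
# observado satisface la definición de más de una categoría, de modo que el
# conteo resultante depende de la elección del observador y no del evento.

evento = {
    "descripcion": "altitud errada ingresada en el FMS",
    "ocurrio_en_secuencia_prescrita": True,   # encaja como error de procedimiento
    "refleja_falta_de_practica": True,        # encaja también como error de pericia
}

def categorias_posibles(e):
    categorias = []
    if e["ocurrio_en_secuencia_prescrita"]:
        categorias.append("error de procedimiento")
    if e["refleja_falta_de_practica"]:
        categorias.append("error de pericia")
    return categorias

print(categorias_posibles(evento))
# ['error de procedimiento', 'error de pericia']: dos observadores pueden tabular
# el mismo evento en casillas distintas, y ambos estarían "en lo correcto".
```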
Si un piloto ingresa la altitud errónea en el FMS, ¿es un asunto de procedimiento, de pericia, o de ambos?
¿Cómo debe ser categorizado? Thomas Kuhn (1962) instó a la ciencia a girar hacia la filosofía creativa cuando se enfrentara a problemas persistentes al relacionar la teoría con las observaciones (como en el problema de categorizar una observación en clases teóricas). Puede ser una manera efectiva de elucidar y, si es necesario, debilitar el control de una tradición sobre la mente colectiva, y de sugerir las bases para una nueva. Esto es ciertamente apropiado cuando surgen preguntas epistemológicas: preguntas sobre cómo llegamos a saber lo que (creemos que) sabemos. Para entender la clasificación del error y algunos de sus problemas asociados, debemos embarcarnos en un breve análisis de la tradición filosófica contemporánea que gobierna la investigación de factores humanos y de la visión de mundo en que tiene lugar.
Realismo: Los errores existen: Los puedes descubrir con un buen método.
La posición que los factores humanos adoptan cuando utilizan herramientas de observación para medir ―errores‖ es realista: Presume que hay un mundo real, objetivo, con patrones verificables que pueden ser observados, categorizados y predichos. En este sentido, los errores son una clase de hecho durkheimiano. Emile Durkheim, un padre fundador de la sociología, creía que la realidad social está objetivamente ―allá afuera‖, disponible para un escrutinio empírico imparcial, neutral. La realidad existe; vale la pena luchar por la verdad. Por supuesto, existen obstáculos para llegar a la verdad, y la realidad puede ser difícil de precisar. Sin embargo, perseguir un mapeo cercano o una correspondencia con esa realidad es una meta válida y legítima del desarrollo teórico. Es esa meta, la de lograr un mapeo cercano a la realidad, la que gobierna los métodos de conteo de errores.
If there are difficulties in getting that correspondence, then these difficulties are merely methodological in
nature. The difficulties call for refinement of the observational instruments or additional
training of the observers.
These presumptions are modernist; inherited from the enlightened ideas of the Scientific Revolution. In the
finest of scientific spirits, method is called on to direct the searchlight across empirical reality, more method
is called on to correct ambiguities in the observations, and even more method is called on to break open new
portions of hitherto unexplored empirical reality, or to bring into focus those portions that so far were vague
and elusive. Other labels that fit such an approach to empirical reality could include positivism, which holds
that the only type of knowledge worth bothering with is that which is based directly on experience. Positivism is
associated with the doctrine of Auguste Comte: The highest, purest (and perhaps only true) form of knowledge
is a simple description of sensory phenomena.
In other words, if an observer sees an error, then there was an error. For example, the pilot failed to arm the
spoilers. This error can then be written up and categorized as such.
But positivism has acquired a negative connotation, really meaning "bad," when it comes to social science
research. Instead, a neutral way of describing the position of error-counting methods is realist, if naively so.
Operating from a realist stance, researchers are concerned with validity (a measure of that correspondence they
seek) and reliability. If there is a reality that can be captured and described objectively by outside observers,
then it is also possible to generate converging evidence with multiple observers, and consequently achieve
agreement about the nature of that reality. This means reliability: Reliable contact has been made with
empirical reality, generating equal access and returns across observations and observers. Error counting
methods rely on this too: It is possible to tabulate errors from different observers and different observations
(e.g., different flights or airlines) and build a common database that can be used as some kind of aggregate
norm against which new and existing entrants can be measured.
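A minimal sketch of that tabulation logic is shown below, using invented audit numbers; the point is only that the arithmetic presupposes that an "error" counted on one flight is the same kind of thing as an "error" counted on another.

```python
# Minimal sketch, with invented numbers, of the kind of tabulation the realist
# position licenses: error counts from many observed flights are pooled into an
# aggregate "norm," and a new entrant is then measured against that norm.

observed_flights = {  # hypothetical audit data: flight id -> errors counted
    "AB101": 3, "AB102": 5, "CD201": 2, "CD202": 4, "EF301": 6,
}

aggregate_norm = sum(observed_flights.values()) / len(observed_flights)

def compare_to_norm(errors_counted, norm=aggregate_norm):
    """Label a flight relative to the pooled norm (the comparison the method assumes is meaningful)."""
    return "above norm" if errors_counted > norm else "at or below norm"

print(f"aggregate norm: {aggregate_norm:.1f} errors per flight")
print("new entrant GH401 with 7 counted errors:", compare_to_norm(7))
```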
But absolute objectivity is impossible to obtain. The world is too messy for that, phenomena that occur in the
empirical world too confounded, and methods forever imperfect. It comes as no surprise, then, that error-
counting methods have different definitions, and different levels of definitions, for error, because error itself is a
messy and confounded phenomenon:
• Error as the cause of failure, for example, the pilot's failure to arm the spoilers led to the runway overrun.
• Error as the failure itself: Classifications rely on this definition when categorizing the kinds of observable
errors operators can make (e.g., decision errors, perceptual errors, skill-based errors; Shappell & Wiegmann,
2001) and probing for the causes of this failure in processing or performance.
According to Helmreich (2000), "Errors result from physiological and psychological limitations of humans.
Causes of error include fatigue, workload, and fear, as well as cognitive overload, poor interpersonal
communications, imperfect information processing, and flawed decision making"
(p. 781).
• Error as a process, or, more specifically, as a departure from some kind of standard: This standard may
consist of operating procedures. Violations, whether exceptional or routine (Shappell & Wiegmann), or
intentional or unintentional (Helmreich), are one example of error according to the process definition.
Depending on what they use as standard, observers of course come to different conclusions about what is an
error.
Not differentiating among these different possible definitions of error is a well-known problem. Is error a cause,
or is it a consequence? To the error-counting methods, such causal confounds and messiness are neither
really surprising nor really problematic. Truth, after all, can be elusive. What matters is getting the method
right. More method may solve problems of method. That is, of course, if these really are problems of method.
The modernist would say "yes." "Yes" would be the stock answer from the Scientific Revolution onward.
Methodological wrestling with empirical reality, where empirical reality plays hard to catch and
proves pretty good at the game, is just that: methodological. Find a better method, and the problems
go away. Empirical reality will swim into view, unadulterated.
Did You Really See the Error Happen?
The postmodernist would argue something different. A single, stable reality that can be approached by the best
of methods, and described in terms of correspondence with that reality, does not exist. If we describe reality in
a particular way (e.g., this was a "procedure error"), then that does not imply any type of mapping onto an
objectively attainable external reality—close or remote, good or bad. The postmodernist does not deal in
referentials, does not describe phenomena as though they reflect or represent something stable, objective,
something "out there." Rather, capturing and describing a phenomenon is the result of a collective generation
and agreement of meaning that, in this case, human factors researchers and their industrial counterparts have
reached. The reality of a procedure error, in other words, is socially constructed. It is shaped by and dependent
on models and paradigms of knowledge that have evolved through group consensus.
This meaning is enforced and handed down through systems of observer training, labeling and communication
of the results, and industry acceptance and promotion. As philosophers like Kuhn (1962) have pointed out,
these paradigms of language and thought at some point adopt a kind of self-sustaining energy, or "consensus
authority" (Angell & Straub, 1999). If human factors auditors count errors for managers, they, as (putatively
scientific) measurers, have to presume that errors exist. But in order to prove that errors exist, auditors have to
measure them. In other words, measuring errors becomes the proof of their existence, an existence that
was preordained by their measurement. In the end, everyone agrees that counting errors is a good step
forward on safety because almost everyone seems to agree that it is a good step forward.
The practice is not questioned because few seem to question it. As the postmodernist would argue, the
procedural error becomes true (or appears to people as a close correspondence to some objective reality) only
because a community of specialists have contributed to the development of the tools that make it appear so,
and have agreed on the language that makes it visible. There is nothing inherently true about the error at all. In
accepting the utility of error counting, it is likely that industry accepts its theory (and thereby the reality and
validity of the observations it generates) on the authority of authors, teachers, and their texts, not because of
evidence. In his headline, Croft (2001) announced that researchers have now perfected ways to monitor pilot
performance in the cockpit. "Researchers" have "perfected." There is little that an industry can do other than to accept such authority. What alternatives have they, asks Kuhn, or
what competence?
Postmodernism sees the "reality" of an observed procedure error as a negotiated settlement among informed
participants. Postmodernism has gone beyond common denominators. Realism, that product and
accompaniment of the Scientific Revolution, assumes that a common denominator can be found for all
systems of belief and value, and that we should strive to converge on those common denominators through
our (scientific) methods.
There is a truth, and it is worth looking for through method. Postmodernism, in contrast, is the condition of
coping without such common denominators. According to postmodernism, all beliefs (e.g., the belief that you
just saw a procedural error) are constructions, they are not uncontaminated encounters with, or
representations of, some objective empirical reality. Postmodernism challenges the entire modernist culture
of realism and empiricism, of which error counting methods are but an instance.
Postmodernist defiance not only appears in critiques against error counting but also reverberates throughout
universities and especially the sciences (e.g., Capra, 1982). It never comes away unscathed, however. In
the words of Varela, Thompson, and Rosch (1991), we suffer from "Cartesian anxiety." We seem to need the
idea of a fixed, stable reality that surrounds us, independent of who looks at it. To give up that idea would be to
descend into uncertainty, into idealism, into subjectivism. There would be no more groundedness, no longer a
set of predetermined norms or standards, only a constantly shifting chaos of individual impressions, leading to
relativism and, ultimately, nihilism. Closing the debate on this anxiety is impossible.
Even asking which position is more "real" (the modernist or the postmodernist one) is capitulating to (naive)
realism. It assumes that there is a reality that can be approximated better either by the modernists or
postmodernists.
Was This an Error? It Depends on Who You Ask
Here is one way to make sense of the arguments. Although people live in the same empirical world (actually,
the hard-core constructionist would argue that there is no such thing), they may arrive at rather different, yet
equally valid, conclusions about what is going on inside of it, and propose different vocabularies and models to
capture those phenomena and activities. Philosophers sometimes use the example of a tree. Though at first
sight an objective, stable entity in some external reality, separate from us as observers, the tree can mean
entirely different things to someone in the logging industry as compared to, say, a wanderer in the Sahara.
Both interpretations can be valid because validity is measured in terms of local relevance, situational
applicability, and social acceptability—not in terms of correspondence with a real, external
world. Among different characterizations of the world, none is more real or more true than another. Validity is a function of
how the interpretation conforms to the worldview of those to whom the observer makes his appeal.
A procedure error is a legitimate, acceptable form of capturing an empirical encounter only because there is a
consensual system of like-minded coders and consumers who together have agreed on the linguistic label.
The appeal falls onto fertile ground. But the validity of an observation is negotiable. It depends on where the
appeal goes, on who does the looking and who does the listening. This is known as ontological relativism:
There is flexibility and uncertainty in what it means to be in the world or in a particular situation. The ontological
relativist submits that the meaning of observing a particular situation depends entirely on what the observer
brings to it. The tree is not just a tree. It is a source of shade, sustenance, survival. Following Kant's ideas,
social scientists embrace the common experience that the act of observing and perceiving objects (including
humans) is not a passive, receiving process, but an active one that engages the observer as much as it
changes or affects the observed. This relativism creates the epistemological uncertainty we see in error-
counting methods, which, after all, attempt to shoehorn observations into numerical objectivity. Most social
observers or error coders will have felt this uncertainty at one time or another. Was this a procedure error, or a
proficiency error, or both? Or was it perhaps no error at all? Was this the cause, or was it the consequence?
If it were up to Kant, not having felt this uncertainty would serve as an indication of being a particularly obtuse
observer. It would certainly not be proof of the epistemological astuteness of either method or error counter.
The uncertainty suffered by them is epistemological because it is realized that certainty about what we know,
or even about how to know whether we know it or not, seems out of reach.
Yet those within the ruling paradigm have their stock answer to this challenge, just as they have it whenever
confronted with problems of bringing observations and theories in closer correspondence. More
methodological agreement and refinement, including observer training and standardization, may close the
uncertainty. Better trained observers will be able to distinguish between a procedure error and proficiency
error, and an improvement to the coding categories may also do the job. Similar modernist approaches
have had remarkable success for five centuries, so there is no reason to doubt that they may offer routes to
some progress even here. Or is there?
Perhaps more method may not solve problems seemingly linked to method. Consider a study reported by
Hollnagel and Amalberti (2001), whose purpose was to test an error-measurement instrument. This instrument
was designed to help collect data on, and get a better understanding of, air-traffic controller errors, and to
identify areas of weakness and find possibilities for improvement. The
method asked observers to count errors (primarily error rates per hour) and categorize the types of errors
using a taxonomy proposed by the developers. The tool had already been used to pick apart and categorize
errors from past incidents, but would now be put to test in a real-time field setting—applied by pairs of
psychologists and air-traffic controllers who would study air-traffic control work going on in real time. The
observing air traffic controllers and psychologists, both trained in the error taxonomy, were instructed to take
note of all the errors they could see. Despite common indoctrination, there were substantial differences
between the numbers of errors each of the two groups of observers noted, and only a very small number of
errors were actually observed by both. People watching the same performance, using the same tool to classify
behavior, came up with totally different error counts. Closer inspection of the score sheets revealed that the
air-traffic controllers and psychologists tended to use different subsets of the error types available in the tool,
indicating just how negotiable the notion of error is:
The same fragment of performance means entirely different things to two different (but similarly trained and
standardized) groups of observers. Air-traffic controllers relied on external working conditions (e.g., interfaces,
personnel and time resources) to refer to and categorize errors, whereas psychologists preferred to locate the
error somewhere in presumed quarters of the mind (e.g., working memory) or in some mental state (e.g.,
attentional lapses). Moreover, air-traffic controllers who actually did the work could tell both groups of error
coders that they both had it wrong. Debriefing sessions exposed how many observed errors were not errors at
all to those said to have committed them, but rather normal work, expressions of deliberate strategies intended
to manage problems or foreseen situations that the error counters had either not seen, or not understood if
they had. Croft (2001) reported the same result in observations of cockpit errors: More than half the errors
revealed by error counters were never discovered by the flight crews themselves. Some realists may argue
that the ability to discover errors that people themselves do not see is a good thing: It confirms the strength or
superiority of method. But in Hollnagel and Amalberti's (2001) case, error coders were forced to disavow such
claims to epistemological privilege (and embrace ontological relativism instead). They reclassified the errors as
normal actions, rendering the score sheets virtually devoid of any error counts.
Early transfers of aircraft were not an error, for example, but turned out to correspond to a deliberate strategy
connected to a controller's foresight, planning ahead, and workload management. Rather than an expression
of weakness, such strategies uncovered sources of robustness that would never have come out, or would
even have been misrepresented and mischaracterized, with just the data in the classification tool. Such
normalization of actions, which at first appear deviant from the outside, is a critical aspect of
really understanding human work and its strengths and weaknesses (see Vaughan, 1996). Without
understanding such processes of normalization, it is impossible to penetrate the situated meaning of errors or
violations.
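The agreement problem can be made concrete with a small sketch. The logs below are invented, not Hollnagel and Amalberti's data; they only show the sort of comparison that exposes how little two groups of trained coders may overlap, even before the question of whether their labels mean the same thing:

```python
# Illustrative sketch only: invented observation logs for two groups of coders
# watching the same session, and a simple check of how many "errors" both
# groups recorded.

controllers = {  # (time, label) pairs noted by the air-traffic-controller observers
    ("10:02", "interface mode confusion"),
    ("10:15", "late coordination with adjacent sector"),
    ("10:31", "strip not updated"),
}

psychologists = {  # pairs noted by the psychologist observers
    ("10:02", "working-memory lapse"),
    ("10:24", "attentional lapse"),
    ("10:40", "decision error"),
}

# Agreement on *when* something error-worthy happened, ignoring the label used.
shared_times = {t for t, _ in controllers} & {t for t, _ in psychologists}
union_times = {t for t, _ in controllers} | {t for t, _ in psychologists}

print(f"events coded by both groups: {len(shared_times)} of {len(union_times)}")
# Even before asking whether the labels agree (here they do not), the overlap is
# small: the "same" performance yields largely disjoint error counts.
```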
Classification of errors crumbles on the inherent weakness of the naïve realism that underlies it. The realist
idea is that errors are "out there," that they exist and can be observed, captured, and documented
independently of the observer. This would mean that it makes no difference who does the observing (which it
patently does). Such presumed realism is naive because all observations are ideational—influenced (or made
possible in the first place) to a greater or lesser extent by who is doing the observing and by the worldview
governing those observations. Realism does not work because it is impossible to separate the observer from
the observed. Acknowledging some of these problems, the International Civil Aviation Organization
(ICAO, 1998) has called for the development of human performance data-collection methods that do not rely on
subjective assessments. But is this possible? Is there such a thing as an objective observation of another
human's behavior?
The Presumed Reality of Error
The test of the air-traffic control error counting method reveals how "an action should not be classified as an
'error' only based on how it appears to an observer" (Hollnagel & Amalberti, 2001, p. 13). The test confirms
ontological relativism.
Yet sometimes the observed "error" should be entirely non-controversial, should it not? Take the spoiler
example from chapter 1. The flight crew forgot to arm the spoilers. They made a mistake. It was an error. You
can apply the new view to human error, and explain all about context and situation and mitigating factors.
Explain why they did not arm the spoilers, but that they did not arm the spoilers is a fact. The error occurred.
Even multiple different observers would agree on that. The flight crew failed to arm the spoilers. How can one
not acknowledge the existence of that error? It is there, it is a fact, staring us in the face.
But what is a fact? Facts always privilege the ruling paradigm. Facts always favor current interpretations, as
they fold into existing constructed renderings of what is going on. Facts actually exist by virtue of the current
paradigm.
They can neither be discovered nor given meaning without it. There is no such thing as observations without a
paradigm; research in the absence of a particular worldview is impossible. In the words of Paul Feyerabend
(1993, p. 11): "On closer analysis, we even find that science knows no 'bare facts' at all, but that the 'facts' that
enter our knowledge are already viewed in a certain way and are, therefore, essentially ideational." Feyerabend called the idea that facts are available independently and can
thereby objectively favor one theory over another, the autonomy principle (p. 26).
The autonomy principle asserts that the facts that are available as empirical content of one theory (e.g.,
procedural errors as facts that fit the threat and error model) are objectively available to alternative theories
too. But this does not work. As the spoiler example from chapter 1 showed, errors occur against and because
of a background, in this case a background so systemic, so structural, that the original human error pales
against it. The error almost becomes transparent, it is normalized, it becomes invisible. Against
this backdrop, this context of procedures, timing, engineering trade-offs, and weakened hydraulic systems, the
omission to arm the spoilers dissolves. Figure and ground trade places: No longer is it the error that is really
observable or even at all interesting. With deeper investigation, ground becomes figure. The backdrop begins
to take precedence as the actual story, subsuming, swallowing the original error. No longer can the error be
distinguished as a singular, failed decision moment. Somebody who applies a theory of naturalistic decision
making will not see a procedure error. What will be seen instead is a continuous flow of actions and
assessments, coupled and mutually cued, a flow with nonlinear feedback loops and interactions,
inextricably embedded in a multilayered evolving context. Human interaction with a system, in other words, is
seen as a continuous control task. Such a characterization is hostile to the digitization necessary to fish out
individual human errors.
Whether individual errors can be seen depends on the theory used. There are no objective observations of
facts. Observers in error counting are themselves participants, participating in the very creation of the
observed fact, and not just because they are there, looking at how other people are working. Of course,
through their sheer presence, error counters probably distort people's normal practice, perhaps turning situated
performance into a mere window-dressed posture. More fundamentally, however, observers in error counting
are participants, because the facts they see would not exist without them. They are created through the
method. Observers are participants because it is impossible to separate observer and object.
None of this, by the way, makes the procedure error less real to those who observe it. This is the whole point
of ontological relativism. But it does mean that the autonomy principle is false. Facts are not stable aspects of
an independent reality, revealed to scientists who wield the right instruments and methods. The discovery and
description of every fact is dependent on a particular theory. In the words of Einstein, it is the theory that
determines what can be seen. Facts are not available "out there," independent of theory.
To suppose that a better theory should come along to account for procedure errors in a way that more closely
matches reality is to stick with a model of scientific progress that was disproved long ago. It
follows the idea that theories should not be dismissed until there are compelling reasons to do so, and
compelling reasons arise only because there is an overwhelming number of facts that disagree with the theory.
Scientific work, in this idea, is the clean confrontation of observed fact with theory. But this is not how it
works, for those facts do not exist without the theory.
Resisting Change: The Theory Is Right. Or Is It?
The idea of scientific (read: theoretical) progress through the accumulation of observed disagreeing facts that
ultimately manage to topple a theory also does not work because counterinstances (i.e., facts that disagree
with the theory) are not seen as such. Instead, if observations reveal counterinstances (such as errors that
resist unique classification in any of the categories of the error-counting method), then researchers tend to see
these as further puzzles in the match between observation and theory (Kuhn, 1962)—puzzles that can be
addressed by further refinement of their method. Counterinstances, in other words, are not seen as speaking
against the theory. According to Kuhn (1962), one of the defining responses to paradigmatic crisis is that
scientists do not treat anomalies as counterinstances, even though that is what they are. It is extremely difficult
for people to renounce the paradigm that has led them into a crisis. Instead, the epistemological difficulties
suffered by error-counting methods (Was this a cause or a consequence? Was this a procedural or a
proficiency error?) are dismissed as minor irritants and reasons to engage in yet more methodological
refinement consonant with the current paradigm.
Neither scientists nor their supporting communities in industry are willing to forego a paradigm until and unless
there is a viable alternative ready to take its place. This is among the most sustained arguments surrounding
the continuation of error counting: Researchers engaging in error classification are willing to acknowledge that
what they do is not perfect, but vow to keep going until shown something better. And industry concurs. As
Kuhn pointed out, the decision to reject one paradigm necessarily coincides with the acceptance of another.
Proposing a viable alternative theory that can deal with its own facts, however, is exceedingly difficult, and has
proven to be so even historically (Feyerabend, 1993). Facts, after all, privilege the status quo. Galileo's
telescopic observations of the sky generated observations that motivated an alternative explanation about the
place of the earth in the universe. His observations favored the Copernican heliocentric interpretation (where
the earth goes around the sun) over the Ptolemaic geocentric one (where the sun goes around the earth). The Copernican interpretation, however, was a worldview away from
what was currently accepted, and many doubted Galileo's data as a valid empirical window on that heliocentric
reality. People were highly suspicious of the new instrument: Some asked Galileo to open up his telescope to
prove that there was no little moon hiding inside of it. How, otherwise, could the moon or any other celestial
body be seen so closely if it was not itself hiding in the telescope? One problem was that Galileo did not offer a
theoretical explanation for why this could be so, and why the telescope was supposed to offer a better picture
of the sky than the naked eye. He could not, because relevant theories (optics) were not yet well developed.
Generating better data (like Galileo did), and developing entirely new methods for better access to these data
(such as a telescope), does in itself little to dislodge an established theory that allows people to see the
phenomenon with their naked eye and explain it with their common sense. Similarly, people see the error
happen with their naked eye, even without the help of an error-classification method: The pilot fails to arm the
spoilers. Even their common sense confirms that this is an error. The sun goes around the earth. The earth is
fixed. The Church was right, and Galileo was wrong.
None of his observed facts could prove him right, because there was no coherent set of theories ready to
accommodate his facts and give them meaning. The Church was right, as it had all the facts. And it had the
theory to deal with them. Interestingly, the Church kept closer to reason as it was defined at the time. It
considered the social, political, and ethical implications of Galileo's alternatives and deemed them too risky to
accept—certainly on the grounds of tentative, rickety evidence. Disavowing the geocentric idea would be
disavowing Creation itself, removing the common ontological denominator of the past millennium and severely
undermining the authority and political power the Church derived from it. Error-classification methods too,
guard a piece of rationality that most people in industry and elsewhere would be loath to see disintegrate.
Errors occur, they can be distinguished objectively. Errors can be an indication of unsafe performance.
There is good performance and bad performance; there are identifiable causes for why people perform well or
less well and for why failures happen. Without such a supposedly factual basis, without such hopes of an
objective rationality, traditional and well-established ways for dealing with threats to safety and trying to create
progress could collapse. Cartesian anxiety would grip the industry and research community. How can we hold
people accountable for mistakes if there are no "errors"?
How can we report safety occurrences and maintain expensive incident-reporting schemes if there are no
errors? What can we fix if there are no causes for adverse events? Such questions fit a broader class of
appeals against relativism.
Postmodernism and relativism, according to their detractors, can lead only to moral
ambiguity, nihilism and lack of structural progress. We should instead hold onto the realist status quo, and we
can, for most observed facts still seem to privilege it.
Errors exist. They have to. To the naive realist, the argument that errors exist is not only natural and
necessary, it is also quite impeccable, quite forceful.
The idea that errors do not exist, in contrast, is unnatural, even absurd. Those within the established paradigm
will challenge the sheer legitimacy of questions raised about the existence of errors, and by implication even
the legitimacy of those who raise the questions: "Indeed, there are some psychologists who would deny the
existence of errors altogether. We will not pursue that doubtful line of argument here" (Reason & Hobbs, 2003,
p. 39). Because the current paradigm judges it absurd and unnatural, the question about whether errors exist
is not worth pursuing: It is doubtful and unscientific and in the strictest sense (when scientific pursuits are
measured and defined within the ruling paradigm), that is precisely what it is. If some scientists do not succeed
in bringing statement and fact into closer agreement (they do not see a procedure error where others would),
then this discredits the scientist rather than the theory. Galileo suffered from this too. It was the scientist who
was discredited (for a while at least), not the prevailing paradigm. So what does he do? How does Galileo
proceed once he introduces an interpretation so unnatural, so absurd, so countercultural, so revolutionary?
What does he do when he notices that even the facts are not (interpreted to be) on his side? As Feyerabend
(1993) masterfully described it, Galileo engaged in propaganda and psychological trickery. Through imaginary
conversations between Sagredo, Salviati, and Simplicio, written in his native tongue rather than in Latin, he put
the ontological uncertainty and epistemological difficulty of the geocentric interpretation on full display.
The sheer logic of the geocentric interpretation fell apart whereas that of the heliocentric interpretation
triumphed. Where the appeal to empirical facts failed (because those facts will still be forced to fit the
prevailing paradigm rather than its alternative), an appeal to logic may still succeed. The same is true for error
counting and classification. Just imagine this conversation:
Simplicio: Errors result from physiological and
psychological limitations of humans. Causes of error include fatigue, workload, and fear, as well as cognitive
overload, poor interpersonal communications, imperfect information processing, and flawed decision making.
Sagredo: But are errors in this case not simply the result of other errors?
Flawed decision making would be an error. But in your logic, it causes an error. What is the error then? And
how can we categorize it?
Simplicio: Well, but errors are caused by poor decisions, failures to adhere to brief, failures to prioritize attention, improper procedure, and so forth.
Sagredo: This appears to be not causal explanation, but simply
relabeling. Whether you say error, or poor decision, or failure to prioritize attention, it all still sounds like error,
at least when interpreted in your worldview. And how can one be the cause of the other to the exclusion of
the other way around? Can errors cause poor decisions just like poor decisions cause errors? There is nothing
in your logic that rules this out, but then we end up with a tautology, not an explanation.
And yet, such
arguments may not help either. The appeal to logic may fail in the face of overwhelming support for a ruling
paradigm—support that derives from consensus authority, from political, social, and organizational
imperatives rather than a logical or empirical basis (which is, after all, pretty porous).
Even Einstein expressed amazement at the common reflex to rely on measurements (e.g., error counts) rather
than logic and argument: " 'Is it not really strange,' he asked in a letter to Max Born, 'that human beings are
normally deaf to the strongest of argument while they are always inclined to overestimate measuring
accuracies?' " (Feyerabend, 1993, p. 239). Numbers are strong. Arguments are weak. Error counting is good
because it generates numbers, it relies on accurate measurements (recall Croft, 2001, who announced that
"researchers" have "perfected" ways to monitor pilot performance), rather than on argument.
In the end, no argument, none of this propaganda or psychological trickery can serve as a substitute for the
development of alternative theory, nor did it in Galileo's case. The postmodernists are right and the realists are
wrong: Without a paradigm, without a worldview, there are no facts. People will reject no theory on the basis of
argument or logic alone. They need another to take its place. A paradigmatic interregnum would produce
paralysis. Suspended in a theoretical vacuum, researchers would no longer be able to see facts or do anything
meaningful with them. So, considering the evidence, what should the alternative theory look like? It needs to
come with a superior explanation of performance variations, with an interpretation that is sensitive to the
situatedness of the performance it attempts to capture. Such a theory sees no errors, but rather performance
variations—inherently neutral changes and adjustments in how people deal with complex, dynamic situations.
This theory will resist coming in from the outside, it will avoid judging other people from a position external to
how the situation looked to the subject inside of it. The outlines of such a theory are developed further in
various places in this book.
SAFETY AS MORE THAN ABSENCE OF NEGATIVES
First, though, another question: Why do people bother with error counts in the first place? What goals do they
hope these empirical measures help them accomplish, and are there better ways to achieve those goals? A
final aim of error counting is to help make progress on safety, but this puts the link between errors and safety
on trial. Can the counting of negatives (e.g., these errors) say anything useful about safety? What does the
quantity measured (errors) have to do with the quality managed (safety)?
Error-classification methods assume a close mapping between these two, and assume that an absence or
reduction of errors is synonymous with progress on safety. By treating safety as positivistically measurable,
error counting may be breathing the scientific spirit of a bygone era. Human performance in the laboratory was
once gauged by counting errors, and this is still done when researchers test limited, contrived task behavior in
spartan settings. But how well does this export to natural settings where people carry out actual complex,
dynamic, and interactive work, where determinants of good and bad outcomes are deeply confounded?
It may not matter. The idea of a realist count is compelling to industry for the same reasons that any numerical
performance measurement is. Managers get easily infatuated with "balanced scorecards" or other faddish
figures of performance. Entire business models depend on quantifying performance results, so why not
quantify safety? Error counting becomes yet another quantitative basis for managerial interventions. Pieces of
data from the operation that have been excised and formalized away from their origin can be converted into
graphs and bar charts that subsequently form the inspiration for interventions. This allows managers, and their
airlines, to elaborate their idea of control over operational practice and its outcomes. Managerial control,
however, exists only in the sense of purposefully formulating and trying to influence the intentions and actions
of operational people (Angell & Straub, 1999). It is not the same as being in control of the consequences (by
which safety ultimately gets measured industry-wide), because for that the real world is too complex and
operational environments too stochastic (e.g., Snook, 2000).
There is another tricky aspect of trying to create progress on safety through error counting and classification.
This has to do with not taking context into account when counting errors. Errors, according to realist
interpretations, represent a kind of equivalent category of bad performance (e.g., a failure to meet one's
objective or intention), no matter who commits the error or in what situation. Such an assumption has to exist,
otherwise tabulation becomes untenable. One cannot (or should not) add apples and oranges, after all. If both
apples and oranges are entered into the method (and, given that the autonomy principle is false, error-counting
methods do add apples and oranges), silly statistical tabulations that
claim doctors are 7,500 times more dangerous than gun owners can roll out the other end. As Hollnagel and
Amalberti (2001) showed, attempts to map situated human capabilities such as decision making, proficiency,
or deliberation onto discrete categories are doomed to be misleading. They cannot cope with the complexity of
actual practice without serious degeneration (Angell & Straub, 1999). Error classification disembodies data. It
removes the context that helped produce the behavior in its particular manifestation.
Such disembodiment may actually retard understanding. The local rationality principle (people's behavior is
rational when viewed from the inside of their situations) is impossible to maintain when context is removed
from the controversial action. And error categorization does just that: It removes context. Once the observation
of some kind of error is tidily locked away into some category, it has been objectified, formalized away from the
situation that brought it forth. Without context, there is no way to reestablish local rationality. And without local
rationality, there is no way to understand human error. And without understanding human error, there may be
no way to learn how to create progress on safety.
Safety as a Reflexive Project
Safety is likely to be more than the measurement and management of negatives (errors), if it is that at all. Just
as errors are epistemologically elusive (How do you know what you know? Did you really see a procedure
error? Or was it a proficiency error?), and ontologically relativist (what it means "to be" and to perform well or
badly inside a particular situation is different from person to person), the notion of safety may similarly lack an
objective, common denominator. The idea behind measuring safety through error counts is that safety is some
kind of objective, stable (and perhaps ideal) reality, a reality that can be measured and reflected, or
represented, through method. But does this idea hold? Rochlin (1999, p. 1550), for example, proposed that
safety is a "constructed human concept" and others in human factors have begun to probe how individual
practitioners construct safety, by assessing what they understand risk to be, and how they perceive their ability to manage challenging situations (e.g., Orasanu, 2001).
A substantial part of practitioners' construction of safety turns out to be reflexive, assessing the person's own
competence or skill in maintaining safety across different situations. Interestingly, there may be a mismatch
between risk salience (how critical a particular threat to safety was perceived to be by the practitioner) and
frequency of encounters (how often these threats to safety are in fact met in practice). The safety threats
deemed most salient were the ones least frequently dealt with (Orasanu, 2001). Safety is more akin to a
reflexive project, sustained through a revisable narrative of self-identity that develops in the
face of frequently and less frequently encountered risks. It is not something referential, not something that is
objectively "out there" as a common denominator, open to any type of approximation by those with the best
methods. Rather, safety may be reflexive: something that people relate to themselves. The numbers produced
by error counts are a logical endpoint of a structural analysis that focuses on (supposed) causes and
consequences, an analysis that defines risk and safety instrumentally, in terms of minimizing errors and
presumably measurable consequences. A second, more recent approach is more socially and politically
oriented, and places emphasis on representation, perception, and interpretation rather than on structural
features (Rochlin, 1999).
The managerially appealing numbers generated by error counts do not carry any of this reflexivity, none of the
nuances of what it is to "be there," doing the work, creating safety on the line. What it is to be there ultimately
determines safety (as outcome): People's local actions and assessments are shaped by their own
perspectives. These in turn are embedded in histories, rituals, interactions, beliefs and myths, both of people's
organization and organizational subculture and of them as individuals.
This would explain why good, objective, empirical indicators of social and organizational definitions of safety
are difficult to obtain. Operators of reliable systems "were expressing their evaluation of a positive state
mediated by human action, and that evaluation reflexively became part of the state of safety they were
describing" (Rochlin, 1999, p. 1550). In other words, the description itself of what safety means to an individual
operator is a part of that very safety, dynamic and subjective. "Safety is in some sense a story a group or
organization tells about itself and its relation to its task environment" (Rochlin, p. 1555).
Can We Measure Safety?
But how does an organization capture what groups tell about themselves; how does it pin down these stories?
How can management measure a mediated, reflexive idea? If not through error counts, what can an
organization look for in order to get some measure of how safe it is? Large recent accidents provide some
clues of where to start looking (e.g., Woods, 2003). A main source of residual risk in otherwise safe
transportation systems is the drift into failure described in chapter 2. Pressures of scarcity and competition
narrow an organization's focus on goals associated with production. With an accumulating base of empirical
success (i.e., no accidents, even if safety is increasingly traded off against other goals such as maximizing
profit or capacity utilization), the organization, through its members' multiple little and larger daily decisions, will
begin to believe that past success is a guarantee of future safety, that
historical success is a reason for confidence that the same behavior will lead to the same (successful)
outcome the next time around.
The absence of failure, in other words, is taken as evidence that hazards are not present, that
countermeasures already in place are effective. Such a model of risk is embedded deeply in the reflexive
stories of safety that Rochlin (1999) talked about, and it can be made explicit only through qualitative
investigations that probe the interpretative aspect of situated human assessments and actions. Error counts do
little to elucidate any of this. More qualitative studies could reveal how currently traded models of risk may
increasingly be at odds with the actual nature and proximity of hazard, though it may of course be difficult to
establish the objective, or ontologically absolutist, presence of hazard.
Particular aspects of how organization members tell or evaluate safety stories, however, can serve as markers.
Woods (2003, p. 5), for example, has called one of these markers "distancing through differencing." In this
process, organizational members look at other failures and other organizations as not relevant to them and
their situation. They discard other events because they appear at the surface to be dissimilar or distant.
Discovering this through qualitative inquiry can help specify how people and organizations reflexively create
their idea, their story of safety. Just because the organization or section has different technical problems,
different managers, different histories, or can claim to already have addressed a particular safety concern
revealed by the event, does not mean that they are immune to the problem. Seemingly divergent events can
represent similar underlying patterns in the drift towards hazard.
High-reliability organizations characterize themselves through their preoccupation with failure: continually
asking themselves how things can go wrong and could have gone wrong, rather than congratulating
themselves on the fact that things went right. Distancing through differencing means underplaying this
preoccupation. It is one way to prevent learning from events elsewhere, one way to throw up obstacles
in the flow of safety-related information. Additional processes that can be discovered include to what extent an
organization resists oversimplifying interpretations of operational data, whether it defers to expertise and
expert judgment rather than managerial imperatives. Also, it could be interesting to probe to what extent
problem-solving processes are fragmented across organizational departments, sections, or subcontractors.
The 1996 Valujet accident, where flammable oxygen generators were placed in an aircraft cargo hold without
shipping caps, subsequently burning down the aircraft, was related to a web of subcontractors that together
made up the virtual airline of Valujet. Hundreds of people within even one subcontractor logged work against
the particular Valujet aircraft, and this subcontractor was only one of many players in a network of
organizations and companies tasked with different aspects of running (even constituting) the
airline. Relevant maintenance parts (among them the shipping caps) were not available at the subcontractor,
ideas of what to do with expired oxygen canisters were generated ad hoc in the absence of central guidance,
and local understandings for why shipping caps may have been necessary were foggy at best. With work and
responsibility for it distributed among so many participants, nobody may have been able anymore to see the
big picture, including the regulator. Nobody may have been able to recognize the gradual erosion of safety
constraints on the design and operation of the original system. If safety is a reflexive project rather than an
objective datum, human factors researchers must develop entirely new probes for measuring the safety
health of an organization.
Error counts do not suffice. They uphold an illusion of rationality and control, but may offer neither real insight
nor productive routes for progress on safety. It is, of course, a matter of debate whether the vaguely defined
organizational processes that could be part of new safety probes (e.g., distancing through differencing,
deference to expertise, fragmentation of problem-solving, incremental judgments into disaster) are any more
real than the errors from the counting methods they seek to replace or augment. But then, the reality of these
phenomena is in the eye of the beholder: Observer and observed cannot be separated; object and subject are
largely indistinguishable. The processes and phenomena are real enough to those who look for them and who
wield the theories to accommodate the results. Criteria for success may lie elsewhere, for example in how well
the measure maps onto past evidence of precursors to failure.
Yet even such mappings are subject to paradigmatic interpretations of the evidence base. Indeed, consonant
with the ontological relativity of the age human factors has now entered, the debate can probably never be
closed. Are doctors more dangerous than gun owners?
Do errors exist? It depends on who you ask. The real issue, therefore, lies a step away from the fray, a level
up, if you will. Whether we count errors as Durkheimian fact on the one hand or see safety as a reflexive
project on the other, competing premises and practices reflect particular models of risk. These models of risk
are interesting not because of their differential abilities to access empirical truth (because that may all be
relative), but because of what they say about us, about human factors and system safety. It is not the
monitoring of safety that we should simply pursue, but the monitoring of that monitoring. If we want to make
progress on safety, one important step is to engage in such metamonitoring, to become better aware of the
models of risk embodied in our assumptions and approaches to safety.
Chapter 4
Don't Errors Exist?
Human factors as a discipline takes a very realist view. It lives in a world of real things, of facts and concrete
observations. It presumes the existence of an external world in which phenomena occur that can be captured
and described objectively. In this world there are errors and violations, and these errors and violations are
quite real. The flight-deck observer from chapter 3, for example, would see that pilots do not arm the spoilers
before landing and mark this up as an error or a procedural violation. The observer considers his observation
quite true, and the error quite real. Upon discovering that the spoilers had not been armed, the pilots
themselves too may see their omission as an error, as something that they missed but should not have
missed. But just as it did for the flight-deck observer, the error becomes real only because it is visible from
outside the stream of experience. From the inside of this stream, while things are going on and work is being
accomplished, there is no error. In this case there are only procedures that get inadvertently mangled through
the timing and sequence of various tasks. And not even this gets noticed by those applying the procedures.
Recall how Feyerabend (1993) pointed out that all observations are ideational, that facts do not exist without
an observer wielding a particular theory that tells him or her what to look for. Observers are not passive
recipients, but active creators of the empirical reality they encounter. There is no clear separation between
observer and observed. As said in chapter 3, none of this makes the error any less real to those who observe
it. But it does not mean that the error exists out there, in some independent empirical universe.
This was the whole point of ontological relativism: What it means to be in a particular situation and make
certain observations is quite flexible and connected systematically to the observer. None of the possible
worldviews can be judged superior or privileged uniquely by empirical data about the world, because objective,
impartial access to that world is impossible. Yet in the pragmatic and optimistically realist spirit of human
factors, error counting methods have gained popularity by selling the belief that such impartial access is
possible. The claim to privileged access lies (as modernism and Newtonian science would dictate) in method.
The method is strong enough to discover errors that the pilots themselves had not seen.
Errors appear so real when we step or set ourselves outside the stream of experience in which they occur.
They appear so real to an observer sitting behind the pilots. They appear so real to even the pilot himself after
the fact. But why? It cannot be because the errors are real, since the autonomy principle has been proven
false. As an observed fact, the error only exists by virtue of the observer and his or her position on the outside
of the stream of experience. The error does not exist because of some objective empirical reality in which it
putatively takes place, since there is no such thing and if there was, we could not know it. Recall the air-traffic
control test of chapter 3: Actions, omissions, and postponements related to air-traffic clearances carry entirely
different meanings for those on the inside and on the outside of the work experience. Even different observers
on the outside cannot agree on a common denominator because they have diverging backgrounds and
conceptual looking glasses. The autonomy principle is false: facts do not exist without an observer. So why do
errors appear so real?
ERRORS ARE ACTIVE, CORRECTIVE INTERVENTIONS IN HISTORY
Errors are an active, corrective intervention in (immediate) history. It is impossible for us to give a mere
chronicle of our experiences: Our assumptions, past experiences and future aspirations cause us to impress a
certain organization on that which we just went through or saw. Errors are a powerful way to impose structure
onto past events. Errors are a particular way in which we as observers (or even participants) reconstruct the
reality we just experienced. Such reconstruction, however, inserts a severe discontinuity between past and
present. The present was once an uncertain, perhaps vanishingly improbable, future. Now we see it as the
only plausible outcome of a pretty deterministic past. Being able to stand outside an unfolding sequence
of events (either as participants from hindsight or as observers from outside the setting) makes it exceedingly
difficult to see how unsure we once were (or could have been if we had been in that situation) of what was
going to happen. History as seen through the eyes of a retrospective outsider (even if the same observer was
a participant in that history not long ago) is substantially different from the world as it appeared to the decision
makers of the day.
This endows history, even immediate history, with a determinism it lacked when it was still unfolding. Errors,
then, are ex post facto constructs. The research base on the hindsight bias contains some of the strongest evidence on this. Errors are not empirical facts. They are the result of outside observers squeezing now-known
events into the most plausible or convenient deterministic scheme. In the research base on hindsight, it is not
difficult to see how such retrospective restructuring embraces a liberal take on the history it aims to recount.
The distance between reality as portrayed by a retrospective observer and as experienced by those who were
there (even if these were once the same people) grows substantially with the rhetoric and discourse
employed and the investigative practices used. We see a lot of this later in the discussion.
We also look at developments in psychology that have (since not so long ago) tried to get away from the
normativist bias in our understanding of human performance and decision making. This intermezzo is
necessary because errors and violations do not exist without some norm, even if implied.
Hindsight of course has a powerful way of importing criteria or norms from outside people's situated contexts,
and highlighting where actual performance at the time fell short. To see errors as ex post constructs rather than
as objective, observed facts, we have to understand the influence of implicit norms on our judgments of past
performance. Doing away with errors means doing away with normativism. It means that we cannot question the
accuracy of insider accounts (something human factors consistently does, e.g., when it asserts a "loss of
situation awareness"), as there is no objective, normative reality to hold such accounts up to, and relative to
which we can deem them accurate or inaccurate.
Reality as experienced by people at the time was reality as it was experienced by them at the time, full stop. It
was that experienced world that drove their assessments and decisions, not our (or even their) retrospective,
outsider rendering of that experience. We have to use local norms of competent performance to understand
why what people did made sense to them at the time.
Finally, an important question we must look ahead to: Why is it that errors fulfill such an important function in
our reconstructions of history, of even our own histories? Seeing errors in history may actually have little to do
with historical explanation. Rather, it may be about controlling the future. What we see toward the end of this
chapter is that the hindsight bias may not at all be about history, and may not even be a bias. Retrospective
reconstruction, and the hindsight bias, should not be seen as the primary phenomenon. Rather, it represents
and serves a larger purpose, answering a highly pragmatic concern. The almost inevitable urge to highlight
past choice moments (where people went the wrong way), the drive to identify errors, is forward looking, not
backward looking. The hindsight bias may not be a bias because it is an adaptive response, an
oversimplification of history that primes us for complex futures and allows us to project simple models of past
lessons onto those futures, lest history repeat itself. This means that retrospective recounting tells us much
more about the observer than it does about reality—if there is such an objective thing.
Making Tangled Histories Linear
The hindsight bias (Fischhoff, 1975) is one of the most consistent biases in psychology. One effect is that
"people who know the outcome of a complex prior history of tangled, indeterminate events, remember that
history as being much more determinant, leading 'inevitably' to the outcome they already knew" (Weick, 1995,
p. 28). Hindsight allows us to change past indeterminacy and complexity into order, structure, and
oversimplified causality (Reason, 1990). As an example, take the turn towards the mountains that a Boeing
757 made just before an accident near Cali, Colombia in 1995. According to the investigation, the crew did not
notice the turn, at least not in time (Aeronautica Civil, 1996). What should the crew have seen in order to know
about the turn? They had plenty of indications, according to the manufacturer of their aircraft:
Indications that the airplane was in a left turn would have included the following: the EHSI (Electronic
Horizontal Situation Indicator) Map Display (if selected) with a curved path leading away from the intended
direction of flight; the EHSI VOR display, with the CDI (Course Deviation Indicator) displaced to the right,
indicating the airplane was left of the direct Cali VOR course, the EADI indicating approximately 16 degrees of
bank, and all heading indicators moving to the right. Additionally the crew may have tuned Rozo in the ADF
and may have had bearing pointer information to Rozo NDB on the RMDI. (Boeing Commercial Airplane
Group, 1996, p. 13)
This is a standard response after mishaps: Point to the data that would have revealed the true nature of the
situation. In hindsight, there is an overwhelming array of evidence that did point to the real nature of the
situation, and if only people had paid attention to even some of it, the outcome would have been different.
Confronted with a litany of indications that could have prevented the accident, we wonder how people at the
time could not have known all of this. We wonder how this "epiphany" was missed, why this bloated shopping
bag full of revelations was never opened by the people who most needed it. But knowledge of the critical data
comes only with the omniscience of hindsight. We can only know what really was critical or highly relevant
once we know the outcome. Yet if data can be shown to have been physically available, we often assume that
it should have been picked up by the practitioners in the situation. The problem is that pointing out that
something should have been noticed does not explain why it was not noticed, or why it was interpreted
differently back then. This confusion has to do with us, not with the people we are investigating. What we, in
our reaction to failure, fail to appreciate is that there is a dissociation between data availability and data
observability—between what can be shown to have been physically available and what would have been
observable given people's multiple interleaving tasks, goals, attentional focus, expectations, and interests.
Data, such as the litany of indications in the previous example, do not reveal themselves to practitioners in one
big monolithic moment of truth. In situations where people do real work, data can get drip-fed into the
operation: a little bit here, a little bit there. Data emerges over time. Data may be uncertain.
Data may be ambiguous. People have other things to do too. Sometimes the successive or multiple data bits
are contradictory, often they are unremarkable. It is one thing to say how we find some of these data important
in hindsight. It is quite another to understand what the data meant, if anything, to the people in question at the
time. The same kind of confusion occurs when we, in hindsight, get an impression that certain assessments
and actions point to a common condition. This may be true at first sight. In trying to make sense of past
performance, it is always tempting to group individual fragments of human performance that seem to share
something, that seem to be connected in some way, and connected to the eventual outcome. For example,
"hurry" to land was such a leitmotif extracted from the evidence in the Cali investigation.
Haste in turn is enlisted to explain the errors that were made:
Investigators were able to identify a series of errors that initiated with the flightcrew's acceptance of the
controller's offer to land on runway 19 ... The CVR (Cockpit Voice Recorder) indicates that the decision to
accept the offer to land on runway 19 was made jointly by the captain and the first officer in a 4-second
exchange that began at 2136:38. The captain asked: "would you like to shoot the one nine straight in?" The
first officer responded, "Yeah, we'll have to scramble to get down. We can do it." This interchange followed an
earlier discussion in which the captain indicated to the first officer his desire to hurry the arrival into Cali,
following the delay on departure from Miami, in an apparent attempt to minimize the effect of the delay on the flight
attendants' rest requirements. For example, at 2126:01, he asked the first officer to "keep the speed up in the
descent" . . . (This is) evidence of the hurried nature of the tasks performed. (Aeronautica Civil, 1996, p. 29) In
this case the fragments used to build the argument of haste come from over half an hour of extended
performance.
Outside observers have treated the record as if it were a public quarry to pick stones from, and the accident
explanation the building they need to erect. The problem is that each fragment is meaningless outside the
context that produced it: Each fragment has its own story, background, and reasons for being, and when it was
produced it may have had nothing to do with the other fragments it is now grouped with. Moreover, behavior
takes place in between the fragments.
These intermediary episodes contain changes and evolutions in perceptions and assessments that separate
the excised fragments not only in time, but also in meaning. Thus, the condition, and the constructed linearity
in the story that binds these performance fragments, does not arise from the circumstances that brought each
of the fragments forth; it is not a feature of those circumstances. It is an artifact of the outside observer. In the
case just described, hurry is a condition identified in hindsight, one that plausibly couples the start of the flight
(almost 2 hours behind schedule) with its fatal ending (on a mountainside rather than an airport). Hurry is a
retrospectively invoked leitmotif that guides the search for evidence about itself.
It leaves the investigator with a story that is admittedly more linear and plausible and less messy and complex
than the actual events. Yet it is not a set of findings, but of tautologies.
Counterfactual Reasoning
Tracing the sequence of events back from the outcome—that we as outside observers already know about—
we invariably come across joints where people had opportunities to revise their assessment of the situation but
failed to do so, where people were given the option to recover from their route to trouble, but did not take it.
These are counterfactuals—quite common in accident analysis. For example, "The airplane could have
overcome the windshear encounter if the pitch attitude of 15 degrees nose-up had been maintained, the thrust
had been set to 1.93 EPR (Engine Pressure Ratio) and the landing gear had been retracted on schedule"
(NTSB, 1995, p. 119). Counterfactuals prove what could have happened if certain minute and often Utopian
conditions had been met.
Counterfactual reasoning may be a fruitful exercise when trying to uncover potential countermeasures against
such failures in the future. But saying what people could have done in order to prevent a particular outcome
does not explain why they did what they did. This is the problem with counterfactuals. When they are enlisted
as explanatory proxy, they help circumvent the hard problem of investigations: finding out why people did what
they did. Stressing what was not done (but if it had been done, the accident would not have happened)
explains nothing about what actually happened, or why. In addition, counterfactuals are a powerful tributary to
the hindsight bias. They help us impose structure and linearity on tangled prior histories. Counterfactuals can
convert a mass of indeterminate actions and events, themselves overlapping and interacting, into a linear
series of straightforward bifurcations. For example, people could have perfectly executed the go-around
maneuver but did not; they could have denied the runway change but did not. As the sequence of events rolls
back into time, away from its outcome, the story builds. We notice that people chose the wrong prong at each
fork, time and again—ferrying them along inevitably to the outcome that formed the starting point of our
investigation (for without it, there would have been no investigation).
But human work in complex, dynamic worlds is seldom about simple dichotomous choices (as in: to err or not
to err). Bifurcations are extremely rare, especially those that yield clear previews of the respective outcomes
at each end. In reality, choice moments (such as there are) typically reveal multiple possible pathways that
stretch out, like cracks in a window, into the ever denser fog of futures not yet known. Their outcomes are
indeterminate, hidden in what is still to come. In reality, actions need to be taken under uncertainty and under
the pressure of limited time and other resources.
What from the retrospective outside may look like a discrete, leisurely two-choice opportunity to not fail, is from
the inside really just one fragment caught up in a stream of surrounding actions and assessments.
In fact, from the inside it may not look like a choice at all. These are often choices only in hindsight. To the
people caught up in the sequence of events, there was perhaps not any compelling reason to reassess their
situation or decide against anything (or else they probably would have) at the point the investigator has now
found significant or controversial. They were likely doing what they were doing because they thought they were
right, given their understanding of the situation, their pressures. The challenge for an investigator becomes to
understand how this may not have been a discrete event to the people whose actions are under investigation.
The investigator needs to see how other people's decisions to continue were likely nothing more than
continuous behavior—reinforced by their current understanding of the situation, confirmed by the cues they
were focusing on, and reaffirmed by their expectations of how things would develop.
Judging Instead of Explaining
When outside observers use counterfactuals, even as explanatory proxy, they themselves often require
explanations as well. After all, if an exit from the route to trouble stands out so clearly to outside observers,
how was it possible for other people to miss it? If there was an opportunity to recover, to not crash, then failing
to grab it demands an explanation. The place where observers often look for clarification is the set of rules,
professional standards, and available data that surrounded people's operation at the time, and how people did
not see or meet that which they should have seen or met. Recognizing that there is a mismatch between what
was done or seen and what should have been done or seen as per those standards, we easily judge people for
not doing what they should have done. Where fragments of behavior are contrasted with written guidance that
can be found to have been applicable in hindsight, actual performance is often found wanting; it does not live
up to procedures or regulations. For example, "One of the pilots ... executed [a computer entry] without having
verified that it was the correct selection and without having first obtained approval of the other pilot, contrary to
procedures" (Aeronautica Civil, 1996, p.31). Investigations invest considerably in organizational archeology so
that they can construct the regulatory or procedural framework within which the operations took place, or
should have taken place. Inconsistencies between existing procedures or regulations and actual behavior are
easy to expose when organizational records are excavated after the fact and rules uncovered that would have
fit this or that particular situation. This is not, however, very informative. There is virtually always a mismatch
between actual behavior and written guidance that can be located in hindsight. Pointing out a mismatch sheds
little light on the why of the behavior in question, and, for that matter, mismatches between procedures and
practice are not unique to mishaps. There are also less obvious or undocumented standards. These are often
invoked when a controversial fragment (e.g., a decision to accept a runway change, Aeronautica Civil, 1996,
or the decision to go around or not, NTSB, 1995) knows no clear preordained guidance but relies on local,
situated judgment. For these cases there are always supposed standards of good practice, based on
convention and putatively practiced across an entire industry. One such standard in aviation is "good
airmanship," which, if nothing else can, will explain the variance in behavior that had not yet been accounted
for. When micromatching, observers frame people's past assessments and actions inside a world that they
have invoked retrospectively. Looking at the frame as overlay on the sequence of events, they see that pieces
of behavior stick out in various places and at various angles: a rule not followed here, available data not
observed there, professional standards not met over there. But rather than explaining controversial fragments
in relation to the circumstances that brought them forth, and in relation to the stream of preceding as well as
succeeding behaviors that surrounded them, the frame merely boxes performance fragments inside a world
observers now know to be true. The problem is that this after-the-fact-world may have very little relevance to
the actual world that produced the behavior under study. The behavior is contrasted against the observer's
reality, not the reality surrounding the behavior at the time. Judging people for what they did not do relative to
some rule or standard does not explain why they did what they did. Saying that people failed to take this or that
pathway (only in hindsight the right one) judges other people from a position of broader insight and outcome
knowledge that they themselves did not have. It does not explain a thing yet; it does not shed any light on
why people did what they did given their surrounding circumstances. Outside observers have become caught
in what William James called the "psychologist's fallacy" a century ago: They have substituted their own reality
for the one of their object of study.
The More We Know, the Less We Understand
We actually have interesting expectations of new technology in this regard. Technology has made it
increasingly easy to capture and record the reality that surrounded other people carrying out work. In
commercial aviation, the electronic footprint that any flight produces is potentially huge.
We can use these data to reconstruct the world as it must have been experienced by other people back then,
potentially avoiding the psychologist's fallacy. But capturing such data addresses only one side of the problem.
Our ability to make sense of these data, to employ them in a reconstruction of the sensemaking processes of
other people at another time and place, has not kept pace with our growing technical ability to register traces of
their behavior. In other words, the presumed dominance of human factors in incidents and accidents is not
matched by our ability to analyze or understand the human contribution for what it is worth.
Data used in accident analysis often come from a recording of human voices and perhaps other sounds
(ruffling charts, turning knobs), which can be coupled to a greater or lesser extent with contemporaneous
system or process behavior. A voice trace, however, represents only a partial data record. Human behavior in
rich, unfolding settings is much more than the voice trace it leaves behind. The voice trace always points
beyond itself, to a world that was unfolding around the practitioners at the time, to tasks, goals, perceptions,
intentions, thoughts, and actions that have since evaporated. But most investigations are formally restricted in
how they can couple the cockpit voice recording to the world that was unfolding around the practitioners (e.g.,
instrument indications, automation-mode settings). In aviation, for example, International Civil Aviation
Organization (ICAO Annex 13) prescribes how only those data that can be factually established may be
analyzed in the search for cause. This provision often leaves the cockpit voice recording as only a factual,
decontextualized, and impoverished footprint of human performance. Making connections between the voice
trace and the circumstances and people in which it was grounded quickly falls outside the pale of official
analysis and into the realm of what many would call inference or speculation. This inability to make clear
connections between behavior and world straightjackets any study of the human contribution to a cognitively
noisy, evolving sequence of events. ICAO Annex 13 thus regulates the disembodiment of data: Data must be
studied away from their context, for the context and the connections to it are judged as too tentative, too
abstract, too unreliable. Such a provision, contradicted by virtually all cognitive psychological research, is
devastating to our ability to make sense of puzzling performance.
Apart from the provisions of ICAO Annex 13, this problem is complicated by the fact that current flight-data
recorders (FDRs) often do not capture many automation-related traces: precisely those data that are of
immediate importance to understanding the problem-solving environment in which most pilots today carry out
their work. For example, FDRs in many highly automated aircraft do not record which ground-based navigation
beacons were selected by the pilots, which automation-mode control-panel selections on airspeed, heading,
altitude, and vertical speed were made, or what was shown on both pilots' moving map displays. As operator
work has shifted to the management and supervision of a suite of automated resources, and problems leading
to accidents increasingly start in human-machine interactions, this represents a large gap in our ability to
access the reasons for human assessments and actions in modern operational workplaces.
INVERTING PERSPECTIVES
Knowing about and guarding against the psychologist's fallacy, against the mixing of realities, is critical to
understanding error. When looked at from the position of retrospective outsider, the error can look so very real,
so compelling. They failed to notice, they did not know, they should have done this or that. But from the point
of view of people inside the situation, as well as potential other observers, this same error is often nothing
more than normal work. If we want to begin to understand why it made sense for people to do what they did,
we have to reconstruct their local rationality. What did they know? What was their understanding of the
situation? What were their multiple goals, resource constraints, pressures? Behavior is rational within
situational contexts: People do not come to work to do a bad job.
As historian Barbara Tuchman put it: "Every scripture is entitled to be read in the light of the circumstances that
brought it forth. To understand the choices open to people of another time, one must limit oneself to what they
knew; see the past in its own clothes, as it were, not in ours" (1981, p. 75).
This position turns the exigent social and operational context into the only legitimate interpretive device. This
context becomes the constraint on what meaning we, who were not there when it happened, can now give to
past controversial assessments and actions. Historians are not the only ones to encourage this switch, this
inversion of perspectives, this persuasion to put ourselves in the shoes of other people. In hermeneutics it is
known as the difference between exegesis (reading out of the text) and eisegesis (reading into the text). The
point is to read out of the text what it has to offer about its time and place, not to read into the text what we
want it to say or reveal now. Jens Rasmussen points out that if we cannot find a satisfactory answer to
questions such as "how could they not have known?", then this is not because these people were behaving
bizarrely. It is because we have chosen the wrong frame of reference for understanding their behavior
(Vicente, 1999).
The frame of reference for understanding people's behavior is their own normal, individual work context, the
context they are embedded in and from whose point of view the decisions and assessments made are mostly
normal, daily, unremarkable, perhaps even unnoticeable. A challenge is to understand how assessments and
actions that from the outside look like errors become neutralized or normalized so that from the inside
they appear unremarkable, routine, normal. If we want to understand why people did what they did, then the
adequacy of the insider's representation of the situation cannot be called into question. The reason is that
there are no objective features in the domain on which we can base such a judgment. In fact, as soon as we
make such a judgment, we have imported criteria from the outside—from another time and place, from another
rationality. Ethnographers have always championed the point of view of the person on the inside.
Like Rasmussen, Emerson advised that, instead of using criteria from outside the setting to examine mistake
and error, we should investigate and apply local notions of competent performance that are honored and used
in particular social settings (Vaughan, 1999). This excludes generic rules and motherhoods (e.g.,
"pilots should be immune to commercial pressures"). Such putative standards ignore the subtle dynamics of
localized skills and priority setting, and run roughshod over what would be considered "good" or "competent" or
"normal" from inside actual situations. Indeed, such criteria impose a rationality from the outside, impressing a
frame of context-insensitive, idealized concepts of practice upon a setting where locally tailored and subtly
adjusted criteria rule instead.
The ethnographic distinction between etic and emic perspectives was coined in the 1950s to capture the
difference between how insiders view a setting and how outsiders view it. Emic originally referred to the
language and categories used by people in the culture studied, whereas etic language and categories were
those of outsiders (e.g., the ethnographer) based on their analysis of important distinctions. Today, emic is
often understood to be the view of the world from the inside out, that is, how the world looks from the eyes of
the person studied. The point of ethnography is to develop an insider's view of what is happening, an inside-
out view. Etic is contrasted as the perspective from the outside in, where researchers or observers attempt
to gain access to some portions of an insider's knowledge through psychological methods such as surveys or
laboratory studies.
Emic research considers the meaning-making activities of individual minds. It studies the multiple realities that
people construct from their experiences with their empirical reality. It assumes that there is no direct access
to a single, stable, and fully knowable external reality. Nobody has this access. Instead, all understanding of
reality is contextually embedded and limited by the local rationality of the observer. Emic research points at the
unique experience of each human, suggesting that any observer's way of making sense of the world is as valid
as any other, and that there are no objective criteria by which this sensemaking can be judged correct or
incorrect. Emic researchers resist distinguishing between objective features of a situation, and subjective ones.
Such a distinction distracts the observer from the situation as it looked to the person on the inside, and in fact
distorts this insider perspective.
A fundamental concern is to capture and describe the point of view of people inside a system or situation, to
make explicit that which insiders take for granted, see as common sense, find unremarkable or normal. When
we want to understand error, we have to embrace ontological relativity not out of philosophical intransigence or
philanthropy, but in order to get the inside-out view. We have to do this for the sake of learning what makes a
system safe or brittle. As we saw in chapter 2, for example, the notion of what constitutes an incident (i.e.,
what is worthy of reporting as a safety threat) is socially constructed, shaped by history, institutional
constraints, cultural and linguistic notions. It is negotiated among insiders in the system. None of the structural
measures an organization takes to put an incident reporting system in place will have any effect if insiders do
not see safety threats as incidents that are worth sending into the reporting system. Nor will the organization
ever really improve reporting rates if it does not understand the notion of incident (and, conversely, the notion
of normal practice) from the point of view of the people who do it every day.
To succeed at this, outsiders need to take the inside-out look; they need to embrace ontological relativity, as
only this can crack the code to system safety and brittleness. All the processes that set complex systems onto
their drifting paths toward failure—the conversion of signals of danger into normal, expected problems, the
incrementalist borrowing from safety, the assumption that past operational success is a guarantee of future
safety—are sustained through implicit social-organizational consensus, driven by insider language and
rationalizations. The internal workings of these processes are simply impervious to outside inspection, and
thereby numb to external pressure for change. Outside observers cannot attain an emic perspective, nor can
they study the multiple rationalities created by people on the inside if they keep seeing errors and violations.
Outsiders can perhaps get some short-term leverage by (re)imposing context-insensitive rules, regulations,
or exhortations and making moral appeals for people to follow them, but the effects are generally short-lived.
Such measures cannot be supported by operational ecologies.
There, actual practice is always under pressure to adapt in an open system, exposed to pressures of scarcity
and competition. It will once again inevitably drift into niches that generate greater operational returns at no
apparent cost to safety.
ERROR AND (IR)RATIONALITY
Understanding error against the background of local rationality, or rationality for that matter, has not been an
automatic by-product of studying the psychology of error. In fact, research into human error had a very
rationalist bias up to the 1970s (Reason, 1990), and in some quarters in psychology and human factors such
rationalist partiality has never quite disappeared.
Rationalist means that mental processes can be understood with reference to normative theories that describe
optimal strategies. Strategies may be optimal when the decision maker has perfect, exhaustive access to all
relevant information, takes time enough to consider it all, and applies clearly defined goals and preferences to
making the final choice. In such cases, errors are explained by reference to deviations from this rational norm,
this ideal. If the decision turns out wrong it may be because the decision maker did not take enough time to
consider all information, or that he or she did not generate an exhaustive set of choice alternatives to pick
from. Errors, in other words, are deviant. They are departures from a standard.
Errors are irrational in the sense that they require a motivational (as opposed to cognitive) component in their
explanation. If people did not take enough time to consider all information, it is because they could not be
bothered to. They did not try hard enough, and they should try harder next time, perhaps with the help of some
training or procedural guidance. Investigative practice in human factors is still rife with such rationalist reflexes.
It did not take long for cognitive psychologists to find out how humans could not or should not even behave like
perfectly rational decision makers. Whereas economists clung to the normative assumptions of decision
making (decision makers have perfect and exhaustive access to information for their decisions, as well as
clearly defined preferences and goals about what they want to achieve), psychology, with the help of artificial
intelligence, posited that there is no such thing as perfect rationality (i.e., full knowledge of all relevant
information, possible outcomes, relevant goals), because there is not a single cognitive system in the world
(neither human nor machine) that has sufficient computational capacity to deal with it all.
Rationality is bounded. Psychology subsequently started to chart people's imperfect, bounded, or local
rationality. Reasoning, it discovered, is governed by people's local understanding, by their focus of attention,
goals, and knowledge, rather than some global ideal. Human performance is embedded in, and systematically
connected to, the situation in which it takes place: It can be understood (i.e., makes sense) with reference to
that situational context, not by reference to some universal standard. Human actions and assessments can be
described meaningfully only in reference to the localized setting in which they are made; they can be
understood by intimately linking them to details of the context that produced and accompanied them. Such
research has given rationality an interpretive flexibility:
What is locally rational does not need to be globally rational. If a decision is locally rational, it makes sense
from the point of view of the decision maker, which is what matters if we want to learn about the underlying
reasons for what from the outside looks like error. The notion of local rationality removes the need to rely on
irrational explanations of error. Errors make sense: They are rational, if only locally so, when seen from the
inside of the situation in which they were made.
But psychologists themselves often have trouble with this. They keep on discovering biases and aberrations in
decision making (e.g., groupthink, confirmation bias, routine violations) that seem hardly rational, even from
within a situational context. These deviant phenomena require motivational explanations. They call for
motivational solutions. People should be motivated to do the right thing, to pay attention, to double check. If
they do not, then they should be reminded that it is their duty, their job. Notice how easily we slip back into
prehistoric behaviorism: Through a modernist system of rewards and punishments (job incentives, bonuses,
threats of retribution) we hope to mold human performance after supposedly fixed features of the world.
That psychologists continue to insist on branding such action as irrational, referring it back to some
motivational component, may be due to the limits of the conceptual language and power of the discipline.
Putatively motivational issues (such as deliberately breaking rules) must themselves be put back into context,
to see how human goals (getting the job done fast by not following all the rules to the letter) are made
congruent with system goals through a collective of subtle pressures, subliminal messages about
organizational preferences, and empirical success of operating outside existing rules.
The system wants fast turnaround times, maximization of capacity utilization, efficiency. Given those system
goals (which are often kept implicit), rule breaking is not a motivational shortcoming, but rather an indication
of a well-motivated human operator: Personal goals and system goals are harmonized, which in turn can lead
to total system goal displacement: Efficiency is traded off against safety. But psychology often keeps seeing
motivational shortcomings. And human factors keeps suggesting countermeasures such as injunctions to
follow the rules, better training, or more top-down task analysis. Human factors has trouble incorporating the
subtle but powerful influences of organizational environments, structures, processes, and tasks into accounts
of individual cognitive practices. In this regard the discipline is conceptually underdeveloped. Indeed, how
unstated cultural norms and values travel from the institutional, organizational level to express themselves in
individual assessments and actions (and vice versa) is a concern central to sociology, not human factors.
Bridging this macro-micro connection in the systematic production of rule violations means understanding
the dynamic interrelationships between issues as wide-ranging as organizational characteristics and
preferences, its environment and history; incrementalism in trading safety off against production; unintentional
structural secrecy that fragments problem-solving activities across different groups and departments; patterns
and representations of safety-related information that are used as imperfect input to organizational decision
making; the influence of hierarchies and bureaucratic accountability on people's choice; and others (e.g.,
Vaughan, 1996, 1999). The structuralist lexicon of human factors and system safety today has no words for
many of these concepts, let alone models for how they go together.
From Decision Making to Sensemaking
In another move away from rationalism and toward the inversion of perspectives (i.e., trying to understand the
world the way it looked to the decision maker at the time), large swathes of human factors have embraced the
ideas of naturalistic decision making (NDM) over the last decade. By importing cyclical ideas about cognition
(situation assessment informs action, which changes the situation, which in turn updates assessment, Neisser,
1976) into a structuralist, normativist psychological lexicon, NDM virtually reinvented decision making
(Orasanu & Connolly, 1993). The focus shifted from the actual decision moment, back into the preceding realm
of situation assessment.
This shift was accompanied by a methodological reorientation, where decision making and decision makers
were increasingly studied in their complex, natural environments. Real decision problems, it quickly turned
out, resist the rationalistic format dictated for so long by economics: Options are not enumerated exhaustively,
access to information is incomplete at best, and people spend more time assessing and measuring up
situations than making decisions—if that is indeed what they do at all (Klein, 1998). In contrast to the
prescriptions of the normative model, decision makers tend not to generate and evaluate several courses of
action concurrently, in order to then determine the best choice. People do not typically have clear or stable
sets of preferences along which they can even rank the enumerated courses of action, picking the best one,
nor do most complex decision problems actually have a single correct answer. Rather, decision makers in
action tend to generate a single option at a time, mentally simulate whether this option would work in practice,
and then either act on it, or move on to a new line of thought. NDM also takes the role of expertise more
seriously than previous decision-making paradigms: What distinguishes good decision makers from bad
decision makers most is their ability to make sense of situations by using a highly organized experience base
of relevant knowledge.
Once again neatly folding into ideas developed by Neisser, such reasoning about situations is more schema-
driven, heuristic, and recognitional than it is computational. The typical naturalistic decision setting does not
allow the decision maker enough time or information to generate perfect solutions with perfectly rational
calculations. Decision making in action calls for judgments under uncertainty, ambiguity and time pressure. In
those settings, options that appear to work are better than perfect options that never get computed.
The same reconstructive, corrective intervention into history that produces our clear perceptions of errors, also
generates discrete decisions. What we see as decision making from the outside is action embedded in
larger streams of practice, something that flows forth naturally and continually from situation assessment and
reassessment. Contextual dynamics are a joint product of how problems in the world are developing and the
actions taken to do something about it. Time becomes nonlinear: Decision and action are interleaved rather
than temporally segregated. The decision maker is thus seen as in step with the continuously unfolding
environment, simultaneously influenced by it and influencing it through his or her next steps.
Understanding decision making, then, requires an understanding of the dynamics that lead up to those
supposed decision moments, because by the time we get there, the interesting phenomena have evaporated,
gotten lost in the noise of action.
NDM research is front-loaded: it studies the front end of decision making, rather than the back end. It is
interested, indeed, in sensemaking more than in decision making.
Removing decision making from the vocabulary of human factors investigations is the logical next step,
suggested by Snook (2000). It would be an additional way to avoid counterfactual reasoning and
judgmentalism, as decisions that eventually led up to a bad outcome all too quickly become bad
decisions:
Framing such tragedies as decisions immediately focuses our attention on an individual making choices . . .
such a framing puts us squarely on a path that leads straight back to the individual decision maker, away from
the potentially powerful contextual features and right back into the jaws of the fundamental attribution error.
"Why did they decide . . . ?" quickly becomes "Why did they make the wrong decision?" Hence, the attribution
falls squarely onto the shoulders of the decision maker and away from potent situational factors that
influence action. Framing the . . . puzzle as a question of meaning rather than deciding shifts the emphasis
away from individual decision makers toward a point somewhere "out there" where context and individual
action overlap. (Snook, p. 206)
Yet sensemaking is not immune to counterfactual pressure either. If what made sense to the person inside the
situation still makes no sense given the outcome, then human factors hastens to point that out (see chap. 5).
Even in sensemaking, the normativist bias is an ever-present risk.
THE HINDSIGHT BIAS IS NOT A BIAS AND IS NOT ABOUT THE PAST
Perhaps the pull in the direction of the position of retrospective outsider is irresistible, inescapable, whether we
make lexical adjustments in our investigative repertoire or not. Even with the potentially judgmental notion of
decision making removed from the forensic psychological toolbox, it remains incredibly difficult to see the past
in its own clothes, not in ours. The fundamental attribution error is alive and well, as Scott Snook puts it (2000,
p. 205). We blame the human in the loop and underestimate the influence of context on performance, despite
repeated warnings of this frailty in our reasoning. Perhaps we are forever unable to shed our own projection of
reality onto the circumstances of people at another time and place. Perhaps we are doomed to digitizing past
performance, chunking it up into discrete decision moments that inevitably lure us into counterfactual thinking
and judgments of performance instead of explanations. Just as any act of observation changes the observed,
our very observations of the past inherently intervene in reality, converting complex histories into more linear,
more certain, and disambiguated chronicles.
The mechanisms described earlier in this chapter may explain how hindsight influences our treatment of
human performance data, but they hardly explain why. They hardly shed light on the energy behind the
continual pull toward the position of retrospective outsider; they merely sketch out some of the routes that lead
to it. In order to explain failure, we seek failure. In order to explain missed opportunities and bad choices, we
seek flawed analyses, inaccurate perceptions, violated rules—even if these were not thought to be influential
or obvious or even flawed at the time (Starbuck & Milliken, 1988). This search for failures is something we
cannot seem to escape. It is enshrined in the accident models popular in transportation human factors of our
age (see chaps. 1 and 2) and proliferated in the fashionable labels for "human error" that
human factors keeps inventing (see chap. 6). Even where we turn away from the etic pitfalls of looking into
people's decision making, and focus on a more benign, emic, situated sensemaking, the rationalist, normativist
perspective is right around the corner. If we know the outcome was bad, we can no longer look objectively at
the behavior leading up to it: that behavior, too, must have been bad (Fischhoff, 1975). To get an idea, think of the Greek
mythological figure Oedipus, who shared Jocasta's bed.
How large is the difference between Oedipus' memory of that experience before and after he found out that
Jocasta was his mother? Once he knew, it was simply
impossible for him to look at the experience the same way. What had he missed? Where did he not do his
homework? How could he have become so distracted? Outcome knowledge afflicts all retrospective observers,
no matter how hard we try not to let it influence us. It seems that bad decisions always have something in
common, and that is that they all seemed like a good idea at the time. But try telling that to Oedipus.
The Hindsight Bias Is an Error That Makes Sense, Too
When a phenomenon is so impervious to external pressure to change, one would begin to suspect that it has
some adaptive value, that it helps us preserve something, helps us survive. Perhaps the hindsight bias is not a
bias, and perhaps it is not about history. Instead, it may be a highly adaptive, forward looking, rational
response to failure. This putative bias may be more about predicting the future than about explaining the past.
The linearization and simplification that we see happen in the hindsight bias may be a form of abstraction that
allows us to export and project our and others' experiences onto future situations. Future situations can never
be predicted at the same level of contextual detail as the new view encourages us to explain past situations.
Predictions are possible only because we have created some kind of model for the situation we wish to gain
control over, not because we can exhaustively foresee every contextual factor, influence, or data point. This
model—any model—is an abstraction away from context, an inherent simplification. The model we create
naturally, effortlessly, automatically after past events with a bad outcome inevitably becomes a model of binary
choices, bifurcations, and unambiguous decision moments. That is the only useful kind of model we can take
with us into the future if we want to guard against the same type of pitfalls and forks in the road.
The hindsight bias, then, is about learning, not about explaining.
It is forward looking, not backward looking. This applies to ourselves and our own failures as much as it applies
to our observations of other people's failures. When confronted by failures that occurred to other people, we
may imperatively be tripped into vicarious learning, spurred by our own urge for survival: What do I do to keep
that from happening to me? When confronted by our own performance, we have no privileged insight into our
own failures, even if we would like to think we do. The past is the past, whether it is our own or somebody
else's. Our observations of the past inevitably intervene and change the observed, no matter whose past it is.
This is something that the fundamental attribution error cannot account for. It explains how we overestimate
the influence of stable, personal characteristics when we look at other people's failings. We underplay the
influence of context or situational factors when others do bad things.
But what about our own failings? Even here we are susceptible to reframing past complexity as simple binary
decisions, wrong decisions due to personal shortcomings: things we missed, things we should have done or
should not have done. Snook (2000) investigated how, in the fog of post Gulf War Iraq, two helicopters
carrying U.N. peacekeepers were shot down by American fighter jets. The situation in which the shoot-down
occurred was full of risk, role ambiguity, operational complexity, resource pressure, slippage between plans
and practice. Yet immediately after the incident, all of this gets converted into binary simplicity (a choice to err
or not to err) by DUKE—the very command onboard the airborne control center whose job it was not to have
such things happen. Allowing the fighters to shoot down the helicopters was their error, yet they do not blame
context at all, as the fundamental attribution error predicts they should. It was said of the DUKE that
immediately after the incident: "he hoped we had not shot down our own helicopters and that he couldn't
believe anybody could make that dumb a mistake" (Snook, p. 205). It is DUKE himself who blames his own
dumb mistake. As with the errors in chapter 3, the dumb mistake is something that jumps into view only with
knowledge of outcome, its mistakeness a function of the outcome, its dumbness a function of the severity of
the consequences.
While doing the work, helping guide the fighters, identifying the targets, all DUKE was doing was his job. It was
normal work. He was not sitting there making dumb mistakes. They are a product of hindsight, his own
hindsight, directed at his own "mistakes." The fundamental attribution error does not apply. It is overridden.
The fighter pilots, too, engage in self-blame, literally converting the ambiguity, risk, uncertainty, and pressure
of their encounter with potentially hostile helicopters into a linear series of decision errors, where they
repeatedly and consistently took wrong turns on their road to perdition (we misidentified, we engaged, and we
destroyed): "Human error did occur. We misidentified the helicopters; we engaged them; and we destroyed
them. It was a tragic and fatal mistake" (Tiger 02 quoted in Snook, 2000, p. 205).
Again, the fundamental attribution error makes the wrong prediction. If it were true, then these fighter pilots
would tend to blame context for their own errors. Indeed, it was a rich enough context—fuzzy, foggy,
dangerous, multi-player, pressurized, risky—with plenty of blameworthy factors to go around, if that is where
you would look. Yet these fighter pilots do not. "We" misidentified, "we" engaged, "we" destroyed. The pilots
had the choice not to; in fact, they had a series of three choices not to instigate a tragedy.
But they did. Human error did occur. Of course, elements of self-identity and control are wrapped up in such an
attribution, a self-identity for which fighter pilots may well be poster children.
It is interesting to note that the tendency to convert past complexity into binary simplicity—into twofold choices
to identify correctly or incorrectly, to engage or not, to destroy or not—overrides the fundamental attribution
error. This confirms the role of the hindsight bias as a catalyst for learning. Learning (or having learned)
expresses itself most clearly by doing something differently in the future, by deciding or acting differently, by
removing one's link in the accident chain, as fighter pilot Tiger 02 put it: "Remove any one link in the chain and
the outcome would be entirely different. I wish to God I could go back and correct my link in this chain, my
actions which contributed to this disaster" (Tiger 02, quoted in Snook, 2000, p. 205).
We cannot undo the past. We can only undo the future.
But undoing the future becomes possible only when we have abstracted away past failures, when we have
decontextualized them, stripped them, cleaned them from the fog and confusion of past contexts, highlighted
them, blown them up into obvious choice moments that we, and others, had better get right next time around.
Prima facie, the hindsight bias is about misassessing the contributions of past failings to bad outcomes. But if
the phenomenon is really as robust as it is documented to be and if it actually manages to override the
fundamental attribution error, it is probably the expression of more primary mechanisms running right beneath
its surface. The hindsight bias is a meaningful adaptation. It is not about explaining past failures. It is about
preventing future ones. In preparing for future confrontations with situations where we or others might err
again, and do not want to, we are in some sense taking refuge from the banality of accidents thesis.
The thought that accidents emerge from murky, ambiguous, everyday decision making renders us powerless
to do anything meaningful about it. This is where the hindsight bias is so fundamentally adaptive. It highlights
for us where we could fix things (or where we think we could fix things), so that the bad thing does not happen
again. The hindsight bias is not a bias at all, in the sense of a departure from some rational norm. The
hindsight bias is rational. It in itself represents and sustains rationality. We have to see the past as a binary
choice, or a linear series of binary choices, because that is the only way we can have any hope of controlling
the future. There is no other basis for learning, for adapting. Even if those adaptations may consist of rather
coarse adjustments, of undamped and overcontrolling regulations, even if these adaptations occur at the cost
of making oversimplified predictions. But making oversimplified predictions of how to control the future is
apparently better than having no predictions at all.
Quite in the spirit of Saint Augustine, we accept the reality of errors, and the guilt that comes with it, in the
quest for control over our futures. Indeed, the human desire to attain control over the future surely predates the
Scientific Revolution. The more refined and empirically testable tools for gaining such control, however, were
profoundly influenced and extended by it. Control could best be attained through mechanization and
technology, away from nature and spirit, away from primitive incantations to divine powers to spare us the next
disaster. These Cartesian-Newtonian reflexes have tumbled down the centuries to proffer human factors
legitimate routes for gaining control over complex, dynamic futures today and tomorrow.
For example, when we look at the remnants of a crashed automated airliner, we, in hindsight, exclaim, "they
should have known they were in open descent mode!" The legitimate solution for meeting such technology
surprises is to throw more technology at the problem (additional warning systems, paperless cockpits,
automatic cocoons). But more technology often creates more problems of a kind we have a hard time
anticipating, rather than just solving existing ones.
As another example, take the error-counting methods discussed in chapter 3. A more formalized way of turning
the hindsight bias into an oversimplified forward looking future controller is hardly imaginable. Errors, which are
uniquely the product of retrospective observations conducted from the outside, are measured, categorized,
and tabulated. This produces bar charts that putatively point toward a future, jutting their dire predictions of rule
violations or proficiency errors out into a dark and fearful time to come, away from a presumed "safe" baseline.
It is normativism in pretty forms and colors.
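To make the critique concrete, the following is a minimal sketch (in Python) of the kind of tabulation such error-counting methods perform; the category names, counts, and "baseline" are hypothetical illustrations, not data from any real tool or study. The point is only to show the logic the text questions: situated history is reduced to category counts, and the counts are read as predictions about the future.

    from collections import Counter

    # Hypothetical "errors", already labeled by a retrospective outside observer.
    observed_errors = [
        "proficiency", "procedural", "communication", "procedural",
        "procedural", "proficiency", "decision",
    ]

    # Tabulate: the messy, situated history is reduced to category counts.
    counts = Counter(observed_errors)

    # Read the counts against a presumed "safe" baseline and treat the
    # deviations as a forecast of where the system will fail next.
    baseline = 1  # hypothetical acceptable count per category
    for category, n in counts.most_common():
        print(f"{category:15s} count={n} deviation={n - baseline:+d}")

    # Note: nothing in this tabulation says why any of these actions made
    # sense at the time; the context that produced them has been stripped away.

Whether rendered as bar charts or printouts, such output can only ever be as meaningful as the retrospective labeling that feeds it, which is precisely the concern raised in the quotation that follows.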
These forecasting techniques, which are merely an assignment of categories and numbers to the future, are
appearing everywhere. However, their categorical and numerical output can at best be as adequate or as
inadequate as the input. Using such forecasts as a strategic tool is only a belief that numbers are meaningful in
relation to the fearful future. Strategy becomes a matter of controlling the future by labelling it, rather than
continually re-evaluating the uncertain situation. This approach, searching for the right and numerical label
to represent the future, is more akin to numerology or astrology. It is the modern-day ritual equivalent of
"reading the runes" or "divining the entrails." (Angell & Straub, 1999, p. 184)
Human factors holds on to the belief that numbers are meaningful in relation to a fearful future. And why not?
Measuring the present and mathematically modeling it (with bar charts, if you must), and thereby predicting
and controlling the future has been a legitimate pursuit since at least the 16th century. But as chapters 1, 2,
and 3 show, such pursuits are getting to be deeply problematic. In an increasingly complex, dynamic
sociotechnical world, their predictive power is steadily eroding. It is not only a problem of garbage in, garbage
out (the categorical and numerical output is as adequate or inadequate as the input). Rather, it is the problem
of not seeing that we face an uncertain situation in the first place, where mistake, failure, and disaster are
incubated in systems much larger, much less transparent, and much less deterministic than the counters of
individual errors believe. This, of course, is where the hindsight bias remains a bias.
But it is a bias about the future, not about the past. We are biased to believe that thinking about action in terms
of binary choices will help us undo bad futures, that it prepares us sufficiently for coming complexity. It does
not. Recall how David Woods (2003) put it: Although the past is incredible (DUKE couldn't believe anybody
could make that dumb a mistake), the future is implausible. Mapping digitized historic lessons of failure (which
span the arc from error bar charts to Tiger 02's wish to undo his link in the chain) into the future will only be
partly effective. Stochastic variation and complexity easily outrun our computational capacity to predict with
any accuracy.
PRESERVATION OF SELF- AND SYSTEM IDENTITY
There is an additional sense in which our dealings with past failures go beyond merely understanding what
went wrong and preventing recurrence. Mishaps are surprising relative to prevailing beliefs and assumptions
about the system in which they happen, and investigations are inevitably affected by the concern to reconcile a
disruptive event with existing views and beliefs about the system. Such reconciliation is adaptive too. Our
reactions to failure, and our investigations into failure, must be understood against the backdrop of the
"fundamental surprise error" (Lanir, 1986) and examined for the role they play in it. Accidents tend to create a
profound asymmetry between our beliefs (or hopes) of a basically safe system, and new evidence that may
suggest that it is not. This is the fundamental surprise: the astonishment that we feel when the most basic
assumptions we held true about the world may turn out to be untrue. The asymmetry creates a tension, and
this tension creates pressure for change: Something will have to give. Either the belief needs changing (i.e.,
we have to acknowledge that the system is not basically safe—that mistake, mishap, and disaster are
systematically organized by that system; Vaughan, 1996), or we change the people involved in the mishap
even if this means us. We turn them into unrepresentative, uniquely bad individuals:
The pilots of a large military helicopter that crashed on a hillside in Scotland in 1994 were found guilty of gross
negligence. The pilots did not survive—29 people died in total—so their side of the story could never be heard.
The official inquiry had no problems with "destroying the reputation of two good men," as a fellow pilot put it.
Potentially fundamental vulnerabilities (such as 160 reported cases of Uncommanded Flying Control
Movement or UFCM in computerized helicopters alone since 1994) were not looked into seriously.
(Dekker, 2002, p. 25)
When we elect to "destroy the reputation of two good men," we have committed the fundamental surprise
error. We have replaced a fundamental challenge to our assumptions, our beliefs (the fundamental surprise)
with a mere local one: The pilots were not as good as we thought they were, or as good as they should have
been. From astonishment (and its concomitant: fear about the basic safety of the system, as would be
raised by 160 cases of UFCM) we move to mere, local wonder: How could they not have seen the hill? They
must not have been very good pilots after all. Thus we strive to preserve our self- and system identity. We
pursue an adaptive strategy of safeguarding the essence of our world as we understand it. By letting the
reputation of individual decision makers slip, we have relieved the tension between broken beliefs (the system
is not safe after all) and fervent hopes that it still is.
That phenomena such as the hindsight bias and the fundamental attribution error may not be primary, but
rather ancillary expressions of more adaptive, locally rational, and useful identity-preserving strategies for the
ones committing them, is consonant with observations of a range of reasoning errors. People keep committing
them not because they are logical (i.e., globally rational) or because they only produce desired effects, but
because they serve an even weightier purpose: "This dynamic, this 'striving to preserve identity,' however
strange the means or effects of such striving, was recognised in psychiatry long ago. [This phenomenon] is
seen not as primary, but as attempts (however misguided) at restitution, at reconstructing a world reduced by
complete chaos" (Sacks, 1998, p. 7).
However "strange the means or effects of such striving," the fundamental surprise error allows us to rebuild a
world reduced by chaos. And the hindsight bias allows us to predict and avoid future roads to perdition.
Through the fundamental surprise error, we rehabilitate our faith in something larger than ourselves, something
in which we too are vulnerable to breakdown, something that we too are at the mercy of in varying degrees.
Breaking out of such locally rational reasoning, where the means and consequences of our striving for
preservation and rehabilitation create strange and undesirable side effects (blaming individuals for system
failures, not learning from accidents, etc.) requires extraordinary courage. It is not very common.
Yet people and institutions may not always commit the fundamental surprise error, and may certainly not do so
intentionally. In fact, in the immediate aftermath of failure, people may be willing to question their underlying
assumptions about the system they use or operate.
Perhaps things are not as safe as previously thought; perhaps the system contains vulnerabilities
and residual weaknesses that could have spawned this kind of failure earlier, or worse, could do it again. Yet
such openness does not typically last long. As the shock of an accident subsides, parts of the system mobilize
to contain systemic self-doubt and change the fundamental surprise into a merely local hiccup that temporarily
ruffled an otherwise smooth operation. The reassurance is that the system is basically safe. It is only some
people or other parts in it that are unreliable. In the end, it is not often that an existing view of a system gives in
to the reality of failure. Instead, to redress the asymmetry, the event or the players in it are changed to fit
existing assumptions and beliefs about the system, rather than the other way around. Expensive lessons about
the system as a whole, and the subtle vulnerabilities it contains, can go completely unlearned.
Our inability to deal with the fundamental surprise of failure shines through the investigations we commission.
The inability to really learn is sometimes legitimized and institutionalized through resource-intensive official
investigations. The cause we end up attributing to an accident may sometimes be no more than the "cause" we
can still afford, given not just our financial resources, but also our complex of hopes and beliefs in a safe and
fair world. As Perrow (1984) has noted: Formal accident investigations usually start with an assumption that
the operator must have failed, and if this attribution can be made, that is the end of serious inquiry. Finding that
faulty designs were responsible would entail enormous shutdown and retrofitting costs; finding that
management was responsible would threaten those in charge, but finding that operators were responsible
preserves the system, with some soporific injunctions about better training. (p. 146)
Real change in the wake of failure is often slow to come. Few investigations have the courage to really
challenge our beliefs. Many keep feeding the hope that the system is still safe—except for this or that little
broken component, or this or that Bad Apple. The lack of courage shines through how we deal with human
error, through how we react to failure. It affects the words we choose, the rhetoric we rely on, the pathways for
"progress" we put our bets on.
Which cause can we afford? Which cause renders us too uncomfortable?
Accuracy is not the dominant criterion; plausibility is: plausibility from the perspective of those who have to
accommodate the surprise that the failure represents for them, their organization, and their worldview. Asking
"Is it plausible?" is the same as asking, "Can we live (on) with this explanation?
Does this explanation help us come to terms with the puzzle of bad performance?"
Answering this question, and generating such comfort and self-assurance, is one purpose that our analysis of
past failures has to fulfill, even if it becomes a selective oversimplification because of it; even if, in the
words of Karl Weick (1995), they make lousy history.
Chapter 5
If You Lose Situation Awareness, What Replaces It?
The hindsight bias has ways of getting entrenched in human factors thinking.
One such way is our vocabulary. "Losing situation awareness" or "deficient situation awareness" have become
legitimate characterizations of cases where people did not exactly know where they were or what was going
on around them. In many applied as well as some scientific settings, it is acceptable to submit "loss of situation
awareness" as an explanation for why people ended up where they should not have ended up, or why they did
what they, in hindsight, should not have done. Navigational incidents and accidents in transportation represent
one category of cases where the temptation to rely on situation awareness as an elucidatory construct appears
irresistible. If people end up where they should not, or where they did not intend to end up, it is easy to see that
as a deficient awareness of the cues and indications around them. It is easy to blame a loss of situation
awareness.
One such accident happened to the Royal Majesty, a cruise ship that was sailing from Bermuda to Boston in
the summer of 1995. It had more than 1,000 people on board. Instead of Boston, the Royal Majesty ended up
on a sandbank close to the Massachusetts shore. Without the crew noticing, it had drifted 17 miles off course
during a day and a half of sailing (see Fig. 5.1).
Investigators discovered afterward that the ship's autopilot had defaulted to DR (Dead Reckoning) mode (from
NAV, or Navigation mode) shortly after departure. DR mode does not compensate for the effects of wind and
other drift (waves, currents), which NAV mode does. A northeasterly wind pushed the ship steadily off its
course, to the side of its intended track.
FIG. 5.1. The difference between where the Royal Majesty crew thought they were headed (Boston) and where they actually ended up: a sandbank near Nantucket.
The U.S. National Transportation Safety Board investigation into the accident judged that "despite repeated indications, the crew failed to recognize numerous opportunities to detect that the vessel had drifted off track" (NTSB, 1995, p. 34). But "numerous opportunities" to detect the nature of the real situation become clear only in hindsight.
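Part of why those opportunities are clear only in hindsight is how slowly the divergence builds. Here is a minimal sketch of that buildup, assuming an illustrative, constant drift rate; only the roughly 17-mile offset over about a day and a half comes from the case description, everything else is a hypothetical number for illustration:

    # Minimal sketch: cumulative cross-track error when an autopilot holds heading
    # (DR mode) and drift goes uncorrected. The drift rate is an assumed figure;
    # only the ~17 nm offset over roughly a day and a half comes from the report.
    hours_sailed = 34            # roughly a day and a half under way
    drift_rate_knots = 0.5       # assumed steady sideways push from wind and current
    cross_track_error_nm = hours_sailed * drift_rate_knots
    print(f"{cross_track_error_nm:.0f} nm off the intended track")   # about 17 nm

At half a knot of assumed drift, the offset grows by only about 15 meters a minute, one way of seeing why nothing in the moment-to-moment picture ever looked alarming to the crew.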
With hindsight, once we know the outcome, it becomes easy to pick out exactly those clues and indications
that would have shown people where they were actually headed. If only they had focused on this piece of data,
or put less confidence in that indication, or had invested just a little more energy in examining this anomaly,
then they would have seen that they were going in the wrong direction. In this sense, situation awareness is a
highly functional or adaptive term for us, for those struggling to come to terms with the rubble of a navigational
accident. Situation awareness is a notation that assists us in organizing the evidence available to people at the
time, and can provide a starting point for understanding why this evidence was looked at differently, or not at
all, by those people. Unfortunately, we hardly ever push ourselves to such understanding. Loss of situation
awareness is accepted as a sufficient explanation too quickly and too often, and in those cases it amounts to nothing
more than saying "human error" under a fancy new label.
The kinds of notations that are popular in various parts of the situation-awareness literature are one indication
that we quickly stop investigating, stop researching any further, once we have found human error under that new
guise. Venn-type diagrams, for example, can point out the mismatch between actual and ideal situation
awareness. They illustrate the difference between what people were aware of in a particular situation, and
what they could (or should) ideally have been aware of (see Fig. 5.2). Once we have found a mismatch
between what we now know about the situation (the large circle) and what people back then apparently knew
about it (the small one), that in itself is explanation enough. They did not know, but they could or should have
known. This does not apply only in retrospect, by the way. Even design problems can be clarified through this
notation, and performance predictions can be made on its basis.
When the aim is designing for situation awareness, the Venn diagram can show what people should
pick up in a given setting, versus what they are likely to actually pick up. In both cases, awareness is a
relationship between that which is objectively available to people in the outside world on the one hand, and
what they take in, or understand about it, on the other. Terms such as deficient situation awareness or loss of
situation awareness confirm human factors' dependence on a kind of subtractive model of awareness. The
Venn diagram notation can also be expressed in an equation that reflects this:
Loss of SA = f(large circle - small circle) (1)
In Equation 1, "loss of SA" equals "deficient SA," and SA stands for situation awareness. This also reveals the
continuing normativist bias in our understanding of human performance. Normativist theories aim to explain
mental processes by reference to ideals or normative standards that describe optimal strategies. The large Venn
circle is the norm, the standard, the ideal. Situation awareness is explained by reference to that ideal: Actual
situation awareness is a subtraction from that ideal, a shortfall, a deficit, indeed, a "loss." Equation 1 then
becomes Equation 2:
Loss of SA = f(what I know now - what you knew then) (2)
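Read literally, the notation is just a set difference. The following is a minimal sketch of that reading; the listed "elements" are hypothetical, and treating awareness as a list of discrete items is precisely the assumption at issue:

    # A minimal sketch of the subtractive (Venn) model criticized here.
    # All "elements" below are hypothetical placeholders.
    ideal_sa = {"position", "heading", "autopilot mode", "wind drift", "nearby traffic"}   # the large circle
    actual_sa = {"position", "heading"}                                                    # the small circle
    loss_of_sa = ideal_sa - actual_sa     # Equation 1 taken literally, as a set difference
    print(sorted(loss_of_sa))             # the "deficit" that hindsight points to

The computation is trivial; the chapter's point is that the large circle can be drawn only after the fact, by someone who already knows which elements turned out to matter.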
Loss of situation awareness, in other words, is the difference between what I know about a situation now
(especially the bits highlighted by hindsight) and what somebody else apparently knew about that situation
then. Interestingly, situation awareness is nothing by itself, then. It can only be expressed as a relative,
normativist function, for example, the difference between what people apparently knew back then versus what
they could or should have known (or what we know now). Other than highlighting the mismatch between the
large and the little circle in a Venn diagram, other than revealing the elements that were not seen but could or
should have been seen and understood, there is little there. Discourse and research around situation
awareness may so far have shed little light on the actual processes of attentional dynamics. Rather, situation
awareness has given us a new normativist lexicon that provides large, comprehensive notations for
perplexing data (How could they not have noticed? Well, they lost SA).
Situation awareness has legitimized the proliferation of the hindsight bias under the pretext of adding to the
knowledge base. None of this may really help our understanding of awareness in complex dynamic situations,
instead truncating deeper research, and shortchanging real insight.
THE MIND-MATTER PROBLEM
Discourse about situation awareness is a modern installment of an ancient debate in philosophy and
psychology, about the relationship between matter and mind. Surely one of the most vexing problems, the
coupling between matter and mind, has occupied centuries of thinkers. How is it that we get data from the
outside world inside our minds? What are the processes by which this happens? And how can the products of
these processes be so divergent (I see other things than you, or other things than I saw yesterday)? All
psychological theories, including those of situation awareness, implicitly or explicitly choose a position relative
to the mindmatter problem.
Virtually all theories of situation awareness rely on the idea of correspondence: a match, or correlation,
between an external world of stimuli (elements) and an internal world of mental representations (which gives
meaning to those stimuli). The relationship between matter and mind, in other words, is one of letting the mind
create a mirror, a mental simile, of matter on the outside. This allows a further elaboration of the Venn diagram
notation: Instead of "ideal" versus "actual" situation awareness, the captions of the circles in the diagram could
read "matter" (the large circle) and "mind" (the little circle). Situation awareness is the difference between what
is out there in the material world (matter) and what the observer sees or understands (mind). Equation 1 can
be rewritten as Equation 3:
Loss of SA = f(matter - mind) (3)
Equation 3 describes how a loss of situation awareness, or deficient situation awareness, is a function of
whatever was in the mind subtracted from what was available matter. The portion of matter that did not make it
into mind is lost, it is deficient awareness. Such thinking, of course, is profoundly Cartesian. It separates mind
from matter as if they are distinct entities: a res cogitans and a res extensa. Both exist as separate essentials
of our universe, and one serves as the echo, or imitation, of the other.
One problem of such dualism lies, of course, in the assumptions it makes. An important assumption is what
Feyerabend called the "autonomy principle" (see chap. 3): that facts exist in some objective world, equally
accessible to all observers. The autonomy principle is what allows researchers to draw the large circle of the
Venn diagram: It consists of matter available to, as well as independent from, any observer, whose awareness
of that matter takes the form of an internal simile. This assumption is heavily contested by radical empiricists.
Is there really a separation between res extensa and res cogitans? Can we look at matter as something "out
there," as something independent of (the minds of) observers, as something that is open to enter the
awareness of anyone? If the autonomy principle is right, then deficient situation awareness is a result of the
actual SA Venn circle being too small, or misdirected relative to the large (ideal SA) Venn circle. But if you lose
situation awareness, what replaces it?
No theories of cognition today can easily account for a mental vacuum, for empty-headedness. Rather, people
always form an understanding of the situation unfolding around them, even if this understanding can, in
hindsight, be shown to have diverged from the actual state of affairs. This does not mean that this mismatch
has any relevance in explaining human performance at the time. For people doing work in a situated context,
there is seldom a mismatch, if ever. Performance is driven by the desire to construct plausible, coherent
accounts, a good story of what is going on. Weick (1995) reminded us that such sensemaking is not about
accuracy, about achieving an accurate mapping between some objective, outside world and an inner
representation of that world. What matters to people is not to produce a precise internalized simile of an
outside situation, but to account for their sensory experiences in a way that supports action and goal
achievement. This transforms the challenge of understanding situation awareness: from studying the accuracy of
the mapping between an external and an internal world, to investigating
why people thought they were in the right place, or had the right assessment of the situation around them.
What made that so? The adequacy or accuracy of an insider's representation of the situation cannot be called
into question: It is what counts for him or her, and it is that which drives further action in that situation. The
internal, subjective world is the only one that exists. If there is an objective, external reality, we could not know
it.
Getting Lost at the Airport
Let us now turn to a simple case, to see how these things play out. Runway incursions (aircraft taxiing onto
runways for which they did not have a clearance) are an acute category of such cases in transportation today.
Runway incursions are seen as a serious and growing safety problem worldwide, especially at large, controlled
airports (where air-traffic control organizes and directs traffic movements). Hundreds of incursions occur every
year, some leading to fatal accidents. Apart from straying onto a runway without a clearance, the risk of
colliding with something else at an airport is considerable. Airports are tight and dynamic concentrations of
cars, buses, carts, people, trucks, trucks plus aircraft, and aircraft, all moving at speeds varying from a few
knots to hundreds of knots. (And then fog can settle over all of that). The number of things to hit is much larger
on the ground than it is in the air, and the proximity to those things is much closer. And because of the layout
of taxiways and ramps, navigating an aircraft across an airport can be considerably more difficult than
navigating it in flight. When runway incursions occur, it can be tempting to blame a loss of situation
awareness. Here is one such case, not a runway incursion, but a taxiway incursion. This case is illustrative not
only because it is relatively simple, but also because all regulations had been followed in the design and layout
of the airport. Safety cases had been conducted for the airport, and it had been certified as compliant with all
relevant rules. Incidents in such an otherwise safe system can happen even when everybody follows the rules.
This incident happened at Stockholm Arlanda (the international airport) in October 2002. A Boeing 737 had
landed on runway 26 (the opposite of runway 08, which can be seen in Fig. 5.3) and was directed by air traffic
control to taxi to its gate via taxiways ZN and Z (called "Zulu November" and "Zulu," in aviation speak). The
reason for taking ZN was that a tow truck with an aircraft was coming from the other direction. It had been
cleared to use ZP (Zulu Papa) and then to turn right onto taxiway X (X-ray). But the 737 did not take ZN. To
the horror of the tow truck driver, it carried on following ZP instead, almost straight into the tow truck. The
pilots saw the truck in time, however, and managed to stop. The truck driver had to push his aircraft backward
in order to clear up the jam. Did the pilots lose situation awareness? Was their situation awareness deficient?
There were signs pointing out where taxiway ZN ran, and those could be seen from the cockpit. Why did the
crew not take these cues into account when coming off the runway?
Such questions consistently pull us toward the position of retrospective outsider, looking down onto the
developing situation from a God's-eye point of view. From there we can see the mismatch grow between
where people were and where they thought they were. From there we can easily draw the circles of the Venn
diagram, pointing out a deficiency or a shortcoming in the awareness of the people in question. But none of
that explains much. The mystery of the matter-mind problem is not going to go away just because we say that
other people did not see what we now know they should have seen. The challenge is to try to understand why
the crew of the 737 thought that they were right—that they were doing exactly what air-traffic control had told
them to do: follow taxiway ZN to Z.
The commitment of an antidualist position is to try to see the world through the eyes of the protagonists, as
there is no other valid perspective. The challenge with navigational incidents, indeed, is not to point out that
people were not in the spot they thought they were, but to explain why they thought they were right. The
challenge is to begin to understand on the basis of what cues people (thought) they knew where they were.
The first clue can be found in the response of the 737 crew after they had been reminded by the tower to follow
ZN (they had now stopped, facing the tow truck head-on). "Yeah, it's the chart here that's a little strange," said
one of the pilots (Statens Haverikommision, 2003, p. 8). If there was a mismatch, it was not between the actual
world and the crew's model of that world. Rather, there was a mismatch between the chart in the cockpit and
the actual airport layout. As can be seen in Fig. 5.3, the taxiway layout contained a little island, or roundabout,
between taxiways Zulu and X-ray. ZN and ZP were the little bits going between X-ray to Zulu, around the
roundabout. But the chart available in the cockpit had no little island on it. It showed no roundabout (see Fig.
5.4).
Even here, no rules had been broken. The airport layout had recently changed (with the addition of the little
roundabout) in connection with the construction of a new terminal pier. It takes time for the various charts
to be updated, and this simply had not happened yet at the company of the crew in question. Still, how could
the crew of the 737 have ended up on the wrong side of that area (ZP instead of ZN), whether there was an
island shown on their charts or not? Figure 5.5 contains more clues. It shows the roundabout from the height of
a car (which is lower than a 737 cockpit, but from the same direction as an aircraft coming off of runway 26).
The crew in question went left of the roundabout, where they should have gone right. The roundabout is covered
in snow, which makes it look like just another of the (real) islands separating taxiways Zulu and X-ray. These
other islands consist of grass, whereas the roundabout, with a diameter of about 20 meters, is the same
tarmac as that of the taxiways. Shovel snow onto all of them, however, and they look indistinguishable. The
roundabout is no longer a circle painted on the tarmac: It is an island like all others. Without a roundabout on
the cockpit chart, there is only one plausible explanation for what the island in Fig. 5.5 ahead of the aircraft is:
It must be the grassy island to the right of taxiway ZN.
FIG. 5.4. The chart available in the Boeing 737 cockpit at the time. It shows no little roundabout, or island, separating taxiways ZN from ZP (Statens Haverikommision, 2003).
FIG. 5.5. The roundabout as seen coming off runway 26. The Boeing 737 went left around the roundabout, instead of right (Statens Haverikommision, 2003).
In other words, the crew
knew where they were, based on the cues and indications available to them, and based on what these cues
plausibly added up to. The signage, even though it breaks no rules either, does not help. Taxiway signs are
among the most confusing directors in the world of aviation, and they are terribly hard to turn into a reasonable
representation of the taxiway system they are supposed to help people navigate on.
The sign visible from the direction of the Boeing 737 is enlarged in Fig. 5.6. The black part of the sign is the
position part (this indicates what taxiway it is), and the yellow part is the direction part: This taxiway (ZN) will
lead to taxiway Z, which happens to run at about a right angle across ZN. These signs are placed to the left of
the taxiway they belong to. In other words, the ZN taxiway is on the right of the sign, not on the left. But put the
sign in the context of Fig. 5.5, and things become more ambiguous. The black ZN part is now leaning toward
the left side of the roundabout, not the right side. Yet the ZN part belongs to the piece of tarmac on the right of
the roundabout.
FIG. 5.6. The sign on the roundabout that is visible for aircraft coming off runway 26 (Statens
Haverikommision, 2003).
The crew never saw it as such. For them, given their chart, the roundabout was the island to the right of ZN.
Why not swap the black ZN and the yellow Z parts? Current rules for taxiway signage will not allow it (rules can
indeed stifle innovation and investments in safety). And not all airports comply with this religiously either.
There may be exceptions when there is no room or when visibility of the sign would be obstructed if placed on
the left side. To make things worse, regulations state that taxiway signs leading to a runway need to be placed
on both sides of the taxiway. In those cases, the black parts of the signs are often actually adjacent to the
taxiway, and not removed from it, as in Fig. 5.5. Against the background of such ambiguity, few pilots actually
know or remember that taxiway signs are supposed to be on the left side of the taxiway they belong to. In fact,
very little time in pilot training is used to get pilots to learn to navigate around airports, if any. It is a peripheral
activity, a small portion of mundane, pedestrian work that merely leads up to and concludes the real work:
flying from A to B. When rolling off the runway, and going to taxiway Zulu and then to their gate, this crew knew
where they were.
Their indications (cockpit chart, snow-covered island, taxiway sign) compiled into a plausible story: ZN, their
assigned route, was the one to the left of the island, and that was the one they were going to take. Until they
ran into a tow truck, that is. But nobody in this case lost situation awareness. The pilots lost nothing. Based on
the combination of cues and indications observable by them at the time, they had a plausible story of where
they were. Even if a mismatch can be shown between how the pilots saw their situation and how retrospective,
outside observers now see that situation, this has no bearing on understanding how the pilots made sense of
their world at the time.
Seeing situation awareness as a measure of the accuracy of correspondence between some outer world and
an inner representation carries with it a number of irresolvable problems that have always been connected to
such a dualist position. Taking the mind-matter problem apart by separating the two means that the theory
needs to connect the two again. Theories of situation awareness typically rely on a combination of two schools
in psychological thought to reconstitute this tie, to make this bridge. One is empiricism, a traditional school in
psychology that makes claims on how knowledge is chiefly, if not uniquely, based on experience. The second
is the information-processing school in cognitive psychology, still popular in large areas of human factors.
Neither of these systems of thought, however, is particularly successful in solving the really hard questions
about situation awareness, and both may in fact be misleading in certain respects. We look at both of them in turn
here. Once that is done, the chapter briefly develops the counterposition on the mind-matter question: an
antidualist one (as related to situation awareness). This position is worked out further in the rest of this chapter,
using the Royal Majesty case as an example.
EMPIRICISM AND THE PERCEPTION OF ELEMENTS
Most theories of situation awareness actually leave the processes by which matter makes it into mind to the
imagination. A common denominator, however, appears to be the perception of elements in the environment.
Elements are the starting point of perceptual and meaning-making processes. It is on the basis of these
elements that we gradually build up an understanding of the situation, by processing such elementary stimulus
information through multiple stages of consciousness or awareness ("levels of SA"). Theories of situation
awareness borrow from empiricism (particularly British empiricism), which assumes that the organized
character and the meaningfulness of our perceptual world are achieved by matching incoming stimuli with prior
experience through a process called association. In other words, the world as we experience it is disjointed
(consisting of elements) except when mediated by previously stored knowledge. Correspondence between
mind and matter is made by linking incoming impressions through earlier associations.
Empiricism in its pure form is nothing more than saying that the major source of knowledge is experience, that
we do not know about the world except through making contact with that world with our sensory organs.
Among Greek philosophers around the 5th century B.C., empiricism was accepted as a guide to epistemology,
as a way of understanding the origin of knowledge. Questions already arose, however, on whether all psychic
life could be reduced to sensations. Did the mind have a role to play at all in turning perceptual impressions
into meaningful percepts? The studies of perception by Johannes Kepler (1571-1630) would come to suggest
that the mind had a major role, even though he himself left the implications of his findings up to other
theoreticians. Studying the eyeball, Kepler found that it actually projects an inverted image on the retina at the
back. Descartes, himself dissecting the eye of a bull to see what image it would produce, saw the same thing.
If the eye inverts the world, how can we see it the right way up? There was no choice but to appeal to mental
processing.
Not only is the image inverted, it is also two-dimensional, and it is cast onto the backs of two eyes, not one.
How does all that get reconciled in a single coherent, upright percept? The experiments boosted the notion of
impoverished, meaning-deprived stimuli entering our sensory organs, in need of some serious mental
processing work from there on. Further credence to the perception of elements was given by the 19th-century
discovery of photoreceptors in the human eye. This mosaic of retinal receptors appeared to chunk up any
visual percept coming into the eyeball. The resulting fragmented neural signals had to be sent up the
perceptual pathway in the brain for further processing and scene restoration.
British empiricists such as John Locke (1632-1704) and George Berkeley (1685-1753), though not privy to
19th-century findings, were confronted with the same epistemological problems that their Greek predecessors
had struggled with. Rather than knowledge being innate, or the chief result of reasoning (as claimed by
rationalists of that time), what role did experience have in creating knowledge? Berkeley, for example, wrestled
with the problem of depth perception (not a negligible problem when it comes to situation awareness). How do
we know where we are in space, in relation to objects around us? Distance perception, to Berkeley, though
created through experience, was itself not an immediate experience. Rather, distance and depth are additional
aspects of visual data that we learn about through combinations of visual, auditory, and tactile experiences.
We understand distance and depth in current scenes by associating incoming visual data with these earlier
experiences. Berkeley reduced the problem of space perception to more primitive psychological experiences,
decomposing the perception of distance and magnitude into constituent perceptual elements and processes
(e.g., lenticular accommodation, blurring of focus). Such deconstruction of complex, intertwined psychological
processes into elementary stimuli turned out to be a useful tactic. It encouraged many after him, among them
Wilhelm Wundt and latter-day situation-awareness theorists, to analyze other experiences in terms of elements
as well. Interestingly, neither all ancient empiricists nor all British empiricists could be called dualists in the
same way that situation-awareness theorists can be. Protagoras, a contemporary of Plato around 430 B.C.,
had already said that "man is the measure of all things."
An individual's perception is true to him, and cannot be proven untrue (or inferior or superior) by some other
individual. Today's theories of situation awareness, with their emphasis on the accuracy of the mapping
between matter and mind, are very much into inferiority and superiority (deficient SA vs. good SA) as
something that can be objectively judged. This would not have worked for some of the British empiricists
either.
To Berkeley, who disagreed with earlier characterizations of an inner versus an outer world, people can
actually never know anything but their experiences. The world is a plausible but unproved hypothesis. In fact, it
is a fundamentally untestable hypothesis, because we can only know our own experience. Like Protagoras
before him, Berkeley would not have put much stock in claims of the possibility of superior or ideal situation
awareness, as such a thing is logically impossible. There are no superlatives when it comes to knowledge
through experience. For Berkeley too, this meant that even if there is an objective world out there (the large
circle in the Venn diagram), we could never know it. It also meant that any characterization of such an
objective world with the aim of understanding somebody's perception, somebody's situation awareness, would
have been nonsense.
Experiments, Empiricism, and Situation Awareness
Wilhelm Wundt is credited with founding the first psychological laboratory in the world at the University of
Leipzig in the late 1870s. The aim of his laboratory was to study mental functioning by deconstructing it into
separate elements. These could then be combined to understand perceptions, ideas, and other associations.
Wundt's argument was simple and compelling, and versions of it are still used in psychological method
debates today. Although the empirical method had been developing all around psychology, it was still occupied
with grand questions of consciousness, soul, and destiny, and it tried to gain access to these issues through
introspection and rationalism. Wundt argued that these were questions that should perhaps be asked at
the logical end point of psychology, but not at the beginning. Psychology should learn to crawl before trying to
walk. This justified the appeal of the elementarist approach: chopping the mind and its stimuli up into minute
components, and studying them one by one. But how to study them?
Centuries before, Descartes had argued that mind and matter not only were entirely separate, but should be
studied using different methods as well. Matter should be investigated using methods from natural science
(i.e., the experiment), whereas mind should be examined through processes of meditation, or introspection.
Wundt did both. In fact, he combined the natural science tradition with the introspective one, molding them into
a novel brand of psychological experimentation that still governs much of human factors research to this day.
Relying on complicated sets of stimuli, Wundt investigated sensation and perception, attention, feeling, and
association.
Using intricate measurements of reaction times, the Leipzig laboratory hoped they would one day be able to
achieve a chronometry of mind (which was not long thereafter dismissed as infeasible).
Rather than just counting on quantitative experimental outcomes, Wundt asked his subjects to engage in
introspection, to reflect on what had happened inside their minds during the trials. Wundt's introspection was
significantly more evolved and demanding than the experimental "report" psychologists ask their subjects for
today. Introspection was a skill that required serious preparation and expertise, because the criteria for gaining
successful access to the elementary makeup of mind were set very high. As a result, Wundt mostly used his
assistants. Realizing that the contents of awareness are in constant flux, Wundt produced rigid rules for the
proper application of introspection: (a) The observer, if at all possible, must be in a position to determine when
the process is to be introduced; (b) he or she must be in a state of "strained attention"; (c) the observation must
be capable of being repeated several times; (d) the conditions of the experiment must be such that they are
capable of variation through introduction or elimination of certain stimuli and through variation of the strength
and quality of the stimuli.
Wundt thus imposed experimental rigor and control on introspection. Similar introspective rigor, though
different in some details and prescriptions, is applied in various methods for studying situation awareness
today. Some techniques involve blanking or freezing of displays, with researchers then going in to elicit what
participants remember about the scene. This requires active introspection. Wundt would have been fascinated,
and he probably would have had a thing or two to say about the experimental protocol. If subjects are not
allowed to say when the blanking or freezing is to be introduced, for example (Wundt's first rule), how does that
compromise their ability to introspect? In fact, the blanking of displays and handing out a situation-awareness
questionnaire is more akin to the Wurzburg school of experimental psychology that started to compete with
Wundt in the late 19th century. The Wurzburgers pursued "systematic experimental introspection" by having
subjects pursue complex tasks that involved thinking, judging, and remembering. They would then have their
subjects render a retrospective report of their experiences during the original operation. The whole experience
had to be described time period by time period, thus chunking it up. In contrast to Wundt, and like
situation-awareness research participants today, Wurzburg subjects did not know in advance what they were
going to have to introspect. Others today disagree with Descartes' original exhortation and remain
fearful of the subjectivist nature of introspection. They favor the use of clever scenarios in which the outcome,
or behavioral performance of people, will reveal what they understood the situation to be. This is claimed to
be more of a natural science approach that stays away from the need to introspect. It relies on objective
performance indicators instead. Such an approach to studying situation awareness could be construed as
neobehaviorist, as it equates the study of behavior with the study of consciousness.
Mental states are not themselves the object of investigation: Performance is. If desired, such performance can
faintly hint at the contents of mind (situation awareness). But that itself is not the aim; it cannot be, because
through such pursuits psychology (and human factors) would descend into subjectivism and ridicule. Watson,
the great proponent of behaviorism, would himself have argued along these lines.
Additional arguments in favor of performance-oriented approaches include the assertion that introspection
cannot possibly test the contents of awareness, as it necessarily appeals to a situation or stimulus from the
past. The situation on which people are asked to reflect has already disappeared. Introspection thus simply
probes people's memory. Indeed, if you want to study situation awareness, how can you take away the
"situation" by blanking or freezing their world, and still hope you have relevant "awareness" left to investigate
by introspection? Wundt, as well as many of today's situation awareness researchers, may in part have been
studying memory, rather than the contents of consciousness.
Wundt was, and still remains, one of the chief representatives of the elementarist orientation, pioneered by
Berkeley centuries before, and perpetuated in modern theories of situation awareness. But if we perceive
elements, if the eyeball deals in two-dimensional, fragmented, inverted, meaning-deprived stimuli, then how
does order in our perceptual experience come about? What theory can account for our ability to see coherent
scenes, objects? The empiricist answer of association is one way of achieving such order, of creating such
interelementary connections and meaning. Order is an end product, it is the output of mental or cognitive work.
This is also the essence of information processing, the school of thought in cognitive psychology that
accompanied and all but colonized human factors since its inception during the closing days of the Second
World War. Meaning and perceptual order are the end result of an internal trade in representations,
representations that get increasingly filled out and meaningful as a result of processing in the mind.
INFORMATION PROCESSING
Information processing did not follow neatly on empiricism, nor did it accompany the initial surge in
psychological experimentation. Wundt's introspection did not immediately fuel the development of theoretical
substance to fill the gap between elementary matter and the mind's perception of it. Rather, it triggered an anti-
subjectivist response that would ban the study of mind and mental processes for decades to come, especially
in North America. John Watson, a young psychologist, introduced psychology to the idea of behaviorism in
1913, and aimed to recast psychology as a purely objective branch of natural science. Introspection was to
be disqualified, and any references to or investigations of consciousness were proscribed.
The introspective method was seen as unreliable and unscientific, and psychologists had to turn their focus
exclusively to phenomena that could be registered and described objectively by independent observers.
This meant that introspection had to be replaced by tightly controlled experiments that varied subtle
combinations of rewards and punishments in order to bait organisms (anything from mice to pigeons to
humans) into particular behaviors over others. The outcome of such experiments was there for all to see, with
no need for introspection. Behaviorism became an early 20th-century embodiment of the Baconian ideal of
universal control, this time reflected in a late Industrial Revolution obsession with manipulative technology and
domination. It appealed enormously to an optimistic, pragmatic, rapidly developing, and result-oriented North
America. Laws extracted from simple experimental settings were thought to carry over to more complex
settings and to more experiential phenomena as well, including imagery, thinking, and emotions.
Behaviorism was thus fundamentally nomothetic: deriving general laws thought to be applicable across people
and settings. All human expressions, including art and religion, were reduced to no more than conditioned
responses. Behaviorism turned psychology into something wonderfully Newtonian: a schedule of stimuli and
responses, of mechanistic, predictable, and changeable couplings between inputs and outputs. The only
legitimate characterization of psychology and mental life was one that conformed to the Newtonian
framework of classical physics, and abided by its laws of action and reaction. Then came the Second World
War, and the behaviorist bubble was deflated. No matter how clever a system of rewards and punishments
psychologists set up, radar operators monitoring for German aircraft intruding into Britain across the Channel
would still lose their vigilance over time. They would still have difficulty distinguishing signals from noise,
independent of the possible penalties. Pilots would get controls mixed up, and radio operators were evidently
limited in their ability to hold information in their heads while getting ready for the next transmission. Where
was behaviorism? It could not answer to the new pragmatic appeals.
Thus came the first cognitive revolution.
The cognitive revolution reintroduced mind as a legitimate object of study. Rather than manipulating the effect
of stimuli on overt responses, it concerned itself with "meaning" as the central concept of psychology. Its aim
was, as Bruner (1990) recalled, to discover and describe meanings that people created out of their encounters
with the world, and then to propose hypotheses for what meaning-making processes were involved. The very
metaphors, however, that legitimized the reintroduction of the study of mind also began to immediately corrupt
it. The first cognitive revolution fragmented and became overly technicalized.
The radio and the computer, two technologies accelerated by developments during the Second World War,
quickly captured the imagination of those once again studying mental processes. These were formidable
similes of mind, able to mechanistically fill the black box (which behaviorism had kept shut) between stimulus
and response. The innards of a radio showed filters, channels, and limited capacities through which informa-
tion flowed. Not much later, all those words appeared in cognitive psychology.
Now the mind had filters, channels, and limited capacities too.
The computer was even better, containing a working memory, a long-term memory, various forms of storage,
input and output, and decision modules. It did not take long for these terms, too, to appear in the psychological
lexicon. What seemed to matter most was the ability to quantify and compute mental functioning. Information
theory, for example, could explain how elementary stimuli (bits) would flow through processing channels
to produce responses. A processed stimulus was deemed informative if it reduced alternative choices, whether
the stimulus had to do with Faust or with a digit from a statistical table.
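To make the flavor of that quantification concrete, here is a generic illustration of the information-theoretic measure, not an example taken from the book: a stimulus was worth log2(N) bits if it singled out one alternative among N equally likely ones.

    # Generic illustration of the information-theoretic measure described above.
    import math

    def bits(n_alternatives: int) -> float:
        # Information carried by a stimulus that singles out one of N equally likely alternatives.
        return math.log2(n_alternatives)

    print(bits(2))    # 1.0 bit: the stimulus halves the choice space
    print(bits(10))   # ~3.32 bits: one option out of ten, whether the options come from Faust or a statistical table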
As Bruner (1990) recollected, computability became the necessary and sufficient criterion for cognitive
theories. Mind was equated to program. Through these metaphors, the "construction of meaning" quickly
became the "processing of information." Newton and Descartes simply would not let go. Once again,
psychology was reduced to mechanical components and linkages, and exchanges of energy between and
along them. Testing the various components (sensory store, memory, decision making) in endless series of
fractionalized laboratory experiments, psychologists hoped, and many are still hoping, that more of the same
will eventually add up to something different, that profound insight into the workings of the whole will
magically emerge from the study of constituent components.
The Mechanization of Mind
Information processing has been a profoundly Newtonian-Cartesian answer to the mind-matter problem. It is
the ultimate mechanization of mind. The basic idea is that the human (mind) is an information processor that
takes in stimuli from the outside world, and gradually makes sense of those stimuli by combining them with
things already stored in the mind.
For example, I see the features of a face, but through coupling them to what I have in long-term memory, I
recognize the face as that of my youngest son. Information processing is loyal to the biological-psychological
model that sees the matter-mind connection as a physiologically identifiable flow of neuronal energy from
periphery to center (from eyeball to cortex), along various nerve pathways. The information-processing
pathway of typical models mimics this flow, by taking a stimulus and pushing it through various stages of
processing, adding more meaning the entire time. Only once the processing system has understood what the
stimulus means (or what the stimuli mean) can an appropriate response be generated (through a backflow from
center to periphery, brain to limbs), which in turn creates more (new) stimuli to process.
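The pathway the model assumes can be caricatured in a few lines. This is a deliberately mechanistic sketch; the stage names and the toy memory content are illustrative, not drawn from any particular information-processing theory:

    # A caricature of the information-processing pathway: a stimulus is pushed
    # through stages, gaining meaning at each step, until a response flows back out.
    long_term_memory = {"these facial features": "my youngest son"}

    def perceive(stimulus):
        # Elementary stimulus enters at the periphery.
        return {"features": stimulus}

    def interpret(percept):
        # Meaning is added by matching the percept against what is already stored.
        return long_term_memory.get(percept["features"], "something unknown")

    def respond(meaning):
        # The backflow from center to periphery: an overt response.
        return f"greet {meaning}"

    print(respond(interpret(perceive("these facial features"))))

Notice how, in the sketch as in the model, the world shows up only as whatever gets lobbed in at the top; everything interesting happens inside.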
The Newtonian as well as dualist intimations of the information-processing model are a heartening sight for
those with Cartesian anxiety. Thanks to the biological model underneath it, the mind-matter problem is one of a
Newtonian transfer (conversion as well as conservation) of energy at all kinds of levels (from photonic energy
to nerve impulses, from chemical releases to electrical stimulation, from stimulus to response at the overall
organismic level). Both Descartes and Newton can be recognized in the componential explanation of mental
functioning (memory, e.g., is typically parsed into iconic memory, short-term memory, and long-term
memory): End-products of mental processing can be exhaustively explained on the basis of interactions
between these and other components. Finally, of course, information processing is mentalist: It neatly
separates res cogitans from res extensa by studying what happens in the mind entirely separately from what
happens in the world. The world is a mere adjunct, truly a res extensa, employed solely to lob the next stimulus
at the mind (which is where the really interesting processes take place).
The information-processing model works marvelously for the simple laboratory experiments that brought it to
life. Laboratory studies of perception, decision making, and reaction time reduce stimuli to single snapshots,
fired off at the human processing mechanism as one-stop triggers. The Wundtian idea of awareness as a
continually flowing phenomenon is artificially reduced, chunked up and frozen by the very stimuli that subjects
are to become aware of. Such dehumanization of the settings in which perception takes place, as well as of
the models by which such perception comes about, has given rise to considerable critique. If people are seen
to be adding meaning to impoverished, elementary stimuli, then this is because they are given impoverished,
elementary, meaningless stimuli in their laboratory tasks! None of that says anything about natural perception
or the processes by which people perceive or construct meaning in actual settings.
The information-processing model may be true (though even that is judged as unlikely by most), but only for
the constrained, Spartan laboratory settings that keep cognition in captivity. If people are seen to struggle in
their interpretation of elements, then this may have something to do with the elementary stimuli given to them.
Even Wundt was not without detractors in this respect. The Gestalt movement was launched in part as a
response or protest to Wundtian elementarism. Gestaltists claimed that we actually perceive meaningful
wholes, that we immediately experience those wholes. We cannot help but see these patterns, these wholes.
Max Wertheimer (1880-1943), one of the founding fathers of Gestaltism, illustrated this as follows: "I am standing
at the window and see a house, trees, sky. And now, for theoretical purposes, I could try to count and say:
there are . . . 327 nuances of brightness [and hue]. Do I see '327'? No; I see sky, house, trees" (Wertheimer,
cited in Woods et al., 2002, p. 28). The gestalts that Wertheimer sees (house, trees, sky) are primary to their
parts (their elements), and they are more than the sum of their parts.
There is an immediate orderliness in experiencing the world. Wertheimer inverts the empiricist claim and
information-processing assumption: Rather than meaning being the result of mental operations on elementary
stimuli, it actually takes painstaking mental effort (counting 327 nuances of brightness and hue) to reduce
primary sensations to their primitive elements. We do not perceive elements: We perceive meaning.
Meaning comes effortlessly, prerationally. In contrast, it takes cognitive work to see elements. In the words of
William James' senior Harvard colleague Chauncey Wright, there is no antecedent chaos that requires some
intrapsychic glue to prevent percepts from falling apart.
New Horizons (Again)
Empiricism does not recognize the immediate orderliness of experience because it does not see relations as
real aspects of immediate experience (Heft, 2001). Relations, according to empiricists, are a product of mental
(information) processing. This is true for theories of situation awareness. For them, relations between elements
are mental artifacts. They get imposed through stages of processing. Subsequent levels of SA add
relationships to elements by linking those elements to current meanings and future projections. The problem of
the relationship between matter and mind is not at all solved through empiricist responses. But perhaps
engineers and designers, as well as many experimental psychologists, are happy to hear about elements (or
327 nuances of brightness and hue), for those can be manipulated in a design prototype and experimentally
tested on subjects.
Wundt would have done the same thing. Not unlike Wundt 100 years before him, Ulrich Neisser warned in
1976 that psychology was not quite ready for grand questions about consciousness. Neisser feared that
models of cognition would treat consciousness as if it were just a particular stage of processing in a
mechanical flow of information. His fears were justified in the mid-1970s, as many psychological models
did exactly that. Now they have done it again. Awareness, or consciousness, is equated to a stage of
processing along an intrapsychic pathway (levels of SA). As Neisser pointed out, this is an old idea in
psychology. The three levels of SA in vogue today were anticipated by Freud, who even supplied flowcharts
and boxes in his Interpretation of Dreams to map the movements from unconscious (level 1) to preconscious
(level 2) to conscious (level 3). The idea of finding a home, a place, a structure for consciousness
in the head is irresistible, said Neisser, as it allows psychology to nail down its most elusive target
(consciousness) to a box in a flowchart.
There is a huge cost, though. Along with the deconstruction and mechanization of mental phenomena comes
their dehumanization. Information-processing theories have lost much of their appeal and credibility. Many
realize how they have corrupted the spirit of the postbehaviorist cognitive revolution by losing sight of humanity
and meaning making. Empiricism (or British empiricism) has slipped into history as a school of thought at the
beginning of psychological theorizing. Yet both live on as legitimate offshoots in current understandings of situation
awareness.
Notions similar to those of empiricism and information processing are reinvented under new guises, which
reintroduces the same type of foundational problems, while leaving some of the really hard problems
unaddressed.
The problem of the nature of stimuli is one of those, and associated with it is the problem of meaning making.
How does the mind make sense of those stimuli? Is meaning the end-product of a processing pathway that
flows from periphery to center? These are enormous problems in the history of psychology, all of them
problems of the relationship between mind and matter, and all essentially still unresolved. Perhaps they are
fundamentally unsolvable within the dualist tradition that psychology inherited from Descartes.
Some movements in human factors are pulling away from the domination of experimental psychology. The
idea of distributed cognition has renewed the status of the environment as an active, constituent participant in
cognitive processes, closing the gap between res cogitans and res extensa. Other people, artifacts, and even
body parts are part of the res cogitans. How is it otherwise that a child learns to count on his fingers, or a soccer
player thinks with her feet?
Concomitant interest in cognitive work analysis and cognitive systems engineering sees such joint cognitive systems as the units of analysis, not their constituent human or machine components. Qualitative methods such as
ethnography are increasingly legitimate, and critical for understanding distributed cognitive systems. These
movements have triggered and embodied what has now become known as the second cognitive revolution,
recapturing and rehabilitating the impulses that brought to life the first. How do people make meaning? In order
to begin to answer such aboriginal questions, it is now increasingly held as justifiable and necessary to throw
the human factors net wider than experimental psychology. Other forms of social inquiry can shed more light
on how we are goal-driven creatures in actual, dynamic environments, not passive recipients of snapshot
stimuli in a sterile laboratory.
The concerns of these thinkers overlap with functionalist approaches, which formed yet another psychology of
protest against Wundt's elementarism. The same protest works equally well against the mechanization of
mind by the information-processing school. A century ago, functionalists (William James was one of their
foremost exponents) pointed out how people are integrated, living organisms engaged in goal-directed
activities, not passive element processors locked into laboratory headrests, buffeted about by one-shot stimuli
from an experimental apparatus. The environment in which real activities play out helps shape the organism's
responses.
Psychological functioning is adaptive: It helps the organism survive and adjust, by incrementally modifying and
tweaking its composition or its behavior to generate greater gains on whatever dimension is relevant.
Such ecological thinking is now even beginning to seep into approaches to system safety, which has so far
also been dominated by mechanistic, structuralist models (see chap. 2). James was not just a functionalist,
however. In fact, he was one of the most all-round psychologists ever. His views on radical empiricism offer one good way into novel thinking about situation awareness and sensemaking, and they are all the more apt against a background of increasing interest in the role of ecological psychology in human factors.
RADICAL EMPIRICISM
Radical empiricism is one way of circumventing the insurmountable problems associated with psychologies
based on dualistic traditions, and William James introduced it as such at the beginning of the 20th century.
Radical empiricism rejects the notion of separate mental and material worlds; it rejects dualism. James
adhered to an empiricist philosophy, which holds that our knowledge comes (largely) from our discoveries, our
experience. But, as Heft (2001) pointed out, James' philosophy is radically empiricist. What is experienced,
according to James, is not elements, but relations—meaningful relations. Experienced relations are what
perception is made up of. Such a position can account for the orderliness of experience, as it does not rely on
subsequent, or a posteriori mental processing. Orderliness is an aspect of the ecology, of our world as we
experience it and act in it. The world as an ordered, structured universe is experienced, not constructed
through mental work. James dealt with the matter-mind problem by letting the knower and the known coincide
during the moment of perception (which itself is a constant, uninterrupted flow, rather than a moment).
Ontologies (our being in the world) are characterized by continual transactions between knower and known.
Order is not imposed on experience, but is itself experienced.
Variations of this approach have always represented a popular countermove in the history of psychology of
consciousness. Rather than containing consciousness in a box in the head, it is seen as an aspect of activity.
Weick (1995) used the term "enactment" to indicate how people produce the environment they face and are
aware of. By acting in the world, people continually create environments that in turn constrain their
interpretations, and consequently constrain their next possible actions. This cyclical, ongoing nature of
cognition and sensemaking has been recognized by many (see Neisser, 1976) and challenges common
interpretations rooted in information processing psychology where stimuli precede meaning making and
(only then) action, and where frozen snapshots of environmental status can be taken as legitimate input to the
human processing system. Instead, activities of individuals are only partially triggered by stimuli, because the
stimulus itself is produced by activity of the individual.
This moved Weick (1995) to comment that sensemaking never starts; that people are always in the middle of
things. Although we may look back on our own experience as consisting of discrete events, the only way to get
this impression is to step out of that stream of experience and look down on it from a position of outsider, or
retrospective outsider. It is only possible, really, to pay direct attention to what already exists (that which has
already passed): "Whatever is now, at the present moment, under way will determine the meaning of whatever
has just occurred" (Weick, p. 27). Situation awareness is in part about constructing a plausible story of the
process by which an outcome came about, and the reconstruction of immediate history probably plays a
dominant role in this. Few theories of situation awareness acknowledge this role, actually, instead directing
their analytics to the creation of meaning from elements and the future projection of that meaning.
Radical empiricism does not take the stimulus as its starting point, as does information processing, and neither
does it need a posteriori processes (mental, representational) to impose orderliness on sensory impressions.
We already experience orderliness and relationships through ongoing, goal-oriented transactions of acting and
perceiving. Indeed what we experience during perception is not some cognitive end product in the head.
Neisser reminded us of this longstanding issue in 1976 too: Can it be true that we see our own retinal images?
The theoretical distance that then needs to be bridged is too large. For if we see that retinal image, who does
the looking? Homunculus explanations were unavoidable (and often still are in information processing).
Homunculi do not solve the problem of awareness, they simply relocate it.
Rather than a little man in our heads looking at what we are looking at, we ourselves are aware of the world,
and its structure, in the world. As Edwin Holt put it, "Consciousness, whenever it is localized at all in space, is
not in the skull, but is 'out there' precisely where it appears to be" (cited in Heft, 2001, p. 59). James, and the
entire ecological school after him, anticipated this. What is perceived, according to James, is not a replica, not
a simile of something out there. What is perceived is already out there. There are no intermediaries between
perceiver and percept; perceiving is direct. This position forms the groundwork of ecological approaches in
psychology and human factors. If there is no separation between matter and mind, then there is no gap
that needs bridging; there is no need for reconstructive processes in the mind that make sense of elementary
stimuli. The Venn diagram with a little and a larger circle that depict actual and ideal situation awareness is
superfluous too.
Radical empiricism allows human factors to stick closer to the anthropologist's ideal of describing and
capturing insider accounts. If there is no separation between mind and matter, between actual and ideal
situation awareness, then there is no risk of getting trapped in judging performance by use of exogenous
criteria; criteria imported from outside the setting (informed by hindsight or some other source of omniscience
about the situation that opens up that delta, or gap, between what the observer inside the situation knew and
what the researcher knows). What the observer inside the situation knows must be seen as canonical: it must be understood on its own terms, not in relation to some normative ideal. For the radical empiricist, there would not be two circles in the Venn diagram, but rather different rationalities, different understandings of the situation, none of them
right or wrong or necessarily better or worse, but all of them coupled directly to the interests, expectations,
knowledge, and goals of the respective observer.
DRIFTING OFF TRACK: REVISITING A CASE OF "LOST SITUATION AWARENESS"
Let us go back to the Royal Majesty. Traditionalist ideas about a lack of correspondence between a material
and a mental world get a boost from this sort of case. A crew ended up 17 miles off track, after a day and a
half of sailing. How could this happen? As said before, hindsight makes it easy to see where people were,
versus where they thought they were. In hindsight, it is easy to point to the cues and indications that these
people should have picked up in order to update or correct or even form their understanding of the unfolding
situation around them. Hindsight has a way of exposing those elements that people missed, and a way of
amplifying or exaggerating their importance. The key question is not why people did not see what we now
know was important. The key question is how they made sense of the situation the way they did. What must
the crew in question at the time have seen? How could they, on the basis of their experiences, construct a
story that was coherent and plausible? What were the processes by which they became sure that they were
right about their position? Let us not question the accuracy of the insider view. Research into situation
awareness already does enough of that. Instead, let us try to understand why that insider view was plausible
for people at the time, why it was, in fact, the only possible view.
Departure From Bermuda
The Royal Majesty departed Bermuda, bound for Boston at 12:00 noon on the 9th of June 1995. The visibility
was good, the winds light, and the sea calm. Before departure the navigator checked the navigation and
communication equipment. He found it in "perfect operating condition." About half an hour after departure the
harbor pilot disembarked and the course was set toward Boston. Just before 13:00 there was a cutoff in the
signal from the GPS (Global Positioning System) antenna, routed on the fly bridge (the roof of the bridge), to
the receiver—leaving the receiver without satellite signals. Postaccident examination showed that the antenna
cable had separated from the antenna connection. When it lost satellite reception, the GPS promptly defaulted
to dead reckoning (DR) mode. It sounded a brief aural alarm and displayed two codes on its tiny display: DR
and SOL. These alarms and codes were not noticed. (DR means that the position is estimated, or deduced,
hence "ded," or now "dead," reckoning. SOL means that satellite positions cannot be calculated.) The GPS would stay in DR mode for the remainder of the journey.
Why was there a DR mode in the GPS in the first place, and why was a default to that mode neither
remarkable, nor displayed in a more prominent way on the bridge? When this particular GPS receiver was
manufactured (during the 1980s), the GPS satellite system was not as reliable as it is today.
The receiver could, when satellite data was unreliable, temporarily use a DR mode in which it estimated
positions using an initial position, the gyrocompass for course input and a log for speed input. The GPS thus
had two modes, normal and DR. It switched autonomously between the two depending
on the accessibility of satellite signals.
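To make concrete what such a fallback involves, here is a minimal sketch of a dead-reckoning update, assuming a simplified flat-earth calculation; the function name, the single-step form, and the example numbers are illustrative and not taken from the actual receiver, which would integrate many small steps from its log and gyrocompass inputs.

    import math

    def dead_reckon(lat, lon, course_deg, speed_knots, hours):
        # Advance an initial fix using only course (gyrocompass) and speed (log).
        # Flat-earth approximation: 1 degree of latitude is roughly 60 nautical miles.
        distance_nm = speed_knots * hours
        course = math.radians(course_deg)
        dlat = distance_nm * math.cos(course) / 60.0
        dlon = distance_nm * math.sin(course) / (60.0 * math.cos(math.radians(lat)))
        return lat + dlat, lon + dlon

    # Illustrative numbers only: over a day and a half of sailing, small, steady
    # errors in assumed course and speed accumulate into a large position error
    # with no symptom visible on board.
    print(dead_reckon(32.4, -64.7, 336.0, 14.0, 34.0))

The point of the sketch is simply that a dead-reckoned position is only as good as its inputs: a log measures speed through the water and a gyrocompass gives heading, so current and drift are invisible to the estimate, which degrades steadily and silently.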
By 1995, however, GPS satellite coverage was pretty much complete, and had been working well for years.
The crew did not expect anything out of the ordinary. The GPS antenna was moved in February, because
parts of the superstructure occasionally would block the incoming signals, which caused temporary and short
(a few minutes, according to the captain) periods of DR navigation. This was to a great extent remedied by the
antenna move, as the Cruise Line's electronics technician testified. People on the bridge had come to rely on
GPS position data and considered other systems to be backup systems. The only times the GPS positions
could not be counted on for accuracy were during these brief, normal episodes of signal blockage. Thus, the
whole bridge crew was aware of the DR-mode option and how it worked, but none of them ever imagined or
were prepared for a sustained loss of satellite data caused by a cable break—no previous loss of satellite data
had ever been so swift, so absolute, and so long lasting.
When the GPS switched from normal to DR on this journey in June 1995, an aural alarm sounded and a tiny
visual mode annunciation appeared on the display. The aural alarm sounded like that of a digital wristwatch
and was less than a second long. The time of the mode change was a busy time (shortly after departure), with
multiple tasks and distractors competing for the crew's attention. A departure involves complex maneuvering,
there are several crew members on the bridge, and there is a great deal of communication.
When a pilot disembarks, the operation is time constrained and risky. In such situations, the aural signal could
easily have been drowned out. No one was expecting a reversion to DR mode, and thus the visual indications
were not seen either. From the insider perspective, there was no alarm, as there was not going to be a mode
default. There was neither a history, nor an expectation of its occurrence.
Yet even if the initial alarm was missed, the mode indication was continuously available on the little GPS
display. None of the bridge crew saw it, according to their testimonies. If they had seen it, they knew what it meant: dead reckoning means no satellite fixes. But as we saw before, there is a crucial
difference between data that in hindsight can be shown to have been available and data that were observable
at the time. The indications on the little display (DR and SOL) were placed between two rows of numbers
(representing the ship's latitude and longitude) and were about one sixth the size of those numbers. There was
no difference in the size and character of the position indications after the switch to DR. The size of the display
screen was about 7.5 by 9 centimeters, and the receiver was placed at the aft part of the bridge on a chart
table, behind a curtain. The location is reasonable, because it places the GPS, which supplies raw position
data, next to the chart, which is normally placed on the chart table. Only in combination with a chart do the
GPS data make sense, and furthermore the data were forwarded to the integrated bridge system and
displayed there (quite a bit more prominently) as well.
For the crew of the Royal Majesty, this meant that they would have to leave the forward console, actively look
at the display, and expect to see more than large digits representing the latitude and longitude. Even then,
if they had seen the two-letter code and translated it into the expected behavior of the ship, it is not a certainty
that the immediate conclusion would have been "this ship is not heading towards Boston anymore," because
temporary DR reversions in the past had never led to such dramatic departures from the planned route. When
the officers did leave the forward console to plot a position on the chart, they looked at the display and saw a
position, and nothing but a position, because that is what they were expecting to see. It is not a question of
them not attending to the indications. They were attending to the indications, the position indications, because
plotting the position is the professional thing to do. For them, the mode change did not exist.
But if the mode change was so nonobservable on the GPS display, why was it not shown more clearly
somewhere else? How could one small failure have such an effect—were there no backup systems?
The Royal Majesty had a modern integrated bridge system, of which the main component was the
navigation and command system (NACOS). The NACOS consisted of two parts, an autopilot part to keep the
ship on course and a map construction part, where simple maps could be created and displayed on a radar
screen. When the Royal Majesty was being built, the NACOS and the GPS receiver were delivered by different
manufacturers, and they, in turn, used different versions of electronic communication standards.
Due to these differing standards and versions, valid position data and invalid DR data sent from the GPS to the
NACOS were both labeled with the same code (GP). The installers of the bridge equipment were not told, nor
did they expect, that (GP-labeled) position data sent to the NACOS would be anything but valid position data.
The designers of the NACOS expected that if invalid data were received, they would arrive in another format. The GPS, however, used the same data label for valid and invalid data, and thus the autopilot could not distinguish
between them. Because the NACOS could not detect that the GPS data was invalid, the ship sailed on an
autopilot that was using estimated positions until a few minutes before the grounding.
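The data-labeling problem can be made concrete with a small sketch. The text does not give the exact sentence formats involved, so the layout below is a simplified, hypothetical NMEA-0183-style message; the point is only that a consumer keying on the "GP" talker prefix, and not on any mode or validity field, cannot tell a satellite fix from a dead-reckoned estimate.

    def accept_position(sentence):
        # Hypothetical, simplified parser on the receiving side: anything from
        # talker "GP" is taken as a valid position; no validity field is inspected.
        fields = sentence.split(",")
        talker = fields[0][1:3]          # "$GPGLL" -> "GP"
        if talker != "GP":
            return None
        lat, lat_hemi, lon, lon_hemi = fields[1:5]
        return (lat + lat_hemi, lon + lon_hemi)

    # Both sentences are accepted identically, even though the second position is
    # a dead-reckoned estimate: nothing in the checked fields says so.
    print(accept_position("$GPGLL,4116.3,N,06932.5,W"))   # satellite fix (made-up values)
    print(accept_position("$GPGLL,4120.1,N,06910.0,W"))   # DR estimate (made-up values)

Had valid and dead-reckoned positions carried different labels, or a status flag that the consumer actually checked, the autopilot could have rejected or at least flagged the estimated data.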
A principal function of an integrated bridge system is to collect data such as depth, speed, and position from
different sensors, which are then shown on a centrally placed display to provide the officer of the watch with an
overview of most of the relevant information. The NACOS on the Royal Majesty was placed at the forward part
of the bridge, next to the radar screen. Current technological systems commonly have multiple levels of
automation with multiple mode indications on many displays. One design response is to collect these in the same place; another is to integrate data from many components into the same display
surface. This presents an integration problem for shipping in particular, where quite often components are
delivered by different manufacturers. The centrality of the forward console in an integrated bridge system also
sends the implicit message to the officer of the watch that navigation may have taken place at the chart table in
times past, but the work is now performed at the console. The chart should still be used, to be sure, but only as
a backup option and at regular intervals (customarily every half-hour or every hour). The forward console is
perceived to be a clearing house for all the information needed to safely navigate the ship.
As mentioned, the NACOS consisted of two main parts. The GPS sent position data (via the radar) to the
NACOS in order to keep the ship on track (autopilot part) and to position the maps on the radar screen (map
part). The autopilot part had a number of modes that could be manually selected: NAV and COURSE. NAV
mode kept the ship within a certain distance of a track, and corrected for drift caused by wind, sea, and
current. COURSE mode was similar but the drift was calculated in an alternative way. The NACOS also had a
DR mode, in which the position was continuously estimated. This backup calculation was performed in order to
compare the NACOS DR with the position received from the GPS. To calculate the NACOS DR position, data
from the gyrocompass and Doppler log were used, but the initial position was regularly updated with GPS data.
When the Royal Majesty left Bermuda, the navigation officer chose the NAV mode with input from the GPS, as the crew had normally done during the 3 years the vessel had been in service.
If the ship had deviated from her course by more than a preset limit, or if the GPS position had differed from
the DR position calculated by the autopilot, the NACOS would have sounded an aural alarm and clearly shown a visual one at the forward console (the position-fix alarm). There were no alarms because the two DR
positions calculated by the NACOS and the GPS were identical. The NACOS DR, which was the perceived
backup, was using GPS data, believed to be valid, to refresh its DR position at regular intervals. This
is because the GPS was sending DR data, estimated from log and gyro data, but labeled as valid data. Thus,
the radar chart and the autopilot were using the same inaccurate position information and there was no display
or warning of the fact that DR positions (from the GPS) were used. Nowhere on the integrated display could
the officer on watch confirm what mode the GPS was in, and what effect the mode of the GPS was having on
the rest of the automated system, not to mention the ship.
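A minimal sketch of that cross-check logic, with positions reduced to a one-dimensional cross-track offset and a made-up threshold and drift rate, shows why the position-fix alarm could never fire once the backup DR track was being re-seeded from the very source it was supposed to check:

    POSITION_FIX_LIMIT_NM = 1.0      # illustrative threshold, not the actual setting

    def position_fix_alarm(gps_pos_nm, backup_dr_pos_nm):
        # Alarm if the GPS position and the autopilot's own DR estimate diverge.
        return abs(gps_pos_nm - backup_dr_pos_nm) > POSITION_FIX_LIMIT_NM

    gps_pos_nm = 0.0        # cross-track offset of the GPS-reported position
    backup_dr_pos_nm = 0.0  # cross-track offset of the NACOS DR estimate
    drift_per_hour_nm = 0.5 # the real, unobserved drift of the GPS's own DR track

    for hour in range(34):
        gps_pos_nm += drift_per_hour_nm   # the GPS quietly dead reckons away from track
        backup_dr_pos_nm = gps_pos_nm     # the "backup" is refreshed from the same data
        assert not position_fix_alarm(gps_pos_nm, backup_dr_pos_nm)

    # The comparison is between a signal and a copy of itself, so the alarm stays
    # silent no matter how large the real error grows.

The design choice illustrated here is the crucial one: a cross-check is only as independent as its inputs.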
In addition to this, there were no immediate and perceivable effects on the ship because the GPS calculated
positions using the log and the gyrocompass. It cannot be expected that a crew should become suspicious of
the fact that the ship actually is keeping her speed and course. The combination of a busy departure, an
unprecedented event (cable break) together with a nonevent (course keeping), and the change of the locus of
navigation (including the intrasystem communication difficulties) shows that it made sense, in the situation and
at the time, that the crew did not know that a mode change had occurred.
The Ocean Voyage
Even if the crew did not know about a mode change immediately after departure, there was still a long voyage
at sea ahead. Why did none of the officers check the GPS position against another source, such as the Loran-
C receiver that was placed close to the GPS? (Loran-C is a radio navigation system that relies on land-based
transmitters.) Until the very last minutes before the grounding, the ship did not act strangely and gave no
reason for suspecting that anything was amiss. It was a routine trip, the weather was good and the watches
and watch changes uneventful. Several of the officers actually did check the displays of both Loran and GPS
receivers, but only used the GPS data (because those had been more reliable in their experience) to plot
positions on the paper chart. It was virtually impossible to actually observe the implications of a difference
between Loran and GPS numbers alone. Moreover, there were other kinds of cross-checking. Every hour, the
position on the radar map was checked against the position on the paper chart, and cues in the world (e.g.,
sighting of the first buoy) were matched with GPS data. Another subtle reassurance to officers must have been
that the master on a number of occasions spent several minutes checking the position and progress of the
ship, and did not make any corrections. Before the GPS antenna was moved, the short spells of signal
degradation that led to DR mode also caused the radar map to jump around on the radar screen (the crew
called it "chopping") because the position would change erratically. The reason chopping was not observed on
this particular occasion was that the position did not change erratically, but in a manner consistent with dead
reckoning. It is entirely possible that the satellite signal was lost before the autopilot was switched on, thus
causing no shift in position. The crew had developed a strategy to deal with this occurrence in the past.
When the position-fix alarm sounded, they first changed modes (from NAV to COURSE) on the autopilot and
then they acknowledged the alarm. This had the effect of stabilizing the map on the radar screen so that it
could be used until the GPS signal returned. It was an unreliable strategy, because the map was being used
without knowing the extent of error in its positioning on the screen. It also led to the belief that, as mentioned
earlier, the only time the GPS data were unreliable was during chopping.
Chopping was more or less alleviated by moving the antenna, which means that by eliminating one problem a
new pathway for accidents was created. The strategy of using the position-fix alarm as a safeguard no longer
covered all or most of the instances of GPS unreliability. This locally efficient procedure would almost certainly
not be found in any manuals, but gained legitimacy through successful repetition, becoming common practice
over time. It may have sponsored the belief that a stable map is a good map, with the crew concentrating on
the visible signs instead of being wary of the errors hidden below the surface. The chopping problem had been
resolved for about 4 months, and trust in the automation grew.
First Buoy to Grounding
Looking at the unfolding sequence of events from the position of retrospective outsider, it is once again easy to
point to indications missed by the crew. Especially toward the end of the journey, there appears to be a larger
number of cues that could potentially have revealed the true nature of the situation. There was an inability of
the first officer to positively identify the first buoy that marked the entrance of the Boston sea lanes (such lanes
form a separation scheme delineated on the chart to keep meeting and crossing traffic at a safe distance and
to keep ships away from dangerous areas). A position error was still not suspected, even with the vessel close
to the shore. The lookouts reported red lights and later blue and white water, but the second officer did not
take any action. Smaller ships in the area broadcasted warnings on the radio, but nobody on the bridge of the
Royal Majesty interpreted those to concern their vessel. The second officer failed to see the second buoy
along the sea lanes on the radar, but told the master that it had been sighted. In hindsight, there were
numerous opportunities to avoid the grounding, which the crew consistently failed to recognize.
Such conclusions are based on a dualist interpretation of situation awareness. What matters to such an
interpretation is the accuracy of the mapping between an external world that can be pieced together in
hindsight (and that contains shopping bags full of epiphanies never opened by those who needed them most)
and people's internal representation of that world. This internal representation (or situation awareness) can be
shown to be clearly deficient, as falling far short of all the cues that were available. But making claims about
the awareness of other people at another time and place requires us to put ourselves in their shoes and limit
ourselves to what they knew. We have to find out why people thought they were in the right place, or had the
right assessment of the situation around them. What made that so? Remember, the adequacy or accuracy of
an insider's representation of the situation cannot be called into question: It is what counts for them, and it is
what drives further action in that situation. Why was it plausible for the crew to conclude that they were in the
right place? What did their world look like to them (not: How does it look now to retrospective observers)?
The first buoy ("BA") in the Boston traffic lanes was passed at 19:20 on the 10th of June, or so the chief officer
thought (the buoy identified by the first officer as the BA later turned out to be the "AR" buoy located about 15
miles to the west-southwest of the BA). To the chief officer, there was a buoy on the radar, and it was where
he expected it to be, it was where it should be. It made sense to the first officer to identify it as the correct buoy
because the echo on the radar screen coincided with the mark on the radar map that signified the BA. Radar
map and radar world matched. We now know that the overlap between radar map and radar return was a mere
stochastic fit. The map showed the BA buoy, and the radar showed a buoy return. A fascinating coincidence
was the sun glare on the ocean surface that made it impossible to visually identify the BA. But independent
cross-checking had already occurred: The first officer probably verified his position by two independent
means, the radar map and the buoy. The officer, however, was not alone in managing the situation, or in
making sense of it. An interesting aspect of automated navigation systems in real workplaces is that several
people typically use them, in partial overlap and consecutively, like the watch-keeping officers on a ship. At 20:00
the second officer took over the watch from the chief officer. The chief officer must have provided the vessel's
assumed position, as is good watch-keeping practice. The second officer had no reason to doubt that this was
a correct position. The chief officer had been at sea for 21 years, spending 30 of the last 36 months onboard
the Royal Majesty. Shortly after the takeover, the second officer reduced the radar scale from 12 to 6 nautical
miles. This is normal practice when vessels come closer to shore or other restricted waters.
By reducing the scale, there is less clutter from the shore, and an increased likelihood of seeing anomalies and
dangers. When the lookouts later reported lights, the second officer had no expectation that there was
anything wrong. To him, the vessel was safely in the traffic lane. Moreover, lookouts are liable to report
everything indiscriminately; it is always up to the officer of the watch to decide whether to take action.
There is also a cultural and hierarchical gradient between the officer and the lookouts; they come from different
nationalities and backgrounds. At this time, the master also visited the bridge and, just after he left, there was a
radio call. This escalation of work may well have distracted the second officer from considering the lookouts'
report, even if he had wanted to. After the accident investigation was concluded, it was discovered that two
Portuguese fishing vessels had been trying to call the Royal Majesty on the radio to warn her of the imminent
danger. The calls were made not long before the grounding, at which time the Royal Majesty was already 16.5
nautical miles from where the crew knew her to be. At 20:42, one of the fishing vessels called, "fishing vessel,
fishing vessel call cruise boat," on channel 16 (an international distress channel for emergencies only).
Immediately following this first call in English the two fishing vessels started talking to each other in
Portuguese. One of the fishing vessels tried to call again a little later, giving the position of the ship he was
calling. Calling on the radio without positively identifying the intended receiver can lead to mix-ups.
In this case, if the second officer heard the first English call and the ensuing conversation, he most likely
disregarded it since it seemed to be two other vessels talking to each other. Such an interpretation makes
sense: If one ship calls without identifying the intended receiver, and another ship responds and consequently
engages the first caller in conversation, the communication loop is closed. Also, as the officer was using the 6-
mile scale, he could not see the fishing vessels on his radar. If he had heard the second call and checked the
position, he might well have decided that the call was not for him, as it appeared that he was far from that
position. Whomever the fishing ships were calling, it could not have been him, because he was not there. At
about this time, the second buoy should have been seen and around 21:20 it should have been passed, but
was not. The second officer assumed that the radar map was correct when it showed that they were on
course. To him the buoy signified a position, a distance traveled in the traffic lane, and reporting that it had
been passed may have amounted to the same thing as reporting that they had passed the position it was
(supposed to have been) in. The second officer did not, at this time, experience an accumulation of
anomalies, warning him that something was going wrong. In his view, this buoy, which was perhaps missing or
not picked up by the radar, was the first anomaly, but not perceived as a significant one.
The typical Bridge Procedures Guide says that a master should be called when (a) something unexpected
happens, (b) something expected does not happen (e.g., a buoy does not appear), and (c) at any other time of
uncertainty. This is easier to write than it is to apply in practice, particularly in a case where crew members do
not see what they expected to see. The NTSB report, in typical counterfactual style, lists at least five actions
that the officer should have taken. He did not take any of these actions, because he was not missing
opportunities to avoid the grounding. He was navigating the vessel safely to Boston. The master visited the
bridge just before the radio call, telephoned the bridge about 1 hour after it, and made a second visit around
22:00. The times at which he chose to visit the bridge were calm and uneventful, and did not prompt the
second officer to voice any concerns, nor did they trigger the master's interest in more closely examining the
apparently safe handling of the ship. Five minutes before the grounding, a lookout reported blue and white
water. For the second officer, these indications alone were no reason for taking action. They were no warnings
of anything about to go amiss, because nothing was going to go amiss. The crew knew where they were.
Nothing in their situation suggested to them that they were not doing enough or that they should question the
accuracy of their awareness of the situation. At 22:20 the ship started to veer, which brought the captain to the
bridge. The second officer, still certain that they were in the traffic lane, believed that there was something
wrong with the steering. This interpretation would be consistent with his experiences of cues and indications
during the trip so far. The master, however, came to the bridge and saw the situation differently, but it was too late to correct it. The Royal Majesty ran aground east of Nantucket at 22:25, at which time she was 17 nautical miles from her planned and presumed course. None of the over 1,000 passengers were injured, but
repairs and lost revenues cost the company $7 million. With a discrepancy of 17 miles at the premature end to
the journey of the Royal Majesty, and a day and a half to discover the growing gap between actual and
intended track, the case of loss of SA, or deficient SA, looks like it is made. But the supposed elements that
make up all the cues and indications that the crew should have seen, and should have understood, are mostly
products of hindsight, products of our ability to look at the unfolding sequence of events from the position of
retrospective outsiders. In hindsight, we wonder how these repeated "opportunities to avoid the grounding,"
these repeated invitations to undergo some kind of epiphany about the real nature of the situation, were never
experienced by the people who needed them most. But the revelatory nature of the cues, as well as the
structure or coherence that they apparently have in retrospect, are not products of the situation
itself or the actors in it.
They are retrospective imports. When looked at from the position of retrospective outsider, the deficient SA
can look so very real, so compelling. They failed to notice, they did not know, they should have done this or
that. But from the point of view of people inside the situation, as well as potential other observers, these
deficiencies do not exist in and of themselves; they are artifacts of hindsight, elements removed retrospectively
from a stream of action and experience. To people on the inside, it is often nothing more than normal work. If
we want to begin to understand why it made sense for people to do what they did, we have to put ourselves in
their shoes. What did they know? What was their understanding of the situation? Rather than construing the
case as a loss of SA (which simply judges other people for not seeing what we, in our retrospective
omniscience, would have seen), there is more explanatory leverage in seeing the crew's actions as normal
processes of sensemaking of transactions between goals, observations, and actions. As Weick (1995) pointed
out, sensemaking is something that preserves plausibility and coherence, something that is reasonable and
memorable, something that embodies past experience and expectations, something that resonates with other
people, something that can be constructed retrospectively but also can be used prospectively, something that
captures both feeling and thought ... In short, what is necessary in sensemaking is a good story.
A good story holds disparate elements together long enough to energize and guide action, plausibly enough to
allow people to make retrospective sense of whatever happens, and engagingly enough that others will
contribute their own inputs in the interest of sensemaking. (p. 61)
Even if one does make concessions to the
existence of elements, as Weick does, it is only for the role they play in constructing a plausible story of what
is going on, not for building an accurate mental simile of an external world somewhere "out there."
Chapter 6 Why Do Operators Become Complacent?
The introduction of powerful automation in a variety of transport applications has increased the emphasis on
human cognitive work. Human operators on, for example, ship bridges or aircraft flight decks spend much time
integrating data, planning activities, and managing a suite of machine resources in the conduct of their tasks.
This shift has contributed to the utility of a concept such as situation awareness. One large term can capture
the extent to which operators are in tune with relevant process data and can form a picture of the system and
its progress in space and time. As the Royal Majesty example in chapter 5 showed, most high-tech
settings are actually not characterized by a single human interacting with a machine. In almost all cases,
multiple people—crews, or teams of operators—jointly interact with the automated system in pursuit of
operational objectives. These crews or teams have to coordinate their activities with those of the system in
order to achieve common goals. Despite the weight that crews (and human factors researchers) repeatedly
attribute to having a shared understanding of their system state and problems to be solved, consensus in
transportation human factors on a concept of crew situation awareness seems far off. It appears that various
labels are used interchangeably to refer to the same basic phenomenon, for example, group situation
awareness, shared problem models, team situation awareness, mutual knowledge, shared mental models,
joint situation awareness, and shared understanding. At the same time, results about what constitutes
the phenomenon are fragmented and ideas on how to measure it remain divided. Methods to gain empirical
access range from modified measures of practitioner expertise, to questionnaires interjected into suddenly
frozen simulation scenarios, to implicit probes embedded in unfolding simulations of natural task behavior.
Most critically, however, a common definition or model of crew situation awareness remains elusive. There is
human factors research, for example, that claims to identify links between crew situation awareness and other
parameters (such as planning or crew-member roles).
But such research often does not mention a definition of the phenomenon. This renders empirical
demonstrations of the phenomenon unverifiable and inconclusive. After all, how can a researcher claim that he
or she saw something if that something was not defined? Perhaps there is no need to define the phenomenon,
because everybody knows what it means. Indeed, situation awareness is what we call a folk model. It has
come up from the practitioner community (fighter pilots in this case) to indicate the degree of coupling between
human and environment. Folk models are highly useful because they can collapse complex, multidimensional
problems into simple labels that everybody can relate to. But this is also where the risks lie, certainly when
researchers pick up on a folk label and attempt to investigate and model it scientifically.
Situation awareness is not alone in this. Human factors today has more such concepts that aim to provide insight into the human performance issues that underlie complex behavioral sequences. It is often tempting to mistake the labels themselves for deeper insight, something that is becoming increasingly common in, for example, accident analyses. Thus loss of situation awareness, automation complacency, and loss of effective crew
resource management can now be found among the causal factors and conclusions in accident reports.
This happens without further specification of the psychological mechanisms responsible for the observed
behavior—much less how such mechanisms or behavior could have forced the sequence of events toward its
eventual outcome. The labels (modernist replacements of the old pilot error) are used to refer to concepts that
are intuitively meaningful. Everyone is assumed to understand or implicitly agree on them, yet no effort is
usually made to explicate or reach agreement on the underlying mechanisms or precise definitions. People
may no longer dare to ask what these labels mean, lest others suspect they are not really initiated in the
particulars of their business.
Indeed, large labels that correspond roughly to mental phenomena we know from daily life are deemed
sufficient—they need no further explanation. This is often accepted practice for psychological phenomena
because as humans we all have privileged knowledge about how the mind works (because we all have one).
However, a verifiable and detailed mapping between the context-specific (and measurable) particulars of a
behavior on the one hand and a concept-dependent model on the other is not achieved—the jump from
context specifics (somebody flying into a mountainside) to concept dependence (the operator must have lost
SA) is immune to critique or verification.
Folk models are not necessarily incorrect, but compared to articulated models they focus on descriptions
rather than explanations, and they are very hard to prove wrong. Folk models are pervasive in the history of
science. One well-known example of a folk model from modern times is Freud's psychodynamic model, which
links observable behavior and emotions to nonobservable structures (id, ego, superego) and their interactions.
One feature of folk models is that nonobservable constructs are endowed with the necessary causal power
without much specification of the mechanism responsible for such causation. According to Kern (1998), for
example, complacency can cause a loss of situation awareness. In other words, one folk problem causes
another folk problem. Such assertions leave few people any wiser. Because both folk problems are constructs
postulated by outside observers (and mostly post hoc), they cannot logically cause anything in the empirical
world. Yet this is precisely what they are assumed to be capable of. In wrapping up a conference on situation
awareness, Charles Billings warned against this danger in 1996:
The most serious shortcoming of the situation awareness construct as we have thought about it to date,
however, is that it's too neat, too holistic and too seductive. We heard here that deficient SA was a causal
factor in many airline accidents associated with human error.
We must avoid this trap: deficient situation awareness doesn't "cause" anything. Faulty spatial perception,
diverted attention, inability to acquire data in the time available, deficient decision-making, perhaps, but not a deficient abstraction! (p. 3)
What Billings did not mention is that "diverted attention" and "deficient decision-making" themselves are abstractions at some level (and post hoc ones at that). They are nevertheless less
contentious because they provide a reasonable level of detail in their description of the psychological
mechanisms that account for their causation. Situation awareness is too "neat" and "holistic" in the sense that
it lacks such a level of detail and thus fails to account for a psychological mechanism that would connect
features of the sequence of events to the outcome. The folk model, however, was coined precisely because
practitioners (pilots) wanted something "neat" and "holistic" that could capture critical but inexplicit aspects of
their performance in complex, dynamic situations. We have to see their use of a folk model as legitimate.
It can fulfill a useful function with respect to the concerns and goals of a user community.
This does not mean that the concepts coined by users can be taken up and causally manipulated by scientists
without serious foundational analysis and explication of their meaning. Resisting the temptation, however, can
be difficult. After all, human factors is a discipline that lives by its applied usefulness. If the discipline does not
generate anything of interest to applied communities, then why would they bother funding the work? In this
sense, folk models can seem like a wonderfully convenient bridge between basic and applied worlds, between
scientific and practitioner communities. Terms like situation awareness allow both camps to speak the same
language.
But such conceptual sharing risks selling out to superficial validity. It may not do human factors a lot of good in
the long run, nor may it really benefit the practitioner consumers of research results. Another folk concept is
complacency. Why does people's vigilance decline over time, especially when confronted with repetitive
stimuli? Vigilance decrements have formed an interesting research problem ever since the birth of human
factors during and just after the Second World War.
The idea of complacency has always been related to vigilance problems. Although complacency connotes
something motivational (people must ensure that they watch the process carefully), the human factors
literature actually has little in the way of explanation or definition. What is complacency? Why does it occur?
If you want answers to these questions, do not turn to the human factors literature. You will not find answers
there.
Complacency is one of those constructs whose meaning is assumed to be known by everyone. This justifies
taking it up in scientific discourse as something that can be manipulated or studied as an independent or
dependent variable, without having to go through the bother of defining what it actually is or how it works. In
other words, complacency makes a "neat" and "holistic" case for studying folk models.
DEFINITION BY SUBSTITUTION
The most evident characteristic of folk models is that they define their central constructs by substitution rather
than decomposition. A folk concept is explained simply by referring to another phenomenon or construct that
itself is in equal need of explanation. Substitution is not the same as decomposition:
Substituting replaces one high-level label with another, whereas decomposition takes the analysis down into
subsequent levels of greater detail, which transform the high-level concept into increasingly measurable
context specifics. A good example of definition by substitution is the label complacency, in relation to the
problems observed on automated flight decks. Most textbooks on aviation human factors talk about
complacency and even endow it with causal power, but none really define (i.e., decompose) it:
• According to Wiener (1988, p. 452), "boredom and complacency are often mentioned" in connection with the
out-of-the-loop issue in automated cockpits. But whether complacency causes an out-of-the-loop condition
or whether it is the other way around is left unanswered.
• O'Hare and Roscoe (1990, p. 117) stated that "because autopilots have proved extremely reliable, pilots tend
to become complacent and fail to monitor them." Complacency, in other words, is invoked to explain failures to monitor.
• Kern (1998, p. 240) maintained that "as pilots perform duties as system monitors they will be lulled into
complacency, lose situational awareness, and not be prepared to react in a timely manner when the system
fails." Thus, complacency can cause a loss of situational awareness. But how this occurs is left to the imagination.
• On the same page in their textbook, Campbell and Bagshaw (1991, p. 126) said that complacency is both a
"trait that can lead to a reduced awareness of danger," and a "state of confidence plus contentment" (emphasis
added). In other words, complacency is at the same time a long-lasting, enduring feature of personality (a trait)
and a shorter lived, transient phase in performance (a state).
• For the purpose of categorizing incident reports, Parasuraman, Molloy, and Singh (1993, p. 3) defined
complacency as: "self-satisfaction which may result in non-vigilance based on an unjustified assumption of
satisfactory system state." This is part definition but also part substitution: Self-satisfaction takes the place of
complacency and is assumed to speak for itself. There is no need to make explicit by which psychological
mechanism self-satisfaction arises or how it produces nonvigilance.
It is in fact difficult to find real content on complacency in the human factors literature. The phenomenon is
often described or mentioned in relation to some deviation or diversion from official guidance (people should
coordinate, double-check, look, but they do not), which is both normativist and judgmental. The "unjustified
assumption of satisfactory system state" in Parasuraman et al.'s (1993) definition is emblematic of human
factors' understanding of work by reference to externally dictated norms. If we want to understand
complacency, the whole point is to analyze why the assumption of satisfactory system state is justified (not
unjustified) by those who are making that assumption. If it were unjustified, and they knew that, they would not
make the assumption and would consequently not become complacent. Saying that an assumption of
satisfactory system state is unjustified (but people still keep making it—they must be motivationally deficient)
does not explain much at all. None of the above examples really provide a definition of complacency.
Instead, complacency is treated as self-evident (everybody knows what it means, right?) and thus it can be
defined by substituting one label for another. The human factors literature equates complacency with many
different labels, including boredom, overconfidence, contentment, unwarranted faith, overreliance, self-
satisfaction, and even a low index of suspicion. So if we were to ask, "What do you mean by 'complacency'?" and the reply is, "Well, it is self-satisfaction," we might be expected to say, "Oh, of course, now I understand what you mean." But
do we really? Explanation by substitution actually raises more questions than it answers. By failing to propose
an articulated psychological mechanism responsible for the behavior observed, we are left to wonder. How is it
that complacency produces vigilance decrements or how is it that complacency leads to a loss of situation
awareness? The explanation could be a decay of neurological connections, fluctuations in learning and
motivation, or a conscious trade-off between competing goals in a changing environment. Such definitions,
which begin to operationalize the large concept of complacency, suggest possible probes that a researcher
could use to monitor for the target effect. But because none of the descriptions of complacency available today
offer any such roads to insight, claims that complacency was at the heart of a sequence of events are
immune to critique and falsification.
IMMUNITY AGAINST FALSIFICATION
Most philosophies of science rely on the empirical world as touchstone or ultimate arbiter (a reality check) for
postulated theories. Following Popper's rejection of the inductive method in the empirical sciences, theories and hypotheses can only be tested deductively, by attempts at falsification.
This usually involves some form of empirical testing to look for exceptions to the postulated hypothesis, where
the absence of contradictory evidence becomes corroboration of the theory. Falsification deals with the central
weakness of the inductive method of verification, which, as pointed out by David Hume, requires an infinite
number of confirming empirical demonstrations. Falsification, on the other hand, can work on the basis of only
one empirical instance, which proves the theory wrong. As seen in chapter 3, this is of course a highly
idealized, almost clinical conceptualization of the scientific enterprise. Yet, regardless, theories that do not
permit falsification at all are highly suspect.
The resistance of folk models to falsification is known as immunization.
Folk models leave assertions about empirical reality underspecified, without a trace for others to follow or
critique. For example, a senior training captain once asserted that cockpit discipline is compromised when any
of the following attitudes are prevalent: arrogance, complacency, and overconfidence.
Nobody can disagree because the assertion is underspecified and therefore immune against falsification. This
is similar to psychoanalysts claiming that obsessive-compulsive disorders are the result of overly harsh toilet
training that fixated the individual in the anal stage. In the same vein, if the question of "Where are we
headed?" from one pilot to the other is interpreted as a loss of situation awareness (Aeronautica Civil, 1996),
this claim is immune against falsification. The journey from context-specific behavior (people asking questions)
to the postulated psychological mechanism (loss of situation awareness) is made in one big leap, leaving no
trace for others to follow or critique.
Current theories of situation awareness are not sufficiently articulated to be able to explain why asking
questions about direction represents a loss of situation awareness. Some theories may superficially appear to
have the characteristics of good scientific models, yet just below the surface they lack an articulated
mechanism that is amenable to falsification. Although falsifiability may at first seem like a self-defeating
criterion for scientific progress, the opposite is true: The most falsifiable models are usually also the most
informative ones, in the sense that they make stronger and more demonstrable claims about reality. In other
words, falsifiability and informativeness are two sides of the same coin.
Folk Models Versus Young and Promising Models
One risk in rejecting folk models is that the baby is thrown out with the bath water. In other words, there is the
risk of rejecting even those models that may be able to generate useful empirical results, if only given the time
and opportunity to do so. Indeed, the more articulated human factors constructs (such as decision making,
diagnosis) are distinguished from the less articulated ones (situation awareness, complacency) in part by their
maturity, by how long they have been around in the discipline. What opportunity should the younger ones
receive before being rejected as unproductive? The answer to this question hinges, once again, on falsifiability.
Ideal progress in science is described as the succession of theories, each of which is more falsifiable (and thus
more informative) than the one before it. Yet when we assess loss of situation awareness or complacency as
more novel explanations of phenomena that were previously covered by other explanations, it is easy to see
that falsifiability has actually decreased, rather than increased. Take as an example an automation-related
accident that occurred in 1973, when situation awareness and automation-induced complacency did not yet exist as labels.
The aircraft in question was on approach in rapidly changing weather conditions. It was equipped with a
slightly deficient flight director (a device on the central instrument panel showing the pilot where to go, based
on an unseen variety of sensory inputs), which the captain of the airplane distrusted. The airplane struck a
seawall bounding Boston's Logan Airport about 1 kilometer short of the runway and slightly to the side of it,
killing all 89 people onboard. In its comment on the crash, the National Transportation Safety Board explained
how an accumulation of discrepancies, none critical in themselves, can rapidly deteriorate into a high-risk
situation without positive flight management. The first officer, who was flying, was preoccupied with the
information presented by his flight-director systems, to the detriment of his attention to altitude, heading and
airspeed control (NTSB, 1974).
Today, both automation-induced complacency of the first officer and a loss of situation awareness of the entire
crew could likely be cited under the causes of this crash. (Actually, that the same set of empirical phenomena
can comfortably be grouped under either label—complacency or loss of situation awareness—is additional
testimony to the undifferentiated and underspecified nature of these concepts.)
These supposed explanations (complacency, loss of situation awareness) were obviously not needed in
1974 to deal with this accident. The analysis left us instead with more detailed, more falsifiable, and more
traceable assertions that linked features of the situation (e.g., an accumulation of discrepancies) with
measurable or demonstrable aspects of human performance (diversion of attention to the flight director vs.
other sources of data). The decrease of falsifiability represented by complacency and situation awareness as
hypothetical contenders in explaining this crash represents the inverse of scientific progress, and therefore
argues for the rejection of such novel concepts.
OVERGENERALIZATION
The lack of specificity of folk models and the inability to falsify them contribute to their overgeneralization. One
famous example of overgeneralization in psychology is the inverted-U curve, also known as the Yerkes-
Dodson law. Ubiquitous in human factors textbooks, the inverted-U curve couples arousal with performance
(without clearly stating any units of either arousal or performance), where a person's best performance is
claimed to occur between too much arousal (or stress) and too little, tracing a sort of parabola. The original
experiments were, however, neither about performance nor about arousal (Yerkes & Dodson, 1908). They
were not even about humans. Examining "the relation between stimulus strength and habit formation," the
researchers subjected laboratory rats to electrical shocks to see how quickly they decided to take a particular
pathway versus another. The conclusion was that rats learn best (that is, they form habits most rapidly) at any
but the highest or lowest shock. The results approximated an inverted U only with a most generous curve
fitting, the x axis was never defined in psychological terms but in terms of shock strength, and even this was
confounded: Yerkes and Dodson used different levels of shock which were too poorly calibrated to know how
different they really were. The subsequent overgeneralization of the Yerkes-Dodson results (through no fault
of their own, incidentally) has confounded stress and arousal, and after a century there is still little evidence
that any kind of inverted-U relationship holds for stress (or arousal) and human performance.
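To make the curve-fitting point concrete, here is a short Python sketch with invented numbers (not the 1908 data): a quadratic has three free parameters, so it will pass exactly through any three data points. An inverted U obtained this way is guaranteed by the choice of model, not demonstrated by the data.

import numpy as np

# Hypothetical habit-formation scores at three arbitrary "shock strength" levels.
# These are illustrative numbers only, not Yerkes and Dodson's data, and the units
# are deliberately left undefined, just as they are in the popularized "law."
shock = np.array([1.0, 2.0, 3.0])
performance = np.array([4.0, 9.0, 5.0])

# Three points, three coefficients: the fit is exact, and because the middle point
# happens to be highest, the fitted curve is an inverted U by construction.
a, b, c = np.polyfit(shock, performance, deg=2)
print(f"fitted curve: {a:.2f}x^2 + {b:.2f}x + {c:.2f}")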
Overgeneralizations take narrow laboratory findings and apply them uncritically to any broad situation where
behavioral particulars bear some prima-facie resemblance to the phenomenon that was investigated under
controlled circumstances. Other examples of overgeneralization and overapplication include perceptual
tunneling (putatively exhibited by the crew of an airliner that descended into the Everglades after its
autopilot was inadvertently switched off) and the loss of effective Crew Resource Management (CRM) as
major explanations of accidents (e.g., Aeronautica Civil, 1996). A most frequently quoted sequence of events
with respect to CRM is the flight of an iced-up airliner from Washington National Airport in the winter of 1982
that ended shortly after takeoff on the 14th Street bridge and in the Potomac River.
The basic cause of the accident is said to be the copilot's unassertive remarks about an irregular engine
instrument reading (despite the fact that the copilot was known for his assertiveness). This supposed
explanation hides many other factors which might be more relevant, including air-traffic control pressures, the
controversy surrounding rejected takeoffs close to decision speed, the sensitivity of the aircraft type to icing
and its pitch-up tendency with even little ice on the slats (devices on the wing's leading edge that help it fly at
slow speeds), and ambiguous engineering language in the airplane manual to describe the conditions for use
of engine anti-ice. In an effort to explain complex behavior, and still make a connection to the applied worlds
to which it owes its existence, transportation human factors may be doing itself a disservice by inventing and
uncritically using folk models. If we use models that do not articulate the performance measures that can be
used in the particular contexts that we want to speak about, we can make no progress in better understanding
the sources of success and failure in our operational environments.
Chapter 7
Why Don't They Follow
the Procedures?
People do not always follow procedures. We can easily observe this when
watching people at work, and managers, supervisors and regulators (or anybody
else responsible for safe outcomes of work) often consider it to be a
large practical problem. In chapter 6 we saw how complacency would be a
very unsatisfactory label for explaining practical drift away from written
guidance. But what lies behind it then?
In hindsight, after a mishap, rule violations seem to play such a dominant
causal role. If only they had followed the procedure! Studies keep returning
the basic finding that procedure violations precede accidents. For
example, an analysis carried out for an aircraft manufacturer identified "pilot
deviation from basic operational procedure" as a primary factor in almost
100 accidents (Lautman & Gallimore, 1987, p. 2). One methodological
problem with such work is that it selects its cases on the dependent variable
(the accident), thereby generating tautologies rather than findings. But
performance variations, especially those at odds with written guidance, easily
get overestimated for their role in the sequence of events:
The interpretation of what happened may then be distorted by naturalistic biases
to overestimate the possible causal role of unofficial action or procedural
violation. . . . While it is possible to show that violations of procedures are involved
in many safety events, many violations of procedures are not, and indeed
some violations (strictly interpreted) appear to represent more effective
ways of working. (McDonald, Corrigan, & Ward, 2002, pp. 3-5)
As seen in chapter 4, hindsight turns complex, tangled histories laced with
uncertainty and pressure into neat, linear anecdotes with obvious choices.
What look like violations from the outside and hindsight are often actions
that make sense given the pressures and trade-offs that exist on the inside
of real work. Finding procedure violations as causes or contributors to mishaps,
in other words, says more about us, and the biases we introduce when
looking back on a sequence of events, than it does about people who were
doing actual work at the time.
Yet if procedure violations are judged to be such a large ingredient of
mishaps, then it can be tempting, in the wake of failure, to introduce even
more procedures, or to change existing ones, or to enforce stricter compliance.
For example, shortly after a fatal shootdown of two U.S. Black Hawk
helicopters over Northern Iraq by U.S. fighter jets, "higher headquarters in
Europe dispatched a sweeping set of rules in documents several inches
thick to 'absolutely guarantee' that whatever caused this tragedy would
never happen again" (Snook, 2000, p. 201). It is a common, but not typically
satisfactory, reaction. Introducing more procedures does not necessarily
avoid the next incident, nor do exhortations to follow rules more carefully
necessarily increase compliance or enhance safety. In the end, a
mismatch between procedures and practice is not unique to accident sequences.
Not following procedures does not necessarily lead to trouble,
and safe outcomes may be preceded by just as many procedural deviations
as accidents are.
PROCEDURE APPLICATION AS RULE-FOLLOWING
When rules are violated, are these bad people ignoring the rules? Or are
these bad rules, ill matched to the demands of real work? To be sure, procedures,
with the aim of standardization, can play an important role in
shaping safe practice. Commercial aviation is often held up as a prime example
of the powerful effect of standardization on safety. But there is a
deeper, more complex dynamic where real practice is continually adrift
from official written guidance, settling at times, unsettled and shifting at
others. There is a deeper, more complex interplay whereby practice sometimes
precedes and defines the rules rather than being defined by them.
In those cases, is a violation an expression of defiance, or an expression of
compliance—people following practical rules rather than official, impractical
ones?
These possibilities lie between two opposing models of what procedures
mean, and what they in turn mean for safety. These models of procedures
guide how organizations think about making progress on safety. The first
model is based on the notion that not following procedures can lead to unsafe
situations. These are its premises:
• Procedures represent the best thought-out, and thus the safest, way to
carry out a job.
• Procedure following is mostly simple IF-THEN rule-based mental activity:
IF this situation occurs, THEN this algorithm (e.g., checklist) applies (a caricature of this view is sketched just after this list).
• Safety results from people following procedures.
• For progress on safety, organizations must invest in people's knowledge
of procedures and ensure that procedures are followed.
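The premises above can be caricatured in a few lines of Python. The sketch below is not drawn from any real operation; the situation labels and checklist items are hypothetical. It only makes explicit what the first model assumes procedure following to be: a lookup from situation to algorithm, with no further cognitive work required.

# A minimal caricature of the first model: procedure following as IF-THEN rule application.
# Situation names and checklist steps are invented for illustration.
CHECKLISTS = {
    "engine_fire": ["thrust levers idle", "fuel control switch off", "fire handle pull"],
    "before_landing": ["hydraulic pumps on", "gear down", "flaps set"],
}

def follow_procedure(situation):
    # IF this situation occurs, THEN this algorithm applies; nothing else to decide.
    return CHECKLISTS[situation]

print(follow_procedure("before_landing"))

The rest of this chapter argues that real procedure application rarely reduces to such a lookup.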
In this idea of procedures, those who violate them are often depicted as putting
themselves above the law. These people may think that rules and procedures
are made for others, but not for them, as they know how to really do the job.
This idea of rules and procedures suggests that there is something exceptionalist
or misguidedly elitist about those who choose not to follow the
rules. After a maintenance-related mishap, for example, investigators found
that "the engineers who carried out the flap change demonstrated a willingness
to work around difficulties without reference to the design authority, including
situations where compliance with the maintenance manual could
not be achieved" (Joint Aviation Authorities, 2001). The engineers demonstrated
a "willingness." Such terminology embodies notions of volition (the
engineers had a free choice either to comply or not) and full rationality (they
knew what they were doing). They violated willingly. Violators are wrong, because
rules and procedures prescribe the best, safest way to do a job, independent
of who does that job. Rules and procedures are for everyone.
Such characterizations are naive at best, and always misleading. If you
know where to look, daily practice is testimony to the ambiguity of procedures,
and evidence that procedures are a rather problematic category of
human work. First, real work takes place in a context of limited resources
and multiple goals and pressures. Procedures assume that there is time to
do them in, certainty (of what the situation is), and sufficient information
available (e.g., about whether tasks are accomplished according to the procedure).
This already keeps rules at a distance from actual tasks, because
real work seldom meets those criteria. Work-to-rule strikes show how it can
be impossible to follow the rules and get the job done at the same time.
Aviation line maintenance is emblematic: A job-perception gap exists
where supervisors are convinced that safety and success result from mechanics
following procedures—a sign-off means that applicable procedures
were followed. But mechanics may encounter problems for which the right
tools or parts are not at hand; the aircraft may be parked far away from
base. Or there may be too little time: Aircraft with a considerable number
of problems may have to be turned around for the next flight within half an
hour. Mechanics, consequently, see success as the result of their evolved
skills at adapting, inventing, compromising, and improvising in the face of
local pressures and challenges on the line—a sign-off means the job was accomplished
in spite of resource limitations, organizational dilemmas, and
pressures. Those mechanics who are most adept are valued for their productive
capacity even by higher organizational levels. Unacknowledged by
those levels, though, are the vast informal work systems that develop so mechanics
can get work done, advance their skills at improvising and
satisficing, impart them to one another, and condense them in unofficial,
self-made documentation (McDonald et al., 2002). Seen from the outside, a
defining characteristic of such informal work systems would be routine
nonconformity. But from the inside, the same behavior is a mark of expertise,
fueled by professional and interpeer pride. And of course, informal
work systems emerge and thrive in the first place because procedures are inadequate
to cope with local challenges and surprises, and because procedures'
conception of work collides with the scarcity, pressure and multiple
goals of real work.
Some of the safest complex, dynamic work not only occurs despite the
procedures—such as aircraft line maintenance—but without procedures altogether.
Rochlin et al. (1987, p. 79), commenting on the introduction of
ever heavier and more capable aircraft onto naval aircraft carriers, noted that
"there were no books on the integration of this new hardware into existing
routines and no other place to practice it but at sea. Moreover, little of the
process was written down, so that the ship in operation is the only reliable
manual." Work is "neither standardized across ships nor, in fact, written
down systematically and formally anywhere." Yet naval aircraft carriers, with
inherent high-risk operations, have a remarkable safety record, like other
so-called high-reliability organizations (Rochlin, 1999; Rochlin, LaPorte, &
Roberts, 1987). Documentation cannot present any close relationship to situated
action because of the unlimited uncertainty and ambiguity involved
in the activity. Especially where normal work mirrors the uncertainty and
criticality of emergencies, rules emerge from practice and experience
rather than preceding it. Procedures, in other words, end up following
work instead of specifying action beforehand. Human factors has so far
been unable to trace and model such coevolution of human and system, of
work and rules. Instead, it has typically imposed a mechanistic, static view of
one best practice from the top down.
Procedure-following can also be antithetical to safety. In the 1949 U.S.
Mann Gulch disaster, firefighters who perished were the ones sticking to
the organizational mandate to carry their tools everywhere (Weick, 1993).
In this case, as in others (e.g., Carley, 1999), people faced the choice between
following the procedure or surviving.
Procedures Are Limited in Rationalizing Human Work
This, then, is the tension. Procedures are seen as an investment in safety—
but it turns out that they are not always. Procedures are thought to be required
to achieve safe practice—yet they are not always necessary, nor likely
ever sufficient for creating safety. Procedures spell out how to do the job
safely—yet following all the procedures can lead to an inability to get the
job done. Though a considerable practical problem, such tensions are
underreported and underanalyzed in the human factors literature.
There is always a distance between a written rule and an actual task. This
distance needs to be bridged; the gap must be closed, and the only thing
that can close it is human interpretation and application. Ethnographer Ed
Hutchins has pointed out how procedures are not just externalized cognitive
tasks (Wright & McCarthy, 2003). Externalizing a cognitive task would
transplant it from the head to the world, for example onto a checklist.
Rather, following a procedure requires cognitive tasks that are not specified
in the procedure; transforming the written procedure into activity requires
cognitive work. Procedures are inevitably incomplete specifications of action:
They contain abstract descriptions of objects and actions that relate
only loosely to particular objects and actions that are encountered in the actual
situation (Suchman, 1987). Take as an example the lubrication of the
jackscrew on MD-80s from chapter 2—something that was done incompletely
and at increasingly greater intervals before the crash of Alaska 261.
This is part of the written procedure that describes how the lubrication
work should be done (NTSB, 2002, pp. 29-30):
A. Open access doors 6307, 6308, 6306 and 6309
B. Lube per the following . . .
3. JACKSCREW
Apply light coat of grease to threads, then operate mechanism through
full range of travel to distribute lubricant over length of jackscrew.
C. Close doors 6307, 6308, 6306 and 6309
This leaves a lot to the imagination, or to the mechanic's initiative. How
much is a "light" coat? Do you do apply the grease with a brush (if a "light
coat" is what you need), or do you pump it onto the parts directly with the
grease gun? How often should the mechanism (jackscrew plus nut) be operated
through its full range of travel during the lubrication procedure?
None of this is specified in the written guidance. It is little wonder that:
Investigators observed that different methods were used by maintenance personnel
to accomplish certain steps in the lubrication procedure, including
the manner in which grease was applied to the acme nut fitting and the acme
screw and the number of times the trim system was cycled to distribute the
grease immediately after its application. (NTSB, 2002, p. 116)
In addition, actually carrying out the work is difficult enough. As noted
in chapter 2, the access panels of the horizontal stabilizer were just large
enough to allow a hand through, which would then block the view of anything
that went on inside. As a mechanic, you can either look at what you
have to do or what you have just done, or actually do it. You cannot do both
at the same time, because the access doors are too small. This makes judgments
about how well the work is being done rather difficult. Investigators
discovered as much when they interviewed the mechanic responsible
for the last lubrication of the accident airplane: "When asked how he determined
whether the lubrication was being accomplished properly and when
to stop pumping the grease gun, the mechanic responded, 'I don't' "
(NTSB, 2002, p. 31).
The time the lubrication procedure took was also unclear, as there was
ambiguity about which steps were included in the procedure. Where does
the procedure begin and where does it end, after access has been created to
the area, or before? And is closing the panels part of it as well, as far as time
estimates are concerned? Having heard that the entire lubrication process
takes "a couple of hours," investigators learned from the mechanic of the
accident airplane that:
the lubrication task took "roughly. . . probably an hour" to accomplish. It was
not entirely clear from his testimony whether he was including removal of the
access panels in his estimate. When asked whether his 1-hour estimate included
gaining access to the area, he replied, "No, that would probably take a
little—well, you've got probably a dozen screws to take out of the one panel,
so that's—I wouldn't think any more than an hour." The questioner then
stated, "including access?," and the mechanic responded, 'Yeah." (NTSB,
2002, p. 32)
As the procedure for lubricating the MD-80 jackscrew indicates, and McDonald
et al. (2002) remind us, formal documentation cannot be relied
on, nor is it normally available in a way which supports a close relationship
to action. There is a distinction between universalistic and particularistic
rules: Universalistic rules are very general prescriptions (e.g., "Apply light
coat of grease to threads"), but remain at a distance from their actual appli-
cation. In fact, all universalistic rules or general prescriptions develop into
particularistic rules as experience accumulates. With experience, people
encounter the conditions under which universalistic rules need to be applied,
and become increasingly able to specify those conditions. As a result,
universalistic rules assume appropriate local expressions through practice.
Wright and McCarthy (2003) have pointed out that procedures come
out of the scientific management tradition, where their main purpose was a
minimization of human variability, maximization of predictability, a rationalization
of work. Aviation contains a strong heritage: Procedures in commercial
aviation represent and allow a routinization that makes it possible
to conduct safety-critical work with perfect strangers. Procedures are a substitute
for knowing coworkers. The actions of a copilot are predictable not
because the copilot is known (in fact, you may never have flown with him or
her), but because the procedures make them predictable. Without such
standardization it would be impossible to cooperate safely and smoothly
with unknown people.
In the spirit of scientific management, human factors also assumes that
order and stability in operational systems are achieved rationally, mechanistically,
and that control is implemented vertically (e.g., through task analyses
that produce prescriptions of work to be carried out). In addition, the
strong influence of information-processing psychology on human factors
has reinforced the idea of procedures as IF-THEN rule following, where
procedures are akin to a program in a computer that in turn serves as input
signals to the human information processor. The algorithm specified by
the procedure becomes the software on which the human processor runs.
But it is not that simple. Following procedures in the sense of applying
them in practice requires more intelligence. It requires additional cognitive
work. This brings us to the second model of procedures and safety.
PROCEDURE APPLICATION AS SUBSTANTIVE
COGNITIVE ACTIVITY
People at work must interpret procedures with respect to a collection of actions
and circumstances that the procedures themselves can never fully
specify (e.g., Suchman, 1987). In other words, procedures are not the work
itself. Work, especially that in complex, dynamic workplaces, often requires
subtle, local judgments with regard to timing of subtasks, relevance, importance,
prioritization, and so forth. For example, there is no technical reason
why a before-landing checklist in a commercial aircraft could not be automated.
The kinds of items on such a checklist (e.g., hydraulic pumps,
gear, flaps) are mostly mechanical and could be activated on the basis of
predetermined logic without having to rely on, or constantly remind, a hu-
man to do so. Yet no before-landing checklist is fully automated today. The
reason is that approaches for landing differ—they can differ in terms of
timing, workload, or other priorities. Indeed, the reason is that the checklist
is not the job itself. The checklist is, to repeat Suchman, a resource for
action; it is one way for people to help structure activities across roughly
similar yet subtly different situations. Variability in this is inevitable. Circumstances
change, or are not as foreseen by those who designed the procedures.
Safety, then, is not the result of rote rule following; it is the result
of people's insight into the features of situations that demand certain actions,
and people being skillful at finding and using a variety of resources
(including written guidance) to accomplish their goals. This suggests a second
model of procedures and safety (a contrasting sketch follows the list below):
• Procedures are resources for action. Procedures do not specify all circumstances
to which they apply. Procedures cannot dictate their own
application.
• Applying procedures successfully across situations can be a substantive
and skillful cognitive activity.
• Procedures cannot, in themselves, guarantee safety. Safety results from
people being skillful at judging when and how (and when not) to
adapt procedures to local circumstances.
• For progress on safety, organizations must monitor and understand
the reasons behind the gap between procedures and practice. Additionally,
organizations must develop ways that support people's skill at
judging when and how to adapt.
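In contrast with the lookup caricature of the first model, the hypothetical Python fragment below gestures at what the second model claims: the same checklist items exist, but when and how they are applied depends on circumstances (timing, workload, other priorities) that the checklist itself cannot specify. The context names and thresholds are invented for the sketch.

# Hypothetical illustration of the second model: the checklist as a resource for action.
BEFORE_LANDING = ["hydraulic pumps on", "gear down", "flaps set"]

def apply_checklist(items, context):
    # Judge when and how to run the items, given local circumstances.
    if context["minutes_to_touchdown"] < 2 and context["workload"] == "high":
        # Adapt: do the flight-critical items now, defer or interleave the rest.
        return [item for item in items if item in ("gear down", "flaps set")]
    return list(items)  # an unhurried approach: run the checklist as written

print(apply_checklist(BEFORE_LANDING, {"minutes_to_touchdown": 1.5, "workload": "high"}))

The point of the sketch is not the particular rule, but that some judgment about context has to live outside the checklist itself.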
Procedures and Unusual Situations
Although there is always a distance between the logics dictated in written
guidance and real actions to be taken in the world, prespecified guidance is
especially inadequate in the face of novelty and uncertainty. Adapting procedures
to fit unusual circumstances is a substantive cognitive activity. Take
for instance the crash of a large passenger aircraft near Halifax, Nova Scotia
in 1998. After an uneventful departure, a burning smell was detected and,
not much later, smoke was reported inside the cockpit. Carley (1999) characterized
the two pilots as respective embodiments of the models of procedures
and safety: The copilot preferred a rapid descent and suggested
dumping fuel early so that the aircraft would not be too heavy to land. But
the captain told the copilot, who was flying the plane, not to descend too
fast, and insisted they cover applicable procedures (checklists) for dealing
with smoke and fire. The captain delayed a decision on dumping fuel. With
the fire developing, the aircraft became uncontrollable and crashed into
the sea, taking all 229 lives onboard with it. There were many good reasons
for not immediately diverting to Halifax: Neither pilot was familiar with the
airport, they would have to fly an approach procedure that they were not
very proficient at, applicable charts and information on the airport were
not easily available, and an extensive meal service had just been started in
the cabin.
Yet, part of the example illustrates a fundamental double bind for those
who encounter surprise and have to apply procedures in practice (Woods &
Shattuck, 2000):
• If rote rule following persists in the face of cues that suggest procedures
should be adapted, this may lead to unsafe outcomes. People can
get blamed for their inflexibility, their application of rules without sensitivity
to context.
• If adaptations to unanticipated conditions are attempted without complete
knowledge of circumstance or certainty of outcome, unsafe results
may occur too. In this case, people get blamed for their deviations,
their nonadherence.
In other words, people can fail to adapt, or attempt adaptations that may
fail. Rule following can become a desynchronized and increasingly irrelevant
activity, decoupled from how events and breakdowns are really unfolding
and multiplying throughout a system. In the Halifax crash, as is often
the case, there was uncertainty about the very need for adaptations (How
badly ailing was the aircraft, really?) as well as uncertainty about the effect
and safety of adapting: How much time would the crew have to change
their plans? Could they skip fuel dumping and still attempt a landing? Potential
adaptations, and the ability to project their potential for success,
were not necessarily supported by specific training or overall professional
indoctrination. Civil aviation, after all, tends to emphasize the first
model: Stick with procedures and you will most likely be safe (e.g., Lautman
& Gallimore, 1987).
Tightening procedural adherence, through threats of punishment or
other supervisory interventions, does not remove the double bind. In fact,
it may tighten the double bind—making it more difficult for people to develop
judgment of how and when to adapt. Increasing the pressure to comply
increases the probability of failures to adapt—compelling people to
adopt a more conservative response criterion. People will require more evidence
for the need to adapt, which takes time, and time may be scarce in
cases that call for adaptation (as in the aforementioned case). Merely stressing
the importance of following procedures can increase the number of
cases in which people fail to adapt in the face of surprise.
Letting people adapt without adequate skill or preparation, on the other
hand, can increase the number of failed adaptations. One way out of the
double bind is to develop people's skill at adapting. This means giving them
the ability to balance the risks between the two possible types of failure: failing
to adapt or attempting adaptations that may fail. It requires the development
of judgment about local conditions and the opportunities and risks
they present, as well as an awareness of larger goals and constraints that operate
on the situation. Development of this skill could be construed, to
paraphrase Rochlin (1999), as planning for surprise. Indeed, as Rochlin (p.
1549) observed, the culture of safety in high-reliability organizations anticipates
and plans for possible failures in "the continuing expectation of future
surprise."
Progress on safety also hinges on how an organization responds in the
wake of failure (or even the threat of failure). Post-mortems can quickly reveal
a gap between procedures and local practice, and hindsight inflates the
causal role played by unofficial action (McDonald et al., 2002). The response,
then, is often to try to forcibly close the gap between procedures
and practice, by issuing more procedures or policing practice more closely.
The role of informal patterns of behavior, and what they represent (e.g., resource
constraints, organizational deficiencies or managerial ignorance,
countervailing goals, peer pressure, professionalism and perhaps even better
ways of working) all go misunderstood. Real practice, as done in the vast informal
work systems, is driven and kept underground. Even though failures
offer each sociotechnical system an opportunity for critical self-examination,
accident stories are developed in which procedural deviations play a major,
evil role, and are branded as deviant and causal. The official reading of how
the system works or is supposed to work is once again re-invented: Rules
mean safety, and people should follow them. High-reliability organizations,
in contrast, distinguish themselves by their constant investment in trying to
monitor and understand the gap between procedures and practice. Their
reflex is not to try to close the gap, but to understand why it exists.
Such understanding provides insight into the grounds for informal patterns
of activity and opens ways to improve safety by sensitivity to people's
local operational context.
The Regulator: From Police to Partner
That there is always a tension between centralized guidance and local practice
creates a clear dilemma for those tasked with regulating safety-critical
industries. The dominant regulatory instrument consists of rules and
checking that those rules are followed. But forcing operational people to
stick to rules can lead to ineffective, unproductive or even unsafe local actions.
For various jobs, following the rules and getting the task done are mu-
tually exclusive. On the other hand, letting people adapt their local practice
in the face of pragmatic demands can make them sacrifice global
system goals or miss other constraints or vulnerabilities that operate on the
system. Helping people solve this fundamental trade-off is not a matter of
pushing the criterion one way or the other. Discouraging people's attempts
at adaptation can increase the number of failures to adapt in situations
where adaptation was necessary. Allowing procedural leeway without encouraging
organizations to invest in people's skills at adapting, on the other
hand, can increase the number of failed attempts at adaptation.
This means that the gap between rule and task, between written procedure
and actual job, needs to be bridged by the regulator as much as by the
operator. Inspectors who work for regulators need to apply rules as well:
find out what exactly the rules mean and what their implications are when
imposed on a field of practice. The development from universalism to
particularism applies to regulators too. This raises questions about the role
that inspectors should play. Should they function as police—checking to
what extent the market is abiding by the laws they are supposed to uphold?
In that case, should they apply a black-and-white judgment (which would
ground a number of companies immediately)? Or, if there is a gap between
procedure and practice that inspectors and operators share and both need
to bridge, can inspectors be partners in joint efforts toward progress on
safety? The latter role is one that can only develop in good faith, though
such good faith may be the very by-product of the development of a new
kind of relationship, or partnership, towards progress on safety. Mismatches
between rules and practice are no longer seen as the logical conclusion
of an inspection, but rather as the starting point, the beginning of
joint discoveries about real practice and the context in which it occurs.
What are the systemic reasons (organizational, regulatory, resource related)
that help create and sustain the mismatch?
The basic criticism of an inspector's role as partner is easy to anticipate:
Regulators should not come too close to the ones they regulate, lest their relationship
become too cozy and objective judgment of performance against
safety criteria become impossible. But regulators need to come close to those
they regulate in any case. Regulators (or their inspectors) need to be insiders
in the sense of speaking the language of the organization they inspect, understanding
the kind of business they are in, in order to gain the respect and
credibility of the informants they need most. At the same time, regulators
need to be outsiders—resisting getting integrated into the worldview of the
one they regulate. Once on the inside of that system and its worldview, it may
be increasingly difficult to discover the potential drift into failure. What is
normal to the operator is normal to the inspector.
The tension between having to be an insider and an outsider at the same
time is difficult to resolve. The conflictual, adversarial model of safety regu-
lation has in many cases not proven productive. It leads to window dressing
and posturing on the part of the operator during inspections, and secrecy
and obfuscation of safety- and work-related information at all other times.
As airline maintenance testifies, real practice is easily driven underground.
Even for regulators who apply their power as police rather than as partner,
the struggle of having to be insider and outsider at the same time is not automatically
resolved. Issues of access to information (the relevant information
about how people really do their work, even when the inspector is not
there) and inspector credibility, demand that there be a relationship between
regulator and operator that allows such access and credibility to develop.
Organizations (including regulators) who wish to make progress on
safety with procedures need to:
• Monitor the gap between procedure and practice and try to understand
why it exists (and resist trying to close it simply by telling people to
comply).
• Help people develop skills to judge when and how to adapt (and resist
only telling people they should follow procedures).
But many organizations or industries do neither. They may not even know,
or want to know (or be able to afford to know) about the gap. Take aircraft
maintenance again. A variety of workplace factors (communication problems,
physical or hierarchical distance, industrial relations) obscure the
gap. For example, continued safe outcomes of existing practice give supervisors
no reason to question their assumptions about how work is done (if
they are safe they must be following procedures down there). There is
wider industry ignorance, however (McDonald et al., 2002). In the wake of
failure, informal work systems typically retreat from view, gliding out of investigators'
reach. What goes misunderstood, or unnoticed, is that informal
work systems compensate for the organization's inability to provide the basic
resources (e.g., time, tools, documentation with a close relationship to
action) needed for task performance. Satisfied that violators got caught and
that formal prescriptions of work were once again amplified, the organizational
system changes little or nothing. It completes another cycle of stability,
typified by a stagnation of organizational learning and no progress on
safety (McDonald et al.).
GOAL CONFLICTS AND PROCEDURAL DEVIANCE
As discussed in chapter 2, a major engine behind routine divergence from
written guidance is the need to pursue multiple goals simultaneously. Multiple
goals mean goal conflicts. As Dorner (1989) remarked, "Contradictory
goals are the rule, not the exception, in complex situations" (p. 65). In a
study of flight dispatchers, for example, Smith (2001) illustrated the basic
dilemma. Would bad weather hit a major hub airport or not? What should
the dispatchers do with all the airplanes en route? Safety (by making aircraft
divert widely around the weather) would be a pursuit that "tolerates a
false alarm but deplores a miss" (p. 361). In other words, if safety is the major
goal, then making all the airplanes divert even if the weather would not
end up at the hub (a false alarm) is much better than not making them divert
and sending them headlong into bad weather (a miss). Efficiency, on
the other hand, severely discourages the false alarm, whereas it can actually
deal with a miss.
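The dispatcher's dilemma can be read as an asymmetric cost structure. The Python sketch below uses invented figures purely to show the shape of the trade-off: under a safety criterion a miss costs far more than a false alarm, under an efficiency criterion the ordering flips, and the same weather forecast then recommends different decisions.

# Invented, illustrative costs for the divert decision, keyed by the decision taken
# and by what the weather actually does at the hub.
COSTS = {
    "safety": {("divert", "weather hits"): 1, ("divert", "no weather"): 2,
               ("hold course", "weather hits"): 100, ("hold course", "no weather"): 0},
    "efficiency": {("divert", "weather hits"): 5, ("divert", "no weather"): 50,
                   ("hold course", "weather hits"): 10, ("hold course", "no weather"): 0},
}

def best_decision(goal, p_weather):
    # Pick the decision with the lowest expected cost under the given goal.
    def expected(decision):
        return (p_weather * COSTS[goal][(decision, "weather hits")]
                + (1 - p_weather) * COSTS[goal][(decision, "no weather")])
    return min(["divert", "hold course"], key=expected)

for goal in ("safety", "efficiency"):
    print(goal, "->", best_decision(goal, p_weather=0.3))

# With these numbers, the safety criterion diverts the traffic (tolerating a false
# alarm), while the efficiency criterion holds course (tolerating the risk of a miss).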
As discussed in chapter 2, this is the essence of most operational systems.
Though safety is a (stated) priority, these systems do not exist to be safe.
They exist to provide a service or product, to achieve economic gain, to
maximize capacity utilization. But still they have to be safe. One starting
point, then, for understanding a driver behind routine deviations, is to look
deeper into these goal interactions, these basic incompatibilities in what
people need to strive for in their work. Of particular interest is how people
themselves view these conflicts from inside their operational reality, and
how this contrasts with management (and regulator) views of the same activities.
NASA's "Faster, Better, Cheaper" organizational philosophy in the late
1990s epitomized how multiple, contradictory goals are simultaneously
present and active in complex systems. The losses of the Mars Climate Orbiter
and the Mars Polar Lander in 1999 were ascribed in large part to the irreconcilability
of the three goals (faster and better and cheaper), which drove
down the cost of launches, made for shorter, aggressive mission schedules,
eroded personnel skills and peer interaction, limited time, reduced the
workforce, and lowered the level of checks and balances normally found
(National Aeronautics and Space Administration, 2000). People argued
that NASA should pick any two from the three goals. Faster and cheaper
would not mean better. Better and cheaper would mean slower. Faster and
better would be more expensive. Such reduction, however, obscures the actual
reality facing operational personnel in safety-critical settings. These
people are there to pursue all three goals simultaneously—fine-tuning
their operation, as Starbuck and Milliken (1988) said, to "render it less redundant,
more efficient, more profitable, cheaper, or more versatile" (p.
323), fine-tuning, in other words, to make it faster, better, cheaper.
The 2003 Space Shuttle Columbia accident focused attention on the
maintenance work that was done on the Shuttle's external fuel tank, once
again revealing the differential pressures of having to be safe and getting
the job done (better, but also faster and cheaper). A mechanic working for
the contractor, whose task it was to apply the insulating foam to the exter-
nal fuel tank, testified that it took just a couple of weeks to learn how to get
the job done, thereby pleasing upper management and meeting production
schedules. An older worker soon showed him how he could mix the
base chemicals of the foam in a cup and brush it over scratches and gouges
in the insulation, without reporting the repair. The mechanic soon found
himself doing this hundreds of times, each time without filling out the required
paperwork. Scratches and gouges that were brushed over with the
mixture from the cup basically did not exist as far as the organization was
concerned. And those that did not exist could not hold up the production
schedule for the external fuel tanks. Inspectors often did not check. A company
program that once had paid workers hundreds of dollars for finding
defects had been watered down, virtually inverted by incentives for getting
the job done now.
Goal interactions are critical in such experiences, which contain all the ingredients
of procedural fluidity, maintenance pressure, the meaning of incidents
worth reporting, and their connections to drift into failure. As in most
operational work, the distance between formal, externally dictated logics of
action and actual work is bridged with the help of those who have been there
before, who have learned how to get the job done (without apparent safety
consequences), and who are proud to share their professional experience
with younger, newer workers. Actual practice by newcomers settles at a distance
from the formal description of the job. Deviance becomes routinized.
This is part of the vast informal networks characterizing much maintenance
work, including informal hierarchies of teachers and apprentices, informal
documentation of how to actually get work done, informal procedures and
tasks, and informal teaching practices. Inspectors did not check, did not
know, or did not report. Managers were happy that production schedules
were met and happy that fewer defects were being discovered—normal people
doing normal work in a normal organization. Or that is what it seemed to
everybody at the time. Once again, the notion of an incident, of something
that was worthy of reporting (a defect) got blurred against a background of
routine nonconformity. What was normal versus what was deviant was no
longer so clear. Goal conflicts between safer, better, and cheaper were reconciled
by doing the work more cheaply, superficially better (brushing over
gouges), and apparently without cost to safety. As long as orbiters kept coming
back safely, the contractor must have been doing something right. Understanding
the potential side effects was very difficult given the historical
mission success rate. A lack of failures was seen as a validation that current
strategies to prevent hazards were sufficient. Could anyone foresee, in a vastly
complex system, how local actions as trivial as brushing chemicals from a cup
could one day align with other factors to push the system over the edge? Recall
from chapter 2: What cannot be believed cannot be seen. Past success
was taken as guarantee of continued safety.
The Internalization of External Pressure
Some organizations pass on their goal conflicts to individual practitioners
quite openly. Some airlines, for example, pay their crews a bonus for on-time
performance. An aviation publication commented on one of those operators
(a new airline called Excel, flying from England to holiday destinations):
"As part of its punctuality drive, Excel has introduced a bonus
scheme to give employees a bonus should they reach the agreed target for
the year. The aim of this is to focus everyone's attention on keeping the aircraft
on schedule" (Airliner World, 2001, p. 79). Such plain acknowledgment
of goal priorities, however, is not common. Most important goal conflicts
are never made so explicit, arising rather from multiple irreconcilable
directives from different levels and sources, from subtle and tacit pressures,
from management or customer reactions to particular trade-offs. Organizations
often resort to "conceptual integration, or plainly put, doublespeak"
(Dorner, 1989, p. 68). For example, the operating manual of another airline
opens by stating that "(1) our flights shall be safe; (2) our flights shall
be punctual; (3) our customers will find value for money." Conceptually,
this is Dorner's (1989) doublespeak, documentary integration of incompatibles.
It is impossible, in principle, to do all three simultaneously, as with
NASA's faster, better, cheaper. Whereas incompatible goals arise at the
level of an organization and its interaction with its environment, the actual
managing of goal conflicts under uncertainty gets pushed down into local
operating units—control rooms, cockpits, and the like. There the conflicts
are to be negotiated and resolved in the form of thousands of little and
larger daily decisions and trade-offs. These are no longer decisions and
trade-offs made by the organization, but by individual operators or crews. It
is this insidious delegation, this hand-over, where the internalization of external
pressure takes place. Crews of one airline describe their ability to negotiate
these multiple goals while under the pressure of limited resources
as "the blue feeling" (referring to the dominant color of their fleet). This
feeling represents the willingness and ability to put in the work to actually
deliver on all three goals simultaneously (safety, punctuality, and value for
money). This would confirm that practitioners do pursue incompatible
goals of faster, better, and cheaper all at the same time and are aware of it
too. In fact, practitioners take their ability to reconcile the irreconcilable as
a source of considerable professional pride. It is seen as a strong sign of
their expertise and competence.
The internalization of external pressure, this usurpation of organizational
goal conflicts by individual crews or operators, is not well described
or modeled yet. This, again, is a question about the dynamics of the
macro-micro connection that we saw in chapter 2. How is it that a global
tension between efficiency and safety seeps into local decisions and trade-
offs by individual people or groups? These macrostructural forces, which
operate on an entire company, find their most prominent expression in
how local work groups make assessments about opportunities and risks
(see also Vaughan, 1996). Institutional pressures are reproduced, or perhaps
really manifested, in what individual people do, not by the organization
as a whole. But how does this connection work? Where do external
pressures become internal? When do the problems and interests of an organization
under pressure of resource scarcity and competition become
the problems and interests of individual actors at several levels within that
organization?
The connection between external pressure and its internalization is relatively
easy to demonstrate when an organization explicitly advertises how
operators' pursuit of one goal will lead to individual rewards (a bonus
scheme to keep everybody focused on the priority of schedule). But such
cases are probably rare, and it is doubtful whether they represent actual internalization
of a goal conflict. It becomes more difficult when the connection
and the conflicts are more deeply buried in how operators transpose
global organizational aims onto individual decisions. For example, the blue
feeling signals aircrews' strong identification with their organization
(which flies blue aircraft) and what it and its brand stand for (safety, reliability,
value for money). Yet it is a feeling that only individuals or crews can
have, a feeling precisely because it is internalized. Insiders point out how some crews
or commanders have the blue feeling whereas others do not. It is a personal
attribute, not an organizational property. Those who do not have the blue
feeling are marked by their peers—seldom supervisors—for their insensitivity
to, or disinterest in, the multiplicity of goals, for their unwillingness to
do substantive cognitive work necessary to reconcile the irreconcilable.
These practitioners do not reflect the corps' professional pride because
they will always make the easiest goal win over the others (e.g., "Don't worry
about customer service or capacity utilization, it's not my job"), choosing
the path of least resistance and least work in the eyes of their peers. In the
same airline, those who try to adhere to minute rules and regulations are
called "Operating Manual worshippers"—a clear signal that their way of
dealing with goal contradictions is not only perceived as cognitively cheap
(just go back to the book, it will tell you what to do), but as hampering the
collective ability to actually get the job done, diluting the blue feeling. The
blue feeling, then, is also not just a personal attribute, but an interpeer
commodity that affords comparisons, categorizations, and competition
among members of the peer group, independent of other layers or levels in
the organization. Similar interpeer pride and perception operate as a subtle
engine behind the negotiation among different goals in other professions
too, for example flight dispatchers, air-traffic controllers, or aircraft maintenance
workers (McDonald et al., 2002).
The latter group (aircraft maintenance) has incorporated even more internal
mechanisms to deal with goal interactions. The demand to meet
technical requirements clashes routinely with resource constraints
such as inadequate time, personnel, tools, parts, or a functional work
environment (McDonald et al., 2002). The vast internal, sub-surface networks
of routines, illegal documentation, and shortcuts, which from the
outside would be seen as massive infringement of existing procedures, are a
result of the pressure to reconcile and compromise. Actual work practices
constitute the basis for technicians' strong professional pride and sense of
responsibility for delivering safe work that exceeds even technical requirements.
Seen from the inside, it is the role of the technician to apply judgment
founded on his or her knowledge, experience, and skill—not on formal
procedure. Those most adept at this are highly valued for their
productive capacity even by higher organizational levels. Yet upon formal
scrutiny (e.g., an accident inquiry), informal networks and practices often
retreat from view, yielding only a bare-bones version of work in which the
nature of goal compromises and informal activities is never explicit, acknowledged,
understood, or valued. Similar to the British Army on the
Somme, management in some maintenance organizations occasionally decides
(or pretends) that there is no local confusion, that there are no contradictions
or surprises. In their official understanding, there are rules and
people who follow the rules, and safe outcomes as a result. People who do
not follow the rules are more prone to causing accidents, as the hindsight
bias inevitably points out. To people on the work floor, in contrast, management
does not even understand the fluctuating pressures on their work, let
alone the strategies necessary to accommodate those (McDonald et al.).
Both cases (the blue feeling and maintenance work) challenge human
factors' traditional reading of violations as deviant behavior. Human factors
wants work to mirror prescriptive task analyses or rules, and violations
breach vertical control implemented through such managerial or design
directives. Seen from the inside of people's own work, however, violations
become compliant behavior. Cultural understandings (e.g., expressed in
notions of a blue feeling) affect interpretative work, so that even if people's
behavior is objectively deviant, they will see their own conduct as conforming
(Vaughan, 1999). Their behavior is compliant with the emerging, local,
internalized ways to accommodate multiple goals important to the organization
(maximizing capacity utilization but doing so safely, meeting technical
requirements, but also deadlines). It is compliant, also, with a complex
of peer pressures and professional expectations in which unofficial action
yields better, quicker ways to do the job; in which unofficial action is a sign
of competence and expertise; where unofficial action can override or outsmart
hierarchical control and compensate for higher level organizational
deficiencies or ignorance.
ROUTINE NONCONFORMITY
The gap between procedures and practice is not constant. After the creation
of new work (e.g., through the introduction of new technology), time
can go by before applied practice stabilizes, likely at a distance from the
rules as written for the system on the shelf. Social science has characterized
this migration from tightly coupled rules to more loosely coupled practice
variously as "fine-tuning" (Starbuck & Milliken, 1988) or "practical drift"
(Snook, 2000). Through this shift, applied practice becomes the pragmatic
imperative; it settles into a system as normative. Deviance (from the original
rules) becomes normalized; nonconformity becomes routine (Vaughan,
1996). The literature has identified important ingredients in the normalization
of deviance, which can help organizations understand the nature of
the gap between procedures and practice:
• Rules that are overdesigned (written for tightly coupled situations, for
the worst case) do not match actual work most of the time. In real work,
there is slack: time to recover, opportunity to reschedule and get the job
done better or more smartly (Starbuck & Milliken). This mismatch creates
an inherently unstable situation that generates pressure for change
(Snook).
• Emphasis on local efficiency or cost effectiveness pushes operational
people to achieve or prioritize one goal or a limited set of goals (e.g., customer
service, punctuality, capacity utilization). Such goals are typically easily
measurable (e.g., customer satisfaction, on-time performance), whereas
it is much more difficult to measure how much is borrowed from safety.
• Past success is taken as guarantee of future safety. Each operational success
achieved at incremental distances from the formal, original rules can establish
a new norm. From here a subsequent departure is once again only a
small incremental step (Vaughan). From the outside, such fine-tuning constitutes
incremental experimentation in uncontrolled settings (Starbuck &
Milliken)—on the inside, incremental nonconformity is an adaptive response
to scarce resources, multiple goals, and often competition.
• Departures from the routine become routine. Seen from the inside of
people's own work, violations become compliant behavior. They are compliant
with the emerging, local ways to accommodate multiple goals important
to the organization (maximizing capacity utilization but doing so
safely; meeting technical requirements, but also deadlines). They are compliant,
also, with a complex of peer pressures and professional expectations
in which unofficial action yields better, quicker ways to do the job; in which
unofficial action is a sign of competence and expertise; where unofficial action
can override or outsmart hierarchical control and compensate for
higher level organizational deficiencies or ignorance.
Although a gap between procedures and practice always exists, there are
different interpretations of what this gap means and what to do about it. As
pointed out in chapter 6, human factors may see the gap between procedures
and practice as a sign of complacency—operators' self-satisfaction
with how safe their practice or their system is, or a lack of discipline. Psychologists
may see routine nonconformity as expressing a fundamental tension
between multiple goals (production and safety) that pull workers in opposite
directions: getting the job done but also staying safe. Others highlight
the disconnect that exists between distant supervision or preparation of the
work (as laid down in formal rules) on the one hand, and local, situated action
on the other. Sociologists may see in the gap a political lever applied
on management by the work floor, overriding or outsmarting hierarchical
control and compensating for higher level organizational deficiencies or ignorance.
To the ethnographer, routine nonconformity would be interesting
not just because of what it says about the work or the work context, but
because of what it says about what the work means to the operator.
The distance between procedures and practice can create widely divergent
images of work. Is routine nonconformity an expression of elitist operators
who consider themselves to be above the law, of people who demonstrate
a willingness to ignore the rules? Work in that case is about individual
choices, supposedly informed choices between doing that work well or
badly, between following the rules or not. Or is routine nonconformity a
systematic by-product of the social organization of work, where it emerges
from the interactions between organizational environment (scarcity and
competition), internalized pressures, and the underspecified nature of
written guidance? In that case, work is seen as fundamentally contextualized,
constrained by environmental uncertainty and organizational characteristics,
and influenced only to a small extent by individual choice. People's
ability to balance these various pressures and influences on procedure
following depends in large part on their history and experience. And, as
Wright and McCarthy (2003) pointed out, there are currently very few ways
in which this experience can be given a legitimate voice in the design of
procedures.
As chapter 8 shows, a more common way of responding to what is seen as
human unreliability is to introduce more automation. Automation has no
trouble following algorithms. In fact, it could not run without any. Yet such
literalism can be a mixed blessing.
Chapter 8
Can We Automate Human
Error Out of the System?
If people cannot be counted on to follow procedures, should we not simply
marginalize human work? Can automation get rid of human unreliability
and error? Automation extends our capabilities in many, if not all, transportation
modes. In fact, automation is often presented and implemented
precisely because it helps systems and people perform better. It may even
make operational lives easier: reducing task load, increasing access to information,
helping the prioritization of attention, providing reminders, doing
work for us where we cannot. What about reducing human error? Many indeed
have the expectation that automation will help reduce human error.
Just look at some of the evidence: All kinds of transport achieve higher navigation
accuracy with satellite guidance; pilots are now able to circumvent
pitfalls such as thunderstorms, windshear, mountains, and collisions with
other aircraft; and situation awareness improves dramatically with the introduction
of moving map displays.
So with these benefits, can we automate human error out of the system?
The thought behind the question is simple. If we automate part of a task,
then the human does not carry out that part. And if the human does not
carry out that part, there is no possibility of human error. As a result of this
logic, there was a time (and in some quarters there perhaps still is) that automating
everything we technically could was considered the best idea. The
Air Transport Association of America (ATA) observed, for example, that
"during the 1970's and early 1980's . . . the concept of automating as much
as possible was considered appropriate" (ATA, 1989, p. 4). It would lead to
greater safety, greater capabilities, and other benefits.
NEW CAPABILITIES, NEW COMPLEXITIES
But, really, can we automate human error out of the system? There are
problems. With new capabilities come new complexities. We cannot just automate
part of a task and assume that the human-machine relationship remains
unchanged. Though it may have shifted (with the human doing less
and the machine doing more), there is still an interface between humans
and technology. And the work that goes on at that interface has likely
changed drastically. Increasing automation transforms hands-on operators
into supervisory controllers, into managers of a suite of automated and
other human resources. With their new work come new vulnerabilities, new
error opportunities. With new interfaces (from pointers to pictures, from
single parameter gauges to computer displays) come new pathways to human-
machine coordination breakdown. Transportation has witnessed the
transformation of work by automation first-hand, and documented its consequences
widely. Automation does not do away with what we typically call
human error, just as (or precisely because) it does not do away with human
work. There is still work to do for people. It is not that the same kinds of errors
occur in automated systems as in manual systems. Automation changes
the expression of expertise and error; it changes how people can perform
well and changes how their performance breaks down, if and when it does.
Automation also changes opportunities for error recovery (often not for
the better) and in many cases delays the visible consequences of errors.
New forms of coordination breakdowns and accidents have emerged as a
result.
Data Overload
Automation does not replace human work. Instead, it changes the work it is
designed to support. And with these changes come new burdens. Take system
monitoring, for example. There are concerns that automation can create
data overload. Rather than taking away cognitive burdens from people,
automation introduces new ones, creating new types of monitoring and
memory tasks. Because automation does so much, it also can show much
(and indeed, there is much to show). If there is much to show, data overload
can occur, especially in pressurized, high-workload, or unusual situations.
Our ability to make sense of all the data generated by automation
has not kept pace with systems' ability to collect, transmit, transform, and
present data.
But data overload is a pretty complex phenomenon, and there are different
ways of looking at it (see Woods, Patterson, & Roth, 2002). For example,
we can see it as a workload bottleneck problem. When people experience
data overload, it is because of fundamental limits in their internal
information-processing capabilities. If this is the characterization, then the
solution lies in even more automation. More automation, after all, will take
work away from people. And taking work away will reduce workload.
One area where the workload-reduction solution to the data-overload
problem has been applied is in the design of warning systems. It is there
that fears of data overload are often most prominent. Incidents in aviation
and other transportation modes keep stressing the need for better support
of human problem solving during dynamic fault scenarios. People complain
of too much data, of illogical presentations, of warnings that interfere
with other work, of a lack of order, and of no rhyme or reason to the way in
which warnings are presented. Workload reduction during dynamic fault
management is so important because problem solvers in dynamic domains
need to diagnose malfunctions while maintaining process integrity. Not
only must failures be managed while keeping the process running (e.g.,
keeping the aircraft flying); their implications for the ability to keep the
process running in the first place need to be understood and acted on.
Keeping the process intact and diagnosing failures are interwoven cognitive
demands in which timely understanding and intervention are often crucial.
A fault in a dynamic process typically produces a cascade of disturbances
or failures. Modern airliners and high-speed vessels have their systems
tightly packed together because there is not much room onboard.
Systems are also cross-linked in many intricate ways, with electronic interconnections
increasingly common as a result of automation and computerization.
This means that failures in one system quickly affect other systems,
perhaps even along nonfunctional propagation paths. Failure crossover
can occur simply because systems are located next to one another, not because
they have anything functional in common. This may defy operator
logic or knowledge. The status of single components or systems, then, may
not be that interesting for an operator. In fact, it may be highly confusing.
Rather, the operator must see, through a forest of seemingly disconnected
failures, the structure of the problem so that a solution or countermeasure
becomes evident. Also, given the dynamic process managed, which issue
should be addressed first? What are the postconditions of these failures for
the remainder of operations (i.e., what is still operational, how far can I go,
what do I need to reconfigure)? Is there any trend? Are there noteworthy
events and changes in the monitored process right now? Will any of this get
worse? These are the types of questions that are critical to answer in successful
dynamic fault management.
Current warning systems in commercial aircraft do not go far in answering
these questions, something that is confirmed by pilots' assessments of
these systems. For example, pilots comment on too much data, particularly
all kinds of secondary and tertiary failures, with no logical order, and primary
faults (root causes) that are rarely, if ever, highlighted. The representation
is limited to message lists, something that we know hampers operators'
visualization of the state of their system during dynamic failure
scenarios. Yet not all warning systems are the same. Current warning systems
show a range of automated support, from not doing much at all,
through prioritizing and sorting warnings, to doing something about the
failures, to doing most of the fault management and not showing much at
all anymore. Which works best? Is there any merit to seeing data overload as
a workload bottleneck problem, and do automated solutions help?
An example of a warning system that basically shows everything that goes
wrong inside an aircraft's systems, much in order of appearance, is that of
the Boeing 767. Messages are presented chronologically (which may mean
the primary fault appears somewhere in the middle or even at the bottom
of the list) and failure severity is coded through color. A warning system
that departs slightly from this baseline is, for example, that of the Saab 2000, which
sorts the warnings by inhibiting messages that do not require pilot actions.
It displays the remaining warnings chronologically. The primary fault (if
known) is placed at the top, however, and if a failure results in an automatic
system reconfiguration, then this is shown too. The result is a shorter list
than the Boeing's, with a primary fault at the top. Next as an example comes
the Airbus A320, which has a fully defined logic for warning-message prioritization.
Only one failure is shown at a time, together with immediate action
items required of the pilot. Subsystem information can be displayed on
demand. Primary faults are thus highlighted, together with guidance on
how to deal with them. Finally, there is the MD-11, which has the highest
degree of autonomy and can respond to failures itself rather than asking the pilot
to do so. The only exceptions are nonreversible actions (e.g., an engine shutdown).
For most failures, the system informs the pilot of system reconfiguration
and presents system status. In addition, the system recognizes combinations
of failures and gives a common name to these higher order failures
(e.g., Dual Generator).
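To make the sorting and inhibition logic concrete, here is a minimal sketch (in Python) of a Saab 2000-style prioritization scheme as described above; the message fields, the inhibition rule, and the promotion of the primary fault are illustrative assumptions, not the actual warning logic of any of these aircraft.

    # Illustrative sketch only: inhibit messages that require no pilot action,
    # keep the remainder in chronological order, and promote the primary fault
    # (if the system has identified one) to the top of the list.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class WarningMessage:
        text: str
        timestamp: float          # seconds since the fault cascade began
        requires_action: bool     # does the pilot need to do anything about it?
        is_primary: bool = False  # root cause, if known

    def prioritize(messages: List[WarningMessage]) -> List[WarningMessage]:
        actionable = [m for m in messages if m.requires_action]  # inhibition
        actionable.sort(key=lambda m: m.timestamp)                # chronological order
        primary = [m for m in actionable if m.is_primary]
        secondary = [m for m in actionable if not m.is_primary]
        return primary + secondary                                # primary fault on top

In these terms, a Boeing 767-style presentation would correspond to returning every message in timestamp order, whereas the A320 and MD-11 schemes go further: showing only one item at a time with action guidance, or resolving many failures autonomously and naming failure combinations.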
As could be expected, response latency on the Boeing 767-type warning
system is longest (Singer & Dekker, 2000). It takes a while for pilots to sort
through the messages and figure out what to do. Interestingly, they also get
it wrong more often on this type of system. That is, they misdiagnose the
primary failure more often than on any of the other systems. A nonprioritized
list of chronological messages about failures seems to defeat even
the speed-accuracy trade-off: Longer dwell times on the display do not help
people get it right. This is because the production of speed and accuracy
is cognitive: Making sense of what is going wrong inside an aircraft's systems
is a demanding cognitive task, where problem representation has a
profound influence on people's ability to do it successfully (meaning fast
and correct). Modest performance gains (faster responses and fewer misdiagnoses)
can be seen on a system like that of the Saab 2000, but the
Airbus A320 and MD-11 solutions to the workload bottleneck problem really
seem to pay off. Performance benefits really accrue with a system that
sorts through the failures, shows them selectively, and guides the pilot in
what to do next. In our study, pilots were quickest to identify the primary
fault in the failure scenario with such a system, and made no misdiagnoses
in assessing what it was (Singer & Dekker). Similarly, a warning system that
itself contains or counteracts many of the failures and shows mainly what is
left to the pilot seems to help people in quickly identifying the primary
fault.
These results, however, should not be seen as justification for simply automating
more of the failure-management task. Human performance difficulties
associated with high-automation participation in difficult or novel
circumstances are well known, such as brittle procedure following where
operators follow heuristic cues from the automation rather than actively
seeking and dealing with information related to the disturbance chain. Instead,
these results indicate how progress can be made by changing the representational
quality of warning systems altogether, not just by automating
more of the human task portion. If guidance is beneficial, and if knowing
what is left is useful, then the results of this study tell designers of warning
systems to shift to another view of referents (the thing in the process that
the symbol on the display refers to). Warning-system designers would have
to get away from relying on single systems and their status as referents to
show on the display, and move toward referents that fix on higher order
variables that carry more meaning relative to the dynamic fault-management
task. Referents could integrate current status with future predictions,
for example, or cut across single parameters and individual systems to reveal
the structure behind individual failures and show consequences in
terms that are operationally immediately meaningful (e.g., loss of pressure,
loss of thrust).
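As a hypothetical illustration of what such a higher order referent could look like in software terms, the sketch below maps sets of failed components onto operationally meaningful consequences rather than displaying each component's status; the component names and the rules relating them are invented for the example.

    # Illustrative sketch only: a display referent built around operational
    # consequences (what is lost, how far can we go) rather than around the
    # status of single systems.
    def operational_consequences(failed_components: set) -> list:
        consequences = []
        if {"eng1_generator", "eng2_generator"} <= failed_components:
            consequences.append("Loss of main electrical generation")
        if {"hydraulic_green", "hydraulic_yellow"} <= failed_components:
            consequences.append("Degraded flight controls and braking")
        if {"bleed_air_1", "bleed_air_2"} <= failed_components:
            consequences.append("Loss of pressurization")
        return consequences or ["No operational consequence identified"]

    # e.g., operational_consequences({"eng1_generator", "eng2_generator", "tcas"})
    # -> ["Loss of main electrical generation"]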
Another way of looking at data overload is as a clutter problem—there is
simply too much on the display for people to cope with. The solution to
data overload as a clutter problem is to remove stuff from the display. In
warning-system design, for example, this may result in guidelines stipulating
that no more than a certain number of lines may be filled on a warning
screen. Seeing data overload as clutter, however, is completely insensitive to
context. What seems clutter in one situation may be highly valuable, or
even crucial, in another situation. The crash of an Airbus A330 during a test
flight at the factory field in Toulouse, France in 1994 provides a good demonstration
of this (see Billings, 1997). The aircraft was on a certification test
flight to study various pitch-transition control laws and how they worked
during an engine failure at low altitude, in a lightweight aircraft with a rearward
center of gravity (CG). The flight crew included a highly experienced
test pilot, a copilot, a flight-test engineer, and three passengers. Given the
light weight and rearward CG, the aircraft got off the runway quickly and
easily and climbed rapidly, with a pitch angle of almost 25° nose-up. The autopilot
was engaged 6 seconds after takeoff. Immediately after a short
climb, the left engine was brought to idle power and one hydraulic system
was shut down in preparation for the flight test. Now the autopilot had to simultaneously
manage a very low speed, an extremely high angle of attack,
and asymmetrical engine thrust. After the captain disconnected the autopilot
(this was only 19 seconds after takeoff) and reduced power on the right
engine to regain control of the aircraft, even more airspeed was lost. The
aircraft stalled, lost altitude rapidly, and crashed 36 seconds after takeoff.
When the airplane reached a 25° pitch angle, autopilot and flight-director
mode information was automatically removed from the primary
flight display in front of the pilots. This is a sort of declutter mode. It was
found that, because of the high rate of ascent, the autopilot had gone into
altitude-acquisition mode (called ALT* in the Airbus) shortly after takeoff.
In this mode there is no maximum pitch protection in the autoflight system
software (the nose can go as high as the autopilot commands it to go, until
the laws of aerodynamics intervene). In this case, at low speed, the autopilot
was still trying to acquire the altitude commanded (2,000 feet), pitching up
to it, and sacrificing airspeed in the process. But ALT* was not shown to the
pilots because of the declutter function. So the lack of pitch protection was
not announced, and may not have been known to them. Declutter has not
been a fruitful or successful way of trying to solve data overload (see Woods
et al., 2002), precisely because of the context problem. Reducing data elements
on one display calls for that knowledge to be represented or retrieved
elsewhere (people may need to pull it from memory instead), lest it
be altogether unavailable.
Merely seeing data overload as a workload or clutter problem is based on
false assumptions about how human perception and cognition work. Questions
about maximum human data-processing rates are misguided because
this maximum, if there is one at all, is highly dependent on many factors, including
people's experience, goals, history, and directed attention. As alluded
to earlier in the book, people are not passive recipients of observed
data; they are active participants in the intertwined processes of observation,
action, and sense making. People employ all kinds of strategies to help
manage data, and impose meaning on it. For example, they redistribute
cognitive work (to other people, to artifacts in the world), they rerepresent
problems themselves so that solutions or countermeasures become more
obvious. Clutter and workload characterizations treat data as a unitary input
phenomenon, but people are not interested in data, they are interested
in meaning. And what is meaningful in one situation may not be meaningful
in the next. Declutter functions are context insensitive, as are workload-reduction
measures. What is interesting, or meaningful, depends on context.
This makes designing a warning or display system highly challenging.
How can a designer know what the interesting, meaningful or relevant
pieces of data will be in a particular context? This takes a deep understanding
of the work as it is done, and especially as it will be done once the new
technology has been implemented. Recent advances in cognitive work analysis
(Vicente, 1999) and cognitive task design (Hollnagel, 2003) presented
ways forward, and more is said about such envisioning of future work toward
the end of this chapter.
Adapting to Automation, Adapting the Automation
In addition to knowing what (automated) systems are doing, humans are
also required to provide the automation with data about the world. They
need to input things. In fact, one role for people in automated systems is to
bridge the context gap. Computers are dumb and dutiful: They will do what
they are programmed to do, but their access to context, to a wider environment,
is limited—limited, in fact, to what has been predesigned or preprogrammed
into them. They are literalist in how they work. This means that
people have to jump in to fill a gap: They have to bridge the gulf between
what the automation knows (or can know) and what really is happening or
relevant out there in the world. The automation, for example, will calculate
an optimal descent profile in order to save as much fuel as possible. But the
resulting descent may be too steep for crew (and passenger) taste, so pilots
program in an extra tailwind, tricking the computers into descending earlier
and, eventually, more shallowly (because the tailwind is fictitious). The automation
does not know about this context (preference for certain descent
rates over others), so the human has to bridge the gap. Such tailoring of
tools is a very human thing to do: People will shape tools to fit the exact task
they must fulfill. But tailoring is not risk- or problem-free. It can create additional
memory burdens, impose cognitive load when people cannot afford
it, and open up new error opportunities and pathways to coordination
breakdowns between human and machine.
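A back-of-the-envelope sketch shows why the fictitious tailwind has the effect pilots are after. The simple descent model and all numbers below are illustrative assumptions, not any particular flight management system's logic.

    # Illustrative sketch only: if the computer plans an idle descent at a roughly
    # constant vertical speed and true airspeed, the track miles it needs depend on
    # the groundspeed it predicts. A fictitious tailwind raises that predicted
    # groundspeed, so the top of descent moves farther from the airport (an earlier
    # descent); when the wind fails to materialize, the same altitude is lost over
    # more track miles, i.e., a shallower descent relative to the ground.
    def top_of_descent_nm(altitude_to_lose_ft: float, descent_rate_fpm: float,
                          true_airspeed_kt: float, assumed_tailwind_kt: float) -> float:
        descent_time_h = (altitude_to_lose_ft / descent_rate_fpm) / 60.0
        predicted_groundspeed_kt = true_airspeed_kt + assumed_tailwind_kt
        return predicted_groundspeed_kt * descent_time_h  # NM before the airport

    no_trick = top_of_descent_nm(30000, 1800, 280, 0)     # about 78 NM out
    with_trick = top_of_descent_nm(30000, 1800, 280, 40)  # about 89 NM out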
Automation changes the task for which it was designed. Automation,
though introducing new capabilities, can increase task demands and create
new complexities. Many of these effects are in fact unintended by the designers.
Also, many of these side effects remain buried in actual practice
and are hardly visible to those who only look for the successes of new machinery.
Operators who are responsible for (safe) outcomes of their work
are known to adapt technology so that it fits their actual task demands. Operators
are known to tailor their working strategies so as to insulate themselves
from the potential hazards associated with using the technology. This
means that the real effects of technology change can remain hidden beneath
a smooth layer of adaptive performance. Operational people will
make it work, no matter how recalcitrant or ill suited to the domain the automation,
and its operating procedures, really may be. Of course, the occasional
breakthroughs in the form of surprising accidents provide a window
onto the real nature of automation and its operational consequences. But
such potential lessons quickly glide out of view under the pressure of the
fundamental surprise fallacy.
Apparently successful adaptation by people in automated systems, though
adaptation in unanticipated ways, can be seen elsewhere in how pilots deal
with automated cockpits. One important issue on high-tech flight decks is
knowing what mode the automation is in (this goes for other applications
such as ship's bridges too: Recall the Royal Majesty from chap. 5). Mode confusion
can lie at the root of automation surprises, with people thinking that
they told the automation to do one thing whereas it was actually doing another.
How do pilots keep track of modes in an automated cockpit? The formal
instrument for tracking and checking mode changes and status is the
FMA, or flight-mode annunciator, a small strip that displays contractions or
abbreviations of modes (e.g., Heading Select mode is shown as HDG or HDG
SEL) in various colors, depending on whether the mode is armed (i.e., about
to become engaged) or engaged. Most airline procedures require pilots to
call out the mode changes they see on the FMA.
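As a minimal sketch of the annunciation convention just described, the snippet below pairs a mode abbreviation with a color that depends on whether the mode is armed or engaged; the abbreviations and color choices are assumptions for illustration, not the convention of any particular aircraft.

    # Illustrative sketch only: a flight-mode annunciation as an abbreviation plus
    # a color keyed to armed (about to become engaged) versus engaged status.
    from enum import Enum

    class ModeState(Enum):
        ARMED = "armed"
        ENGAGED = "engaged"

    ABBREVIATIONS = {"Heading Select": "HDG SEL", "Altitude Acquire": "ALT*"}

    def annunciate(mode: str, state: ModeState) -> str:
        color = "cyan" if state is ModeState.ARMED else "green"  # assumed colors
        return f"{ABBREVIATIONS.get(mode, mode)} [{color}]"

    # e.g., annunciate("Heading Select", ModeState.ENGAGED) -> "HDG SEL [green]"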
One study monitored flight crews during a dozen return flights between
Amsterdam and London on a full flight simulator (Bjorklund, Alfredsson,
& Dekker, 2003). Where both pilots were looking, and for how long, was measured
with EPOG (eye-point-of-gaze) equipment, which uses techniques ranging from
laser beams to the measurement and calibration of saccades (eye jumps) to track
the exact focal point of a pilot's eyes in a defined visual field (see Fig. 8.1).
FIG. 8.1. Example of pilot EPOG (eye-point-of-gaze) fixations on a primary
flight display (PFD) and map display in an automated cockpit. The top part
of the PFD is the flight-mode annunciator (FMA; Bjorklund et al., 2003).
Pilots do not look at the FMA much at all. And they talk even less about
it. Very few call-outs are made the way they should be (according to the procedures).
Yet this does not seem to have an effect on automation-mode
awareness, nor on the airplane's flight path. Without looking or talking,
most pilots apparently still know what is going on inside the automation. In
this one study, 521 mode changes occurred during the 12 flights. About
60% of these were pilot induced (i.e., because of the pilot changing a setting
in the automation); the rest were automation induced. Two out of five
mode changes were never visually verified (meaning neither pilot looked at
their FMA during 40% of all mode changes). The pilot flying checked a little
less than the pilot not flying, which could be a natural reflection of the
role division: Pilots who are flying the aircraft have other sources of flight-related
data they need to look at, whereas the pilot not flying can oversee
the entire process, thereby engaging more often in checks of what the automation
modes are. There are also differences between captains and first officers
(even after you correct for pilot-flying vs. pilot-not-flying
roles). Captains visually verified the transitions in 72% of the cases, versus
47% for first officers. This may mirror the ultimate responsibility that captains
have for safety of flight, yet there was no expectation that this would
translate into such concrete differences in automation monitoring.
Amount of experience on automated aircraft types was ruled out as being
responsible for the difference.
Of 512 mode changes, 146 were called out. If that does not seem like
much, consider this: Only 32 mode changes (that is about 6%) were called
out after the pilot looked at the FMA. The remaining call-outs came either
before looking at the FMA, or without looking at the FMA at all. Such a disconnect
between seeing and saying suggests that there are other cues that
pilots use to establish what the automation is doing. The FMA does not
serve as a major trigger for getting pilots to call out modes. Two out of five
mode transitions on the FMA are never even seen by entire flight crews. In
contrast to instrument monitoring in nonglass-cockpit aircraft, monitoring
for mode transitions is based more on a pilot's mental model of the automation
(which drives expectations of where and when to look) and an understanding
of what the current situation calls for. Such models are often
incomplete and buggy and it is not surprising that many mode transitions
are neither visually nor verbally verified by flight crews.
At the same time, a substantial number of mode transitions are actually
anticipated correctly by flight crews. In those cases where pilots do call out a
mode change, four out of five visual identifications of those mode changes
are accompanied or preceded by a verbalization of their occurrence. This
suggests that there are multiple, underinvestigated resources that pilots rely
on for anticipating and tracking automation-mode behavior (including pilot
mental models). The FMA, designed as the main source of knowledge
about automation status, actually does not provide a lot of that knowledge.
It triggers a mere one out of five call-outs, and gets ignored altogether by
entire crews for a whole 40% of all mode transitions. Proposals for new regulations
are unfortunately taking shape around the same old display concepts.
For example, Joint Advisory Circular ACJ 25.1329 (Joint Aviation Authorities,
2003, p. 28) said that: "The transition from an armed mode to an
engaged mode should provide an additional attention-getting feature, such
as boxing and flashing on an electronic display (per AMJ25-11) for a suitable,
but brief, period (e.g., ten seconds) to assist in flight crew awareness."
But flight-mode annunciators are not at all attention getting, whether there
is boxing or flashing or not. Indeed, empirical data show (as it has before,
see Mumaw, Sarter, & Wickens, 2001) that the FMA does not "assist in flight
crew awareness" in any dominant or relevant way. If design really is to capture
crew's attention about automation status and behavior, it will have to
do radically better than annunciating abstruse codes in various hues and
boxing or flashing times.
The call-out procedure appears to be miscalibrated with respect to real
work in a real cockpit, because pilots basically do not follow formal verification
and call-out procedures at all. Forcing pilots to visually verify the FMA
first and then call out what they see bears no similarity to how actual work is
done, nor does it have much sensitivity to the conditions under which such
work occurs. Call-outs may well be the first task to go out the window when
workload goes up, which is also confirmed by this type of research. In addition
to the few formal call-outs that do occur, pilots communicate implicitly
and informally about mode changes. Implicit communication surrounding
altitude capture could for example be "Coming up to one-three-zero, (capture)"
(referring to flight level 130). There appear to be many different
strategies to support mode awareness, and very few of them actually overlap
with formal procedures for visual verification and call-outs. Even during the
12 flights of the Bjorklund et al. (2003) study, there were at least 18 different
strategies that mixed checks, timing, and participation. These strategies
seem to work as well as, or even better than, the official procedure, as crew
communications on the 12 flights revealed no automation surprises that
could be traced to a lack of mode awareness. Perhaps mode awareness does
not matter that much for safety after all.
There is an interesting experimental side effect here: If mode awareness
is measured mainly by visual verification and verbal call-outs, and crews neither
look nor talk, then are they unaware of modes, or are the researchers
unaware of pilots' awareness? This poses a puzzle: Crews who neither talk
nor look can still be aware of the mode their automation is in, and this indeed
seems to be the case. But how, in that case, is the researcher (or your
company, or line-check pilot) to know? The situation is one answer. By simply
looking at where the aircraft is going, and whether this overlaps with the
pilots' intentions, an observer can get to know something about apparent
pilot awareness. It will show whether pilots missed something or not. In the
research reported here, however, pilots missed nothing: There were no unexpected
aircraft behaviors from their perspective (Bjorklund et al., 2003).
This can still mean that the crews were either not aware of the modes and it
did not matter, or they were aware but the research did not capture it. Both
may be true.
MABA-MABA OR ABRACADABRA
The diversity of experiences and research results from automated cockpits
shows that automation creates new capabilities and complexities in ways
that may be difficult to anticipate. People adapt to automation in many different
ways, many of which have little resemblance to formally established
procedures for interacting with the automation. Can automation, in a very
Cartesian, dualistic sense, replace human work, thereby reducing human
error? Or is there a more complex coevolution of people and technology?
Engineers and others involved in automation development are often led to
believe that there is a simple answer, and in fact a simple way of getting the
answer. MABA-MABA lists, or "Men-Are-Better-At, Machines-Are-Better-At"
lists have appeared over the decades in various guises. What these lists basically
do is try to enumerate the areas of machine and human strengths and
weaknesses, in order to provide engineers with some guidance on which
functions to automate and which ones to give to the human. The process of
function allocation as guided by such lists sounds straightforward, but is actually
fraught with difficulty and often unexamined assumptions.
One problem is that the level of granularity of functions to be considered
for function allocation is arbitrary. For example, it depends on the
model of information processing on which the MABA-MABA method is
based (Hollnagel, 1999). In Parasuraman, Sheridan, and Wickens (2000),
four stages of information processing (acquisition, analysis, selection, response)
form the guiding principle for deciding which functions should be kept or
given away, but this is an essentially arbitrary decomposition based on a notion
of a human-machine ensemble that resembles a linear input-output
device. In cases where it is not a model of information processing that determines
the categories of functions to be swapped between human and
machine, the technology itself often determines it (Hollnagel, 1999).
MABA-MABA attributes are then cast in mechanistic terms, derived from
technological metaphors. For example, Fitts (1951) applied terms such as
information capacity and computation in his list of attributes for both the
human and the machine. If the technology gets to pick the battlefield (i.e.,
determine the language of attributes) it will win most of them back for itself.
This results in human-uncentered systems where typically heuristic and
adaptive human abilities such as not focusing on irrelevant data, scheduling
and reallocating activities to meet current constraints, anticipating events,
making generalizations and inferences, learning from past experience, and
collaborating (Hollnagel) easily fall by the wayside.
Moreover, MABA-MABA lists rely on a presumption of fixed human and
machine strengths and weaknesses. The idea is that, if you get rid of the
(human) weaknesses and capitalize on the (machine) strengths, you will
end up with a safer system. This is what Hollnagel (1999) called "function
allocation by substitution." The idea is that automation can be introduced
as a straightforward substitution of machines for people—preserving the
basic system while improving some of its output measures (lower workload,
better economy, fewer errors, higher accuracy, etc.). Indeed, Parasuraman
et al. (2000) recently defined automation in this sense: "Automation refers
to the full or partial replacement of a function previously carried out by the
human operator" (p. 287). But automation is more than replacement (although
perhaps automation is about replacement from the perspective of
the engineer). The really interesting issues from a human performance
standpoint emerge after such replacement has taken place.
Behind the idea of substitution lies the idea that people and computers
(or any other machines) have fixed strengths and weaknesses and that the
point of automation is to capitalize on the strengths while eliminating or
compensating for the weaknesses. The problem is that capitalizing on some
strength of computers does not replace a human weakness. It creates new
human strengths and weaknesses—often in unanticipated ways (Bainbridge,
1987). For instance, the automation strength to carry out long sequences
of action in predetermined ways without performance degradation
amplifies classic human vigilance problems. It also exacerbates the
system's reliance on the human strength to deal with the parametrization
problem, or literalism (automation does not have access to all relevant
world parameters for accurate problem solving in all possible contexts). As
we have seen, however, human efforts to deal with automation literalism, by
bridging the context gap, may be difficult because computer systems can be
hard to direct (How do I get it to understand? How do I get it to do what I
want?). In addition, allocating a particular function does not absorb this
function into the system without further consequences. It creates new functions
for the other partner in the human-machine equation—functions
that did not exist before, for example, typing, or searching for the right display
page, or remembering entry codes. The quest for a priori function allocation,
in other words, is intractable (Hollnagel & Woods, 1983), and not
only this: Such new kinds of work create new error opportunities (What was
that code again? Why can't I find the right page?).
TRANSFORMATION AND ADAPTATION
Automation produces qualitative shifts. Automating something is not just a
matter of changing a single variable in an otherwise stable system (Woods &
Dekker, 2001). Automation transforms people's practice and forces them
to adapt in novel ways: "It alters what is already going on—the everyday
practices and concerns of a community of people—and leads to a resettling
into new practices" (Flores, Graves, Hartfield, & Winograd, 1988, p. 154).
Unanticipated consequences are the result of these much more profound,
qualitative shifts. For example, during the Gulf War in the early 1990s, "almost
without exception, technology did not meet the goal of unencumbering
the personnel operating the equipment. Systems often required exceptional
human expertise, commitment, and endurance" (Cordesman &
Wagner, 1996, p. 25).
Where automation is introduced, new human roles emerge. Engineers,
given their professional focus, may believe that automation transforms the
tools available to people, who will then have to adapt to these new tools. In
chapter 9 we see how, according to some researchers, the removal of paper
flight-progress strips in air-traffic control represents a transformation of the
workplace, to which controllers only need to adapt (they will compensate
for the lack of flight progress strips). In reality, however, people's practice
gets transformed by the introduction of new tools. New technology, in turn,
gets adapted by people in locally pragmatic ways so that it will fit the constraints
and demands of actual practice. For example, controlling without
flight-progress strips (relying more on the indications presented on the radar
screen) asks controllers to develop and refine new ways of managing
airspace complexity and dynamics. In other words, it is not the technology
that gets transformed and the people who adapt. Rather, people's practice
gets transformed and they in turn adapt the technology to fit their local demands
and constraints.
The key is to accept that automation will transform people's practice and
to be prepared to learn from these transformations as they happen. This is
by now a common (but not often successful) starting point in contextual
design. Here the main focus of system design is not the creation of artifacts
per se, but getting to understand the nature of human practice in a particular
domain, and changing those work practices rather than just adding new
technology or replacing human work with machine work. This recognizes
that:
• Design concepts represent hypotheses or beliefs about the relationship
between technology and human cognition and collaboration.
• Designers need to subject these beliefs to empirical jeopardy by a search for
disconfirming and confirming evidence.
• These beliefs about what would be useful have to be tentative and open
to revision as designers learn more about the mutual shaping that goes on
between artifacts and actors in a field of practice.
Subjecting design concepts to such scrutiny can be difficult. Traditional validation
and verification techniques applied to design prototypes may turn
up nothing, but not necessarily because there is nothing that could turn up.
Validation and verification studies typically try to capture small, narrow outcomes
by subjecting a limited version of a system to a limited test. The results
can be informative, but hardly about the processes of transformation
(different work, new cognitive and coordination demands) and adaptation
(novel work strategies, tailoring of the technology) that will determine the
sources of a system's success and potential for failure once it has been
fielded. Another problem is that validation and verification studies need a
reasonably ready design in order to carry any meaning. This presents a dilemma:
By the time results are available, so much commitment (financial,
psychological, organizational, political) has been sunk into the particular
design that any changes quickly become unfeasible.
Such constraints through commitment can be avoided if human factors
can say meaningful things early on in a design process. What if the system
of interest has not been designed or fielded yet? Are there ways in which
we can anticipate whether automation, and the human role changes it implies,
will create new error problems rather than simply solving old ones?
This has been described as Newell's catch: In order for human factors to
say meaningful things about a new design, the design needs to be all but
finished. Although data can then be generated, they are no longer of use,
because the design is basically locked. No changes as a result of the insight
created by human factors data are possible anymore. Are there ways
around this catch? Can human factors say meaningful things about a design
that is nowhere near finished? One way that has been developed is future
incident studies, and the concept they have been tested on is exception
management.
AUTOMATION AND EXCEPTION MANAGEMENT
One role that may fit the human well is that of exception manager. Introducing
automation to turn people into exception managers can sound like
a good idea. In ever busier systems, where operators are vulnerable to problems
of data overload, turning humans into exception managers is a powerfully
attractive concept. It has, for example, been practiced in the dark
cockpit design that essentially keeps the human operator out of the loop
(all the annunciator lights are out in normal operating conditions) until
something interesting happens, which may then be the time for the human
to intervene. This same envisioned role, of exception manager, dominates
recent ideas about how to effectively let humans control ever-increasing air-traffic
loads. Perhaps, the thought goes, controllers should no longer be in
charge of all the parameters of every flight in their sector. A core argument
is that the human controller is a limiting factor in traffic growth. Too many
aircraft under one single controller leads to memory overload and the risk
of human error. Decoupling controllers from all individual flights in their
sectors, through greater computerization and automation on the ground
and greater autonomy in the air, is assumed to be the way around this limit.
The reason we may think that human controllers will make good exception
managers is that humans can handle the unpredictable situations that
machines cannot. In fact, this is often a reason why humans are still to be
found in automated systems in the first place (see Bainbridge, 1987). Following
this logic, controllers would be very useful in the role of traffic manager,
waiting for problems to occur in a kind of standby mode. The view of
controller practice is one of passive observer, ready to act when necessary.
But intervening effectively from a position of disinvolvement has proven
to be difficult—particularly in air-traffic control. For example, Endsley,
Mogford, Allendoerfer, and Stein (1997) pointed out, in a study of direct
routings that allowed aircraft deviations without negotiations, that with
more freedom of action being granted to individual aircraft, it became
more difficult for controllers to keep up with traffic. Controllers were less
able to predict how traffic patterns would evolve over a foreseeable
timeframe. In other studies too, passive monitors of traffic seemed to have
trouble maintaining a sufficient understanding of the traffic under their
control (Galster, Duley, Masolanis, & Parasuraman, 1999), and were more
likely to overlook separation infringements (Metzger & Parasuraman,
1999). In one study, controllers effectively gave up control over an aircraft
with communication problems, leaving it to other aircraft and their collision-
avoidance systems to sort it out among themselves (Dekker & Woods,
1999). This turned out to be the controllers' only route out of a fundamental
double bind: If they intervened early they would create a lot of workload
problems for themselves (suddenly a large number of previously autonomous
aircraft would be under their control). Yet if they waited on intervention
(in order to gather more evidence on the aircraft's intentions), they
would also end up with an unmanageable workload and very little time to
solve anything in. Controller disinvolvement can create more work rather
than less, and produce a greater error potential.
This brings out one problem of envisioning practice, of anticipating how
automation will create new human roles and what the performance consequences
of those roles will be. Just saying "manager of exceptions" is insufficient:
It does not make explicit what it means to practice. What work does
an exception manager do? What cues does he or she base decisions on?
The downside of underspecification is the risk of remaining trapped in a
disconnected, shallow, unrealistic view of work. And when our view of (future)
practice is disconnected from many of the pressures, challenges, and
constraints operating in that world, our view of practice is distorted from
the beginning. It misses how operational people's strategies are often intricately
adapted to deal effectively with these constraints and pressures.
There is an upside to underspecification, however, and that is the freedom
to explore new possibilities and new ways to relax and recombine the
multiple constraints, all in order to innovate and improve. Will automation
help you get rid of human error? With air-traffic controllers as exception
managers, it is interesting to think about how the various designable objects
would be able to support them in exception management. For example, visions
of future air-traffic control systems typically include data linking as an
advance that avoids the narrow bandwidth problem of voice communications
—thus enhancing system capacity. In one study (Dekker & Woods,
1999), a communications failure affected an aircraft that had also suffered
problems with its altitude reporting (equipment that tells controllers how
high it is and whether it is climbing or descending). At the same time, this
aircraft was headed for streams of crossing air traffic. Nobody knew exactly
how data link, another piece of technology not connected to altitude-encoding
equipment, would be implemented (its envisioned use was, and
is, to an extent underspecified). One controller, involved in the study, had
the freedom to suggest that air-traffic control should contact the airline's
dispatch or maintenance office to see whether the aircraft was climbing or
descending or level. After all, data link could be used by maintenance and
dispatch personnel to monitor the operational and mechanical status of an
aircraft, so "if dispatch monitors power settings, they could tell us," the controller
suggested. Others objected because of the coordination overheads
this would create. The ensuing discussion showed that, in thinking about
future systems and their consequences for human error, we can capitalize
on underspecification if we look for the so-called leverage points (in the example:
data link and other resources in the system) and a sensitivity to the
fact that envisioned objects only become tools through use—imagined or
real (data links to dispatch become a backup air-traffic control tool).
Anticipating the consequences of automation on human roles is also difficult
because—without a concrete system to test—there are always multiple
versions of how the proposed changes will affect the field of practice in the
future. Different stakeholders (in air-traffic control this would be air carriers,
pilots, dispatchers, air-traffic controllers, supervisors, flow controllers)
have different perspectives on the impact of new technology on the nature
of practice. The downside of this plurality is a kind of parochialism where
people mistake their partial, narrow view for the dominant view of the future
of practice, and are unaware of the plurality of views across stakeholders.
For example, one pilot claimed that greater autonomy for airspace
users is "safe, period" (Baiada, 1995). The upside of plurality is the triangulation
that is possible when the multiple views are brought together. In examining
the relationships, overlaps, and gaps across multiple perspectives,
we are better able to cope with the inherent uncertainty built into looking
into the future.
A number of future incident studies (see Dekker & Woods, 1999) examined
controllers' anomaly response in future air-traffic control worlds precisely
by capitalizing on this plurality. To study anomaly response under envisioned
conditions, groups of practitioners (controllers, pilots, and
dispatchers) were trained on proposed future rules. They were brought together
to try to apply these rules in solving difficult future airspace problems
that were presented to them in several scenarios. These included aircraft
decompression and emergency descents, clear air turbulence, frontal
thunderstorms, corner-post overloading (too many aircraft going to one
entry point for the airport area), and priority air-to-air refueling and consequent
airspace restrictions and communication failures. These challenges,
interestingly, were largely rule or technology independent: They can happen
in airspace systems of any generation. The point was not to test the
anomaly response performance of one group against that of another, but to
use triangulation of multiple stakeholder viewpoints—anchored in the task
details of a concrete problem—to discover where the envisioned system
would crack, where it would break down. Validity in such studies derives
from: (a) the extent to which problems to be solved in the test situation represent
the vulnerabilities and challenges that exist in the target world, and
(b) the way in which real problem-solving expertise is brought to bear by
the study participants.
Developers of future air-traffic control architectures have been envisioning
a number of predefined situations that call for controller intervention,
a kind of reasoning that is typical for engineering-driven decisions about
automated systems. In air-traffic management, for example, potentially
dangerous aircraft maneuvers, local traffic density (which would require
some density index), or other conditions that compromise safety would
make it necessary for a controller to intervene. Such rules, however, do not
reduce uncertainty about whether to intervene. They are all a form of
threshold crossing—intervention is called for when a certain dynamic density
has been reached or a number of separation miles has been transgressed.
But threshold-crossing alarms are very hard to get right—they
come either too early or too late. If too early, a controller will lose interest
in them: The alarm will be deemed alarmist. If the alarm comes too late, its
contribution to flagging or solving the problem will be useless and it will be
deemed incompetent. The way in which problems in complex, dynamic
worlds grow and escalate, and the nature of collaborative interactions, indicate
that recognizing exceptions in how others (either machines or people)
are handling anomalies is complex. The disappointing history of automating
problem diagnosis inspires little further hope. Threshold-crossing
alarms cannot make up for a disinvolvement—they can only make a controller
acutely aware of those situations in which it would have been nice to
have been involved from the start.
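A minimal sketch of such a threshold-crossing rule, assuming a separation-based trigger, a straight-line traffic projection, and arbitrary threshold values, makes the dilemma concrete: tune the thresholds and look-ahead generously and the alarm becomes alarmist; tune them tightly and it fires only once the controller can no longer usefully intervene.

    # Illustrative sketch only: flag a pair of aircraft for controller intervention
    # if a straight-line projection predicts their separation will fall below a
    # minimum within a look-ahead window. Real conflict detection is far richer.
    import math

    def needs_intervention(pos_a, vel_a, pos_b, vel_b,
                           min_separation_nm=5.0, look_ahead_min=10):
        for t in range(look_ahead_min + 1):                 # check each minute ahead
            ax, ay = pos_a[0] + vel_a[0] * t, pos_a[1] + vel_a[1] * t
            bx, by = pos_b[0] + vel_b[0] * t, pos_b[1] + vel_b[1] * t
            if math.hypot(ax - bx, ay - by) < min_separation_nm:
                return True   # a generous threshold fires early and cries wolf
        return False          # a tight one fires too late to be of much help

    # Positions in NM, velocities in NM per minute (480 kt = 8 NM/min); head-on pair:
    # needs_intervention((0, 0), (8, 0), (60, 0), (-8, 0)) -> True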
Future incident studies allow us to extend the empirical and theoretical
base on automation and human performance. For example, the supervisory-control
literature makes no distinction between anomalies and exceptions.
This lack of distinction stems from the origins of supervisory-control work: How
do people control processes over physical distances (time lag, lack of access,
etc.)? However, air-traffic control augments the issue of supervisory
control with a cognitive distance: Airspace participants have some system
knowledge and operational perspective, as do controllers, but there are
only partial overlaps and many gaps. Studies on exception management in
future air-traffic control force us to make a distinction between anomalies
in the process, and exceptions from the point of view of the supervisor
(controller). Exceptions can arise in cases where airspace participants are
dealing with anomalies (e.g., an aircraft with pressurization or communications
problems) in a way that forces the controller to intervene. An exception
is a judgement about how well others are handling or going to handle
disturbances in the process. Are airspace participants handling things well?
Are they going to get themselves in trouble in the future? Judging whether
airspace users are going to get in trouble in their dealings with a process disturbance
would require a controller to recognize and trace a situation over
time—contradicting arguments that human controllers make good standby
interveners.
Will Only the Predicted Consequences Occur?
In developing new systems, it is easy for us to become miscalibrated. It is
easy for us to become overconfident that if our envisioned system can be
realized, the predicted consequences and only the predicted consequences
will occur. We lose sight of the fact that our views of the future are
tentative hypotheses and that we would actually need to remain open to
revision, that we need to continually subject these hypotheses to empirical
jeopardy.
One way to fool ourselves into thinking that only the predicted consequences
will occur when we introduce automation is to stick with substitutional
practice of function allocation. Substitution assumes a fundamentally
uncooperative system architecture in which the interface between
human and machine has been reduced to a straightforward "you do this, I
do that" trade. If that is what it is, of course we should be able to predict the
consequences. But it is not that simple. The question for successful automation
is not who has control over what or how much. That only looks at the
first parts, the engineering parts. We need to look beyond this and start asking
humans and automation the question: "How do we get along together?"
Indeed, where we really need guidance today is in how to support the coordination
between people and automation. In complex, dynamic, nondeterministic
worlds, people will continue to be involved in the operation of
highly automated systems. The key to a successful future of these systems
lies in how they support cooperation with their human operators—not only
in foreseeable standard situations, but also under novel, unexpected circumstances.
One way to frame the question is how to turn automated systems into effective
team players (Sarter & Woods, 1997). Good team players make their
activities observable to fellow team players, and are easy to direct. To be observable,
automation activities should be presented in ways that capitalize
on well-documented human strengths (our perceptual system's acuity to
contrast, change and events, our ability to recognize patterns and know
how to act on the basis of this recognition, e.g., Klein). For example:
• Event based: Representations need to highlight changes and events in
ways that the current generation of state-oriented displays do not.
• Future oriented: In addition to historical information, human operators
in dynamic systems need support for anticipating changes and
knowing what to expect and where to look next.
• Pattern based: Operators must be able to quickly scan displays and pick
up possible abnormalities without having to engage in difficult cognitive
work (calculations, integrations, extrapolations of disparate pieces
of data). By relying on pattern- or form-based representations, automation
has an enormous potential to convert arduous mental tasks into
straightforward perceptual ones.
Team players are directable when the human operator can easily and efficiently
tell them what to do. Designers could borrow inspiration from how
practitioners successfully direct other practitioners to take over work.
These are intermediate, cooperative modes of system operation that allow
human supervisors to delegate suitable subproblems to the automation,
just as they would be delegated to human crew members. The point is not
to make automation into a passive adjunct to the human operator who then
needs to micromanage the system each step of the way. This would be a
waste of resources, both human and machine. Human operators must be allowed
to preserve their strategic role in managing system resources as they
see fit, given the circumstances.
Chapter 9
Will the System Be Safe?
How do you know whether a new system will be safe? As chapter 8 showed,
automating parts of human work may make a system safer, but it may
not. The Alaska Airlines 261 accident discussed in chapter 2 illustrates how
difficult it is to know whether a system is going to be safe during its operational
lifetime. In the case of the DC-9 trim system, bridging the gap between
producing a system and running it proved quite difficult. Certifying
that the system was safe, or airworthy, when it rolled out of the factory with
zero flying hours was one thing. Certifying that it would stay safe during a
projected lifetime proved to be quite another. Alaska 261 shows how large
the gulf between making a system and maintaining it can be.
The same is true for sociotechnical systems. Take the issue of flight-progress
strips in air-traffic control. The flight strip is a small paper slip
with flight-plan data about each controlled aircraft's route, speed, altitude,
times over waypoints, and other characteristics (see Fig. 9.1). It is
used by air-traffic controllers in conjunction with a radar representation
of air traffic. A number of control centers around the world are doing
away with these paper strips, to replace them with automated flight-tracking
systems. Each of these efforts requires, in principle, a rigorous
certification process. Different teams of people look at color coding, letter
size and legibility, issues of human-computer interaction, software reliability
and stability, seating arrangements, button sensitivities, and so
forth, and can spend a decade following the footsteps of a design process
to probe and poke it with methods and forms and questionnaires and tests
and checklists and tools and guidelines—all in an effort to ensure that local
human factors or ergonomics standards have been met.
FIG. 9.1. Example of a flight strip (strip content: SKA 9337 | 7351 | 320 | 310 | OSY FUR TEH | 1936 1946 1952). From left to right it shows the airplane's flight number and transponder code, the entry altitude of the aircraft into the controller's sector (FL320), the exit level (FL310), and the times at which it is expected to fly across particular waypoints along its route in the controller's sector.
But such static snapshots
may mean little. A lineup of microcertificates of usability does
not guarantee safety. As soon as they hit the field of practice, systems start
to drift. A year (or a month) after its inception, no sociotechnical system is
the same as it was in the beginning. As soon as a new technology is introduced,
the human, operational, organizational system that is supposed to
make the technology work forces it into locally practical adaptations. Practices
(procedures, rules) adapt around the new technology, and the technology
in turn is reworked, revised, and amended in response to the
emergence of practical experience.
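For illustration only, the flight-plan data on such a strip could be rendered as a simple record, as in the Python sketch below; the field names are hypothetical, and the pairing of waypoints with times follows the reading of Fig. 9.1 given above. As the remainder of this chapter argues, capturing these fields is the easy part: what the paper artifact does for cooperative controlling work is not reducible to them.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FlightStrip:
    """Hypothetical digital rendering of the flight-plan data a paper strip carries (cf. Fig. 9.1)."""
    callsign: str                          # e.g., "SKA9337"
    transponder_code: str                  # e.g., "7351"
    entry_level: int                       # flight level entering the controller's sector, e.g., 320
    exit_level: int                        # flight level leaving the sector, e.g., 310
    times_over_waypoints: Dict[str, str]   # waypoint -> expected time over it
    annotations: List[str] = field(default_factory=list)   # free-form controller markings

strip = FlightStrip(
    callsign="SKA9337",
    transponder_code="7351",
    entry_level=320,
    exit_level=310,
    times_over_waypoints={"OSY": "1936", "FUR": "1946", "TEH": "1952"},
)
strip.annotations.append("circle around exit level")   # the kind of marking a digital copy may not afford
print(strip)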
THE LIMITS OF SAFETY CERTIFICATION
System safety is more than the sum of the certified parts. A redundant
torque tube inside of a jackscrew, for example, does nothing to maintain
the integrity of a DC-9 trim system without a maintenance program that
guarantees continued operability. But ensuring the existence of such a
maintenance system is nothing like understanding how the local rationality
of such a system can be sustained (we're doing the right thing, the safe
thing) while safety standards are in fact continually being eroded (e.g.,
from 350- to 2,550-hour lubrication interval). The redundant components
may have been built and certified. The maintenance program (with 2,550-
hour lubrication intervals—certified) may be in place. But safe parts do not
guarantee system safety.
Certification processes do not typically take lifetime wear of parts into account
when judging an aircraft airworthy, even if such wear will render an
aircraft, like Alaska 261, quite unworthy of flying. Certification processes
certainly do not know how to take sociotechnical adaptation of new equipment,
and the consequent potential for drift into failure, into account
when looking at nascent technologies. Systemic adaptation or wear is not a
criterion in certification decisions, nor is there a requirement to put in
place an organization to prevent or cover for anticipated wear rates or pragmatic
adaptation, or fine-tuning. As a certification engineer from the regulator
testified, "Wear is not considered as a mode of failure for either a system
safety analysis or for structural considerations" (NTSB, 2002, p. 24).
Because how do you take wear into account? How can you even predict with
any accuracy how much wear will occur? McDonnell-Douglas surely had it
wrong when it anticipated wear rates on the trim jackscrew assembly of its
DC-9. Originally, the assembly was designed for a service life of 30,000 flight
hours without any periodic inspections for wear. But within a year, excessive
wear had been discovered nonetheless, prompting a reconsideration.
The problem of certifying a system as safe to use can become even more
complicated if the system to be certified is sociotechnical and thereby even
less calculable. What does wear mean when the system is sociotechnical
rather than consisting of pieces of hardware? In both cases, safety certification
should be a lifetime effort, not a still assessment of decomposed system
status at the dawn of a nascent technology. Safety certification should be
sensitive to the coevolution of technology and its use, its adaptation. Using
the growing knowledge base on technology and organizational failure,
safety certification could aim for a better understanding of the ecology in
which technology is released—the pressures, resource constraints, uncertainties,
emerging uses, fine-tuning, and indeed lifetime wear.
Safety certification is not just about seeing whether components meet
criteria, even if that is what it often practically boils down to. Safety certification
is about anticipating the future. Safety certification is about bridging
the gap between a piece of gleaming new technology in the hand now, and
its adapted, coevolved, grimy, greased-down wear and use further down the
line. But we are not very good at anticipating the future. Certification practices
and techniques oriented toward assessing the standard of current
components do not translate well into understanding total system behavior
in the future. Making claims about the future, then, often hangs on things
other than proving the worthiness of individual parts. Take the trim system
of the DC-9 again.
The jackscrew in the trim assembly had been classified as a "structure" in
the 1960s, leading to different certification requirements from when it
would have been seen as a system. The same piece of hardware, in other
words, could be looked at as two entirely different things: a system, or a
structure. In being judged a structure, it did not have to undergo the required
system safety analysis (which may, in the end, still not have picked
up on the problem of wear and the risks it implied). The distinction, this
partition of a single piece of hardware into different lexical labels, however,
shows that airworthiness is not a rational product of engineering calculation.
Certification can have much more to do with localized engineering
judgments, with argument and persuasion, with discourse and renaming,
with the translation of numbers into opinion, and opinion into numbers—
all of it based on uncertain knowledge.
As a result, airworthiness is an artificially binary black-or-white verdict (a
jet is either airworthy or it is not) that gets imposed on a very grey, vague,
uncertain world—a world where the effects of releasing a new technology
into actual operational life are surprisingly unpredictable and incalculable.
Dichotomous, hard yes or no meets squishy reality and never quite gets a
genuine grip. A jet that was judged airworthy, or certified as safe, may or
may not be in actual fact. It may be a little bit unairworthy. Is it still airworthy
with an end-play check of .040 inches, the set limit? But "set" on the basis
of what? Engineering judgment? Argument? Best guess? Calculations?
What if a following end-play check is more favorable? The end-play check itself
is not very reliable. The jet may be airworthy today, but no longer tomorrow
(when the jackscrew snaps). But who would know?
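A toy calculation can make the point about dichotomous verdicts concrete. The Python sketch below uses purely illustrative numbers; the limit and the measurement scatter are assumptions, not figures from the NTSB record. A component whose true condition sits near a hard limit will pass one noisy check and fail the next, even though nothing about the hardware has changed.

import random

LIMIT_IN = 0.040   # illustrative end-play limit, in inches (assumed value, for the sketch only)

def airworthy(measured_end_play_in: float) -> bool:
    """Dichotomous verdict: at or under the limit counts as airworthy."""
    return measured_end_play_in <= LIMIT_IN

def noisy_check(true_end_play_in: float, sigma_in: float = 0.004) -> float:
    """Stand-in for an end-play check: the true value plus some measurement scatter."""
    return max(0.0, random.gauss(true_end_play_in, sigma_in))

random.seed(1)
true_value = 0.040   # a jackscrew whose true end play sits right at the limit
verdicts = [airworthy(noisy_check(true_value)) for _ in range(10)]
print(verdicts)      # typically a mix of True and False for the very same piece of hardware

Whether a more graded or trend-based judgment would be practicable is another matter; the sketch only shows why a hard yes or no never quite gets a grip on a noisy measurement.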
The pursuit of answers to such questions can precede or accompany certification
efforts. Research, that putatively objective scientific encounter
with empirical reality, can assist in the creation of knowledge about the future,
as shown in chapter 8. So what about working without paper flight
strips? The research community has come to no consensus on whether air
traffic control can actually do without them, and if it does, how it succeeds
in keeping air-traffic under control. Research results are inconclusive.
Some literature suggested that flight strips are expendable without consequences
for safety (e.g., Albright, Truitt, Barile, Vortac, & Manning, 1996),
whereas others argued that air-traffic control is basically impossible without
them (e.g., Hughes, Randall, & Shapiro, 1993). Certification guidance that
could be extracted from the research base can go either way: It is either safe
or unsafe to do away with the flight strips, depending on whom you listen
to. What matters most for credibility is whether the researcher can make
statements about human work that a certifier can apply to the coming, future
use of a system. In this, researchers appeared to rely on argument and
rhetoric, as much as on method, to justify that the results they found are applicable
to the future.
LEIPZIG AS LEGITIMATE
For human factors, the traditionally legitimate way of verifying the safety of
new technology is to conduct experiments in the laboratory. Say that researchers
want to test whether operators can safely use voice-input systems,
or whether their interpretation of some target is better on three-dimensional
displays. The typical strategy is to build microversions of the future
system and expose a limited number of participants to various conditions,
some or all of which may contain partial representations of a target system.
Through its controlled settings, laboratory research already makes some
sort of verifiable step into the future. Empirical contact with a world to be
designed is ensured because some version of that future world has been
prefabricated in the lab. This also leads to problems. Experimental steps
into the future are necessarily narrow, which affects the generalizability of
research findings. The mapping between test and target situations may miss
several important factors.
In part as a result of a restricted integration of context, laboratory studies
can yield divergent and eventually inconclusive results. Laboratory research
on decision making (Sanders & McCormick, 1997), for example, has found
several biases in how decision makers deal with information presented to
them. Can new technology circumvent the detrimental aspects of such biases,
which, according to some views, would lead to human error and safety
problems? One bias is that humans are generally conservative and do not
extract as much information from sources as they optimally should. Another
bias, derived from the same experimental research, is that people
have a tendency to seek far more information than they can absorb adequately.
Such biases would seem to be in direct opposition to each other. It
means that reliable predictions of human performance in a future system
may be difficult to make on the basis of such research. Indeed, laboratory
findings often come with qualifying labels that limit their applicability.
Sanders and McCormick (1997), for example, advised: "When interpreting
the . . . findings and conclusions, keep in mind that much of the literature
is comprised of laboratory studies using young, healthy males doing relatively
unmotivating tasks. The extent to which we can generalize to the general
working population is open to question" (p. 572).
Whether the question remains open does not seem to matter. Experimental
human factors research in the laboratory holds a special appeal because
it makes mind measurable, and it even allows mathematics to be applied
to the results. Quantitativism is good: It helps equate psychology with
natural science, shielding it from the unreliable wanderings through mental
life using dubious methods like introspection. The large-scale university
laboratories that are now a mainstay of many human factors departments
were a 19th-century European invention, pioneered by scientists such as
the chemist Justus Liebig. Wundt of course started the trend in psychology
with his Leipzig laboratory (see chap. 5). Leipzig did psychology a great service:
Psychophysics and its methods of inquiry introduced psychology as a
serious science, as something realist, with numbers, calculations, and equations.
The systematization, mechanization, and quantification of psychological
research in Leipzig, however, must be seen as an antimovement against
earlier introspection and rationalism.
Echoes of Leipzig still sound loudly today. A quantitativist preference remains
strong in human factors. Empiricist appeals (the pursuit of real measurable
facts through experiment) and a strong reliance on Cartesian-Newtonian
interpretations of natural science equal to those of, say, physics, may
help human factors retain credibility in a world of constructed hardware
and engineering science, where it alone dabbles in the fuzziness of psychology.
In a way, then, quantitativist human factors or engineering psychology
is still largely the sort of antimovement that Wundt formed with his Leipzig
laboratory. It finds its expression in a pursuit of numbers and statistics, lest
engineering consumers of the research results (and their government or
other sponsors) suspect the results to be subjective and untrustworthy.
The quantification and mechanization of mind and method in human
factors are good only because they are not something else (i.e., foggy rationalism
or unreliable introspection), not because they are inherently
good or epistemologically automatically justifiable. The experimental method
is good for what it is not, not for what it is. One can see this in the fact that
quantitative research in mainstream human factors never has to justify its
method (that method is good because at least it is not that other, vague
stuff). Qualitative research, on the other hand, is routinely dismissed as insufficiently
empirical and will always be required to justify its method. Anything
perceived to be sliding toward rationalism, subjectivism, and nonsystematic
introspection is highly suspicious, not because it is, but because
of what it evokes: a fear that human factors will be branded unscientific.
Now these fears are nothing new. They have inspired many a split or departure
in the history of psychology. Recall Watson's main concern when
launching behaviorism. It was to rescue psychology from vague subjectivist
introspection (by which he even meant Wundt's systematic, experimental
laboratory research) and plant it firmly within the natural science tradition.
Ever since Newton read the riot act on what it meant to be scientific, psychology
and human factors have struggled to find an acceptance and an acceptability
within that conceptualization.
Misconceptions About the Qualitative-Quantitative
Relationship
Whether quantitative or qualitative research can make more valid claims
about the future (thereby helping in the certification of a system as safe to
use) is contested. At first sight, qualitative, or field studies, are about the
present (otherwise there is no field to study). Quantitative research may
test actual future systems, but the setting is typically so contrived and limited
that its relationship to a real future is tenuous. As many have pointed
out, the difference between quantitative and qualitative research is actually
not so great (e.g., Woods, 1993; Xiao & Vicente, 2000). Claims of epistemological
privilege by either are counterproductive, and difficult to substantiate.
A method becomes superior only if it better helps researchers answer
the question they are pursuing, and in this sense, of course, the differences
between qualitative and quantitative research can be real. But dismissing
qualitative work as subjective misses the point of quantitative work. Squeezing
numbers out of an experimental encounter with reality, and then closing
the gap to a concept-dependent conclusion on what you just saw, requires
generous helpings of interpretation. As we see in the following
discussion, there is a great deal of subjectivism in endowing numbers with
meaning. Moreover, seeing qualitative inquiry as a mere protoscientific
prelude to real quantitative research misconstrues the relationship and
overestimates quantitative work. A common notion is that qualitative work
should precede quantitative research by generating hypotheses that can
then be tested in more restricted settings. This may be one relationship.
But often quantitative work only reveals the how or what (or how much) of
a particular phenomenon. Numbers in themselves can have a hard time revealing
the why of the phenomenon. In this case, quantitative work is the
prelude to real qualitative research: It is experimental number crunching
that precedes and triggers the study of meaning.
Finally, a common claim is that qualitative work is high in external validity
and low in internal validity. Quantitative research, on the other hand, is
thought to be low in external validity and high in internal validity. This is
often used as justification for either approach and it must rank among the
most misconstrued arguments in scientific method. The idea is that internal
validity is high because experimental laboratory research allows an investigator
almost full control over the conditions in which data are gathered.
If the experimenter did not make it happen, either it did not happen,
or the experimenter knows about it, so that it can be dealt with as a confound.
But the degree of control in research is often overestimated. Laboratory
settings are simply another kind of contextualized setting, in which
all kinds of subtle influences (social expectations, people's life histories)
enter and influence performance just like they would in any other contextualized
setting. The degree of control in qualitative research, on the
other hand, is often simply assumed to be low. And much qualitative work
indeed adds to that image. But rigor and control are definitely possible in
qualitative work: There are many ways in which a researcher can become
confident about systematic relationships between different factors. Subjectivism
in interpretation is not more necessary in qualitative than in quantitative
research. Qualitative work, on the other hand, is not automatically externally
valid simply because it takes place in a field (applied) setting. Each
encounter with empirical reality, whether qualitative or quantitative, generates
context-specific data—data from that time and place, from those people,
in that language—that are by definition nonexportable to other settings.
The researcher has to engage in analysis of those data in order to
bring them up to a concept-dependent level, from which terms and conclusions
can be taken to other settings.
The examples that follow play out these issues. But the account is about
more than the real or imagined opposition between qualitative and quantitative
work. The question is how human factors research, quantitative or
qualitative, can contribute to knowing whether a system will be safe to use.
EXPERIMENTAL HUMAN FACTORS RESEARCH
ON FLIGHT STRIPS: AN EXAMPLE
One way to find out if controllers can control air-traffic without the aid of
flight strips is to test it in an experimental setting. You take a limited number
of controllers, and put them through a short range of tasks to see how
they do. In their experiments, Albright et al. (1996) deployed a wide array
of measurements to find out if controllers perform just as well in a condition
with no strips as in a condition with strips. The work they performed
was part of an effort by the U.S. Federal Aviation Administration, a regulator
(and ultimately the certifier of any future air-traffic control system in
the U.S.). In their study, the existing air-traffic control system was retained,
but to compare stripped versus stripless control, the researchers removed
the flight strips in one condition:
The first set of measurements consisted of the following: total time watching
the PVD [plan view display, or radar screen], number of FPR [flight plan requests],
number of route displays, number of J-rings used, number of conflict
alerts activated, mean time to grant pilot requests, number of unable requests,
number of requests ignored, number of controller-to-pilot requests,
number of controller-to-center requests, and total actions remaining to complete
at the end of the scenario. (Albright et al., p. 6)
The assumption that drives most experimental research is that reality (in
this case about the use and usefulness of flight strips) is objective and that it
can be discovered by the researcher wielding the right measuring instruments.
This is consistent with the structuralism and realism of human factors.
The more measurements, the better, the more numbers, the more you
know. This is assumed to be valid even when an underlying model that
would couple the various measurements together into a coherent account
of expert performance is often lacking (as it is in Albright et al., 1996, but
also in many folk models in human factors). In experimental work, the
number and diversity of measurements can become the proxy indicator of
the accuracy of the findings, and of the strength of the epistemological
claim (Q: So how do you know what you know? A: Well, we measured this,
and this, and this, and that, and . . .). The assumption is that, with enough
quantifiable data, knowledge can eventually be offered that produces an accurate
and definitive account of a particular system. More of the same will
eventually lead to something different. The strong influence that engineering
has had on human factors (Batteau, 2001) makes this appear as just
common sense. In engineering, technical debates are closed by amassing
results from tests and experience; the essence of the craft is to convert uncertainty
into certainty. Degrees of freedom are closed through numbers;
ambiguity is worked out through numbers; uncertainty is reduced through
numbers (Vaughan, 1996).
Independent of the number of measurements, each empirical encounter
is of necessity limited, in both place and time. In the case of Albright et
al. (1996), 20 air-traffic controllers participated in two simulated airspace
conditions (one with strips and one without strips) for 25 minutes each.
One of the results was that controllers took longer to grant pilot requests
when they did not have access to flight strips, presumably because they had
to assemble the basis for a decision on the request from other information
sources. The finding is anomalous compared to other results, which
showed no significant difference in workload or in the ability to keep control
over the traffic situation across the strip and no-strip conditions, leading to
the conclusion that "the presence or absence of strips had no effect on either
performance or perceived workload. Apparently, the compensatory
behaviors were sufficient to maintain effective control at what controllers
perceived to be a comparable workload" (Albright et al., p. 11). Albright et
al. explained the anomaly as follows: "Since the scenarios were only 25 minutes
in length, controllers may not have had the opportunity to formulate
strategies about how to work without flight strips, possibly contributing to
the delay" (p. 11).
At a different level, this explanation of an anomalous datum implies that
the correspondence between the experimental setting and a future system
and setting may be weak. Lacking a real chance to learn how to formulate
strategies for controlling traffic without flight strips, it would be interesting
to pursue the question of how controllers in fact remained in control over
the traffic situation and kept their workload down. It is not clear how this
lack of a developed strategy can affect the time taken to grant requests but
not the perceived workload or control performance. Certifiers may, or perhaps
should, wonder what 25 minutes of undocumented struggle tells them
about a future system that will replace decades of accumulated practice.
The emergence of new work and establishment of new strategies is a fundamental
accompaniment to the introduction of new technology, representing
a transformation of tasks, roles, and responsibilities. These shifts are not
something that could easily be noticed within the confines of an experimental
study, even if controllers were studied for much longer than 25 minutes.
Albright et al. (1996) resolved this by placing the findings of control
performance and workload earlier in their text: "Neither performance nor
perceived workload (as we measured them in this study) was affected when
the strips were removed" (p. 8). The qualification that pulled the authority
of the results back into the limited time and place of the experimental encounter
(as we measured them in this study), was presented parenthetically
and thus accorded less central importance (Golden-Biddle & Locke,
1993). The resulting qualification suggests that comparable performance
and workload may be mere artifacts of the way the study was conducted, of
how these things were measured at that time and place, with those tools, by
those researchers. The qualification, however, was in the middle of the paper,
in the middle of a paragraph, and surrounded by other paragraphs
adorned with statistical allusions. Nothing of the qualification remained at
the end of the paper, where the conclusions presented these localized findings
as universally applicable truths.
Rhetoric, in other words, is enlisted to deal with problematic areas of
epistemological substance. The transition from localized findings (in this
study the researchers found no difference in workload or performance the
way they measured them with these 20 controllers) to generalizable principles
(we can do away with flight strips) essentially represents a leap of faith.
As such, central points of the argument were left unsaid or were difficult for
the reader to track, follow, or verify. By bracketing doubt this way, Albright
et al. (1996) communicated that there was nothing, really, to doubt. Authority
(i.e., true or accurate knowledge) derives from the replicable, quantifiable
experimental approach. As Xiao and Vicente (2000) argued, it is
very common for quantitative human factors research not to spend much
time on the epistemological foundation of its work. Most often it moves
unreflectively from a particular context (e.g., an experiment) to concepts
(not having strips is safe), from data to conclusions, or from the modeled to
the model. The ultimate resolution of the fundamental constraint on empirical
work (i.e., each empirical encounter is limited to a time and place) is
that more research is always necessary. This is regarded as a highly reasonable
conclusion of most quantitative human factors, or indeed any, experimental
work. For example, in the Albright et al. study, one constraint was
the 25-minute time limit on the scenarios played. Does flight-strip removal
actually change controller strategies in ways that were not captured by the
present study? This would seem to be a key question. But again, the reservation
was bracketed. Whether or not the study answered this question does
not in the end weaken the study's main conclusion: "(Additional research is
necessary to determine if there are more substantial long term effects to
strip removal)" (p. 12).
In addition, the empirical encounter of the Albright et al. (1996) study
was limited because it only explored one group of controllers (upper airspace).
The argument for more research was drafted into service for legitimizing
(not calling into question) results of the study: "Additional studies
should be conducted with field controllers responsible for other types of
sectors (e.g., low altitude arrival, or non-radar) to determine when, or if,
controllers can compensate as successfully as they were able to in the current
investigation" (p. 12). The idea is that more of the same, eventually,
will lead to something different, that a series of similar studies over time will
produce a knowledge increment useful to the literature and useful to the
consumers of the research (certifiers in this case). This, once again, is
largely taken for granted in the human factors community. Findings will invariably
get better next time, and such successive, incremental enhancement
is a legitimate route to the logical human factors end point: the discovery
of an objective truth about a particular human-machine system and,
through this, the revelation of whether it will be safe to use or not.
Experimental work relies on the production of quantifiable data. Some
of this quantification (with statistical ornaments such as F-values and standard
deviations) was achieved in Albright et al. (1996) by converting tickmarks
on lines of a questionnaire (called the "PEQ," or post-experimental
questionnaire) into an ordinal series of digits:
The form listed all factors with a 9.6 centimeter horizontal line next to each.
The line was marked low on the left end and high on the right end. In addition,
a vertical mark in the center of the line signified the halfway mark. The
controllers were instructed to place an X on the line adjacent to the factor to
indicate a response. . . . The PEQ scales were scored by measuring distance
from the right anchor to the mark placed by the controller on a horizontal
line (in centimeters). . . . Individual repeated measures ANOVAs [were then
conducted]. (pp. 5-8)
The veneration of numbers in this case, however, went a step too far.
ANOVAs cannot be used for the kind of data gathered through PEQ
scales. The PEQ is made up of so-called ordinal scales. In ordinal scales,
data categories are mutually exclusive (a tickmark cannot be at two distances
at the same time), they have some logical order, and they are
scored according to the amount of a particular characteristic they possess
(in this case, distance in centimeters from the right anchor). Ordinal scales,
however, do not represent equal differences (the difference between tickmarks at
1 cm and 2 cm need not reflect the same difference in the measured category as the difference between 2 cm and 3 cm),
as interval and ratio scales do.
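If the PEQ distances are treated as ordinal, one commonly suggested alternative is a rank-based test rather than an ANOVA. The Python sketch below applies a Wilcoxon signed-rank test to paired with-strips and without-strips ratings; the numbers are fabricated purely for illustration and are not data from Albright et al. (1996). The point is not that this would have changed the study's conclusions, only that the analysis can be matched to the scale of measurement.

from scipy.stats import wilcoxon

# Hypothetical PEQ "usefulness" distances (in centimeters from the anchor) for the same
# ten controllers in the with-strips and without-strips conditions.
with_strips    = [7.1, 6.4, 8.0, 5.5, 7.7, 6.9, 8.3, 5.9, 7.2, 6.6]
without_strips = [5.8, 6.1, 6.5, 5.2, 6.9, 6.0, 7.4, 5.5, 6.3, 6.2]

# The Wilcoxon signed-rank test compares the paired conditions through the ranks of the
# differences, rather than assuming that tick-mark distances form an equal-interval scale.
statistic, p_value = wilcoxon(with_strips, without_strips)
print(statistic, p_value)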
Besides, reducing complex categories such as "usefulness" or "likeability"
to distances along a few lines probably misses out on an interesting
ideographic reality beneath all of the tickmarks. Put in experimental terms,
the operationalization of usefulness as the distance from a tickmark along a
line is not particularly high on internal validity. How can the researcher be
sure that usefulness means the same thing to all responding controllers? If
different respondents have different ideas of what usefulness meant during
their particular experimental scenario, and if different respondents have
different ideas of how much usefulness a tickmark, say, in the middle of the
line represents, then the whole affair is deeply confounded. Researchers do
not know what they are asking and do not know what they are getting in reply.
Further numeric analysis is dealing with apples and oranges. This is one
of the greater risks of folk modeling in human factors. It assumes that everybody
understands what usefulness means, and that everybody has the same
definition. But these are generous and untested assumptions. It was only
with qualitative inquiry that researchers could ensure that there was some
consensus on understandings of usefulness with respect to the controlling
task with or without strips. Or they could discover that there was no consensus
and then control for it. This would be one way to deal with the confound.
It may not matter, and it may not have been noticed. Numbers are good.
Also, the linear, predictable format of research writing, as well as the use of
abbreviated statistical curios throughout the results section, represents a
rhetoric that endows the experimental approach with its authority—authority
in the sense of privileged access to a particular layer or slice of empirical
reality that others outside the laboratory setting do or do not have
admittance to. Other rhetoric invented particularly for the study (e.g., PEQ
scales for questions presented to participants after their trials in Albright et
al., 1996) certifies the researchers' unique knowledge of this slice of reality.
It validates the researcher's competence to tell readers what is really going
on there. It may dissuade second-guessing. Empirical results are deemed accurate
by virtue of a controlled encounter, a standard reporting format that
shows logical progress to objective truths and statements (introduction,
method, results, discussion, and summary), and an authoritative dialect intelligible
only to certified insiders.
Closing the Gap to the Future
Because of some limited correspondence between the experiment and the
system to be designed, quantitative research seemingly automatically closes
the gap to the future. The stripless condition in the research (even if contrived
by simply leaving out one artifact [the flight strip] from the present)
is a model of the future. It is an impoverished model to be sure, and one
that offers only a partial window onto what future practice and performance
may be like (despite the epistemological reservations about the authenticity
of that future discussed earlier). The message from Albright et
al.'s (1996) encounter with the future is that controllers can compensate
for the lack of flight strips. Take flight strips away, and controllers compensate
for the lack of information by seeking information elsewhere (the radar
screen, flight-plan readouts, controller-to-pilot requests). Someone
might point out that Albright et al. prejudged the use and usefulness of
flight strips in the first few sentences of their introduction, that they did not
see their data as an opportunity to seek alternative interpretations: "Currently,
en route control of high altitude flights between airports depends
on two primary tools: the computer-augmented radar information available
on the Plan View Display (PVD) and the flight information available on the
Flight Progress Strip" (p. 1). This is not really an enabling of knowledge, it
is the imposition of it. Here, flight strips are not seen as a problematic core
category of controller work, whose use and usefulness would be open to negotiation,
disagreement, or multiple interpretations. Instead, flight strips
function as information-retrieval devices. Framed as such, the data and the
argument can really only go one way: By removing one source of information,
controllers will redirect their information-retrieving strategies onto
other devices and sources. This displacement is possible, it may even be desirable,
and it is probably safe: "Complete removal of the strip information
and its accompanying strip marking responsibilities resulted in controllers
compensating by retrieving information from the computer" (Albright et
al., p. 11). For a certifier, this closes a gap to the future: Removing one
source of information will result in people finding the information elsewhere
(while showing no decrement in performance or increment in workload).
The road to automation is open and people will adapt successfully,
for that has been scientifically proven. Therefore, doing away with the flight
strips is (probably) safe, and certifiable as such.
If flight strips are removed, then what other sources of information
should remain available? Albright et al. (1996) inquired about what kind of
information controllers would minimally like to preserve: Route of flight
scored high, as did altitude information and aircraft call sign. Naming these
categories gives developers the opportunity to envision an automated version
of the flight strip that presents the same data in digital format, one that
substitutes a computer-based format for the paper-based one, without any
consequences for controller performance. Such a substitution, however,
may overlook critical factors associated with flight strips that contribute to
safe practice, and that would not be incorporated or possible in a computerized
version (Mackay, 2000).
Any signs of potential ambiguity or ambivalence about what else flight
strips may mean to those working with them were not given further consideration
beyond a brief mention in the experimental research write-up—not
because these signs were actively, consciously stifled, but because they were
inevitably deleted as Albright et al. (1996) carried out and wrote up their
encounter with empirical reality. Albright et al. explicitly solicited qualitative,
richer data from their participants by asking if controllers themselves
felt that the lack of strips impaired their performance. Various controllers
indicated how strips help them preplan and that, without strips, they cannot
preplan. The researchers, however, never unpacked the notion of preplanning
or investigated the role of flight strips in it. Again, such notions
(e.g., preplanning) are assumed to speak for themselves, taken to be self-evident.
They require no deconstruction, no further interpretive work.
Paying more attention to these qualitative responses could create noise that
confounds experimental accuracy. Comments that preplanning without
strips was impossible hinted at flight strips as a deeper, problematic category
of controller work. But if strips mean different things to different controllers,
or worse, if preplanning with strips means different things to different
controllers, then the experimental bedrock of comparing comparable
people across comparable conditions would disappear. This challenges in a
profound way the nomothetic averaging out of individual differences.
Where individual differences are the nemesis of experimental research, interpretive
ambiguity can call into question the legitimacy of the objective
scientific enterprise.
QUALITATIVE RESEARCH ON FLIGHT STRIPS:
AN EXAMPLE
Rather than looking at people's work from the outside in (as do quantitative
experiments), qualitative research tries to understand people's work
from the inside out. When taking the perspective of the one doing the
work, how does the world look through his or her eyes? What role do tools
play for people themselves in the accomplishments of their tasks; how do
tools affect their expression of expertise? An interpretive perspective is
based on the assumption that people give meaning to their work and that
they can express those meanings through language and action. Qualitative
research interprets the ways in which people make sense of their work experiences
by examining the meanings that people use and construct in light
of their situation (Golden-Biddle & Locke, 1993).
The criteria and end points for good qualitative research are different
from those in quantitative research. As a research goal, accuracy is
practically and theoretically unobtainable. Qualitative research is relentlessly
empirical, but it rarely achieves finality in its findings. Not that quantitative
research ever achieves finality (remember that virtually every experimental
report finishes with the exhortation that more research is
necessary). But qualitative researchers admit that there is never one accurate
description or analysis of a system in question, no definitive account—
only versions. What flight strips exactly do for controllers is forever subject
to interpretation; it will never be answered objectively or finitely, never be
closed to further inquiry. What makes a version good, though, or credible,
or worth paying attention to by a certifier, is its authenticity. The researcher
has to not only convince the certifier of a genuine field experience in writing
up the research account, but also make intelligible what went on
there. Validation from outside the field emerges from an engagement
with the literature (What have others said about similar contexts?) and
from interpretation (How well are theory and evidence used to make
sense of this particular context?). Field research, though critical to the
ethnographic community as a stamp of authenticity, is not necessarily the
only legitimate way to generate qualitative data. Surveys of user populations
can also be tools that support qualitative inquiry.
Find Out What the Users Think
The reason that qualitative research may appeal to certifiers is that it lets
the informants, the users, speak—not through the lens of an experiment,
but on the users' terms and initiative. Yet this is also where a central problem
lies. Simply letting users speak can be of little use. Qualitative research
is not (or should not be) plain conversational mappings—a direct transfer
from field setting to research account. If human factors continues to practice
and think about ethnography in these terms, doubts about
both the method and the data it yields will continue to surface. What
certifiers, as consumers of human factors research, care about is not what
users say in raw, unpacked form, but what their remarks mean for
work, and especially for future work. As Hughes et al. (1993) put it: "It is not
that users cannot talk about what it is they know, how things are done, but it
needs bringing out and directing toward the concerns of the design itself"
(p. 138). Within the human factors community, qualitative research seldom
takes this extra step. What human factors requires is a strong ethnography,
one that actually makes the hard analytical move from user statements to a
design language targeted at the future.
A massive qualitative undertaking related to flight strips was the Lancaster
University project (Hughes et al., 1993). Many man-months were spent
(an index of the authenticity of the research) observing and documenting
air-traffic control with flight strips. During this time the researchers developed
an understanding of flight strips as an artifact whose functions derive
from the controlling work itself. Both information and annotations on the
strip and the active organization of strips among and between controllers
were essential: "The strip is a public document for the members of the
(controlling) team; a working representation of an aircraft's control history
and a work site of controlling. Moving the strips is to organize the information
in terms of work activities and, through this, accomplishing the
work of organizing the traffic" (Hughes et al., pp. 132-133). Terms such
as working representation and organizing traffic are concepts, or categories,
that were abstracted well away from the masses of deeply context-specific
field notes and observations gathered in the months of research. Few controllers
would themselves use the term working representation to explain
what flight strips mean to them. This is good. Conceptual abstraction allows
a researcher to reach a level of greater generality and increased
generalizability (see Woods, 1993; Xiao & Vicente, 2000). Indeed, working
representation may be a category that can lead to the future, where a
designer would be looking to computerize a working representation of
flight information, and a certifier would be evaluating whether such a
computerized tool is safe to use. But such higher order interpretive work
is seldom found in human factors research. It would separate ethnography
and ethnographic argument from research that simply makes claims
based on authenticity. Even Hughes et al. (1993) relied on authenticity
alone when they told of the various annotations made on flight strips, and
did little more than parrot their informants:
Amendments may be done by the controller, by the chief, or less often, by one
of the "wings." "Attention-getting" information may also be written on the
strips, such as arrows indicating unusual routes, symbols designating
"crossers, joiners and leavers" (that is, aircraft crossing, leaving or joining the
major traffic streams), circles around unusual destinations, and so on. (p.
132)
Though serving as evidence of socialization, of familiarity and intimacy,
speaking insider language is not enough. By itself it is not helpful to
certifiers who may be struggling with evaluating a version of air-traffic control
without paper flight strips. Appeals to authenticity ("Look, I was there,
and I understand what the users say") and appeals to future relevance
("Look, this is what you should pay attention to in the future system") can
thus pull in opposite directions: the former toward the context-specific,
which is hardly generalizable, the latter toward abstracted categories of
work that can be mapped onto yet-to-be-fielded future systems and conceptions
of work. The burden to resolve the tension should not be on the
certifier or the designer of the system, it should be on the researcher.
Hughes et al. (1993) agreed that this bridge-building role should be the researcher's:
Ethnography can serve as another bridge between the users and the designers.
In our case, controllers have advised on the design of the display tool with
the ethnographer, as someone knowledgeable about but distanced from the
work, and, on the one hand able to appreciate the significance of the controllers'
remarks for their design implications and, on the other hand, familiar
enough with the design problems to relate them to the controllers' experiences
and comments. (p. 138)
Hostage to the Present, Mute About the Future
Hughes et al.'s (1993) research account actually missed the "significance of
controller remarks for their design implications" (p. 138). No safety implications
were extracted. Instead the researchers used insider language to
forward insider opinions, leaving user statements unpacked and largely
underanalyzed. Ethnography essentially gets confused with what informants
say and consumers of the research are left to pick and choose among
the statements. This is a particularly naive form of ethnography, where what
informants can tell researchers is equated or confused with what strong, analytical
ethnography (and ethnographic argument) could reveal. Hughes
et al. relied on informant statements to the extent they did because of a
common belief that the work their informants did, and the foundational
categories that informed it, are for the most part self-evident, close to
what we would regard as common sense. As such, they require little, if any,
analytic effort to discover. It is an ethnography reduced to a kind of mediated
user show-and-tell for certifiers—not a thorough analysis of the foundational
categories of work. For example, Hughes et al. concluded that
"(flight strips) are an essential feature of 'getting the picture,' 'organising
the traffic,' which is the means of achieving the orderliness of the traffic"
(p. 133).
So flight strips help controllers get the picture. This kind of statement is
obvious to controllers and merely repeats what everyone already knows. If
ethnographic analysis cannot go beyond common sense, it merely privileges
the status quo. As such, it offers certifiers no way out: A system without
flight strips would not be safe, so forget it. There is no way for a certifier to
circumvent the logical conclusion of Hughes et al. (1993): "The importance
of the strip to the controlling process is difficult to overestimate" (p.
133). So is it safe? Going back to Hughes et al.: "For us, such questions were
not easily answerable by reference to work which is as subtle and complex as
our ethnographic analysis had shown controlling to be" (p. 135).
Such surrender to the complexity and intricacy of a particular phenomenon
is consistent with what Dawkins (1986, p. 38) called the "argument
from personal incredulity." When faced with highly complicated machinery
or phenomena, it is easy to take cover behind our own sense of extreme
wonder, and resist efforts at explanation. In the case of Hughes et al.
(1993), it recalls an earlier reservation: "The rich, highly detailed, highly
textured, but nevertheless partial and selective descriptions associated with
ethnography would seem to contribute little to resolving the designers
problem where the objective is to determine what should be designed and
how" (p. 127).
Such justification ("It really is too complex and subtle to communicate
to you") maneuvers the entire ethnographic enterprise out of the certifier's
view as something not particularly helpful. Synthesizing the complexity and
subtlety of a setting should not be the burden of the certifier. Instead, this is
the role of the researcher; it is the essence of strong ethnography. That a
phenomenon is remarkable does not mean it is inexplicable; so if we are
unable to explain it, "we should hesitate to draw any grandiose conclusions
from the fact of our own inability" (Dawkins, 1986, p. 39).
Informant remarks such as "Flight strips help me get the mental picture"
should serve as a starting point for qualitative research, not as its conclusion.
But how can researchers move from native category to analytic sense?
Qualitative work should be hermeneutic and circular in nature: not aiming
for a definitive description of the target system, but rather a continuous reinterpretation
and reproblematization of the successive layers of data
mined from the field. Data demand analysis. Analysis in turn guides the
search for more data, which in turn demand further analysis: Categories are
continually revised to capture the researcher's (and, hand in hand, the
practitioner's) evolving understanding of work. There is a constant interplay
between data, concepts, and theory.
The analysis and revision of categories is a hallmark of strong ethnography,
and Ross's (1995) study of flight-progress strips in Australia serves as
an interesting example. Qualitative in nature, Ross's research relied on surveys
of controllers using flight strips in their current work. Surveys are often
derided by qualitative researchers for imposing the researcher's understanding
of the work onto the data, instead of the other way around
(Hughes et al., 1993). Demonstrating that it is not just the empirical encounter
or rhetorical appeals to authenticity that matter (through large
numbers of experimental probes or months of close observation), the survey
results Ross gathered were analyzed, coded, categorized, recoded, and
recategorized until the inchoate masses of context-specific controller remarks
began to form sensible, generalizable wholes that could meaningfully
speak to certifiers.
Following previous categorizations of flight-strip work (Della Rocco,
Manning, & Wing, 1990), Ross (1995) moved down from these conceptual
descriptions of controller work and up again from the context-specific details,
leaving several layers of intermediate steps. In line with characterizations
of epistemological analysis through abstraction hierarchies (see Xiao
& Vicente, 2000), each step from the bottom up is more abstract than the
previous one; each is cast less in domain-bound terms and more in concept-dependent
terms than the one before. Beyer and Holtzblatt (1998) referred
to this process as induction: reasoning from the particular to the general.
One example from Ross (p. 27) concerns domain-specific controller
activities such as "entering a pilot report; composing a flight plan amendment."
These lower level, context-specific data are of course not without
semantic load themselves: it is always possible to ask further questions and
descend deeper into the world of meanings that these simple, routine activities
have for the people who carry them out. Indeed, we have to ask if we
can only go up from the context-specific level—maintained in human factors
as the most atomistic, basic, low-level data set (see Woods, 1993). In
Ross's data, researchers should still question the common sense behind the
otherwise taken-for-granted entering of a pilot report: What does a pilot report
mean for the controller in a particular context (e.g., weather related),
what does entering this report mean for the controller's ability to manage
other traffic issues in the near future (e.g., avoiding sending aircraft into severe
turbulence)?
While alluding to even more fine-grained details and questions later,
these types of activities also point to an intentional strategy at a higher level
of analysis (Della Rocco et al., 1990): that of the "transformation or translation
of information for entry into the system," which, at an even higher
level of analysis, could be grouped under a label coding, together with
other such strategies (Ross, 1995, p. 27). Part of this coding is symbolic, in
that it uses highly condensed markings on flight strips (underlining, black
circles, strike-throughs) to denote and represent, for controllers, what is going
on. The highly intricate nature of even one flight (where it crosses vs.
where it had planned to cross a sector boundary, what height it will be leaving
when, whether it has yet contacted another frequency, etc.) can be collapsed
or amortized by simple symbolic notation—one line or circle
around a code on the strip that stands for a complex, multidimensional
problematic that other controllers can easily recognize. Unable to keep all
the details of what a flight would do stable in the head, the controller compresses
complexity, or amortizes it, as Hollan, Hutchins, and Kirsh (2000)
would say, by letting one symbol stand for complex concepts and interrelationships,
some even temporal.
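The stepwise character of such coding can be sketched, if only schematically. In the Python fragment below the category labels are borrowed loosely from the text and from Della Rocco et al. (1990), but the chains themselves are hypothetical reconstructions, not Ross's actual coding scheme. The point is merely that each context-specific activity is linked, step by step, to a more concept-dependent description that a designer or certifier could carry to a strip-less system.

# Hypothetical abstraction chains: context-specific activity -> increasingly concept-dependent categories.
abstraction_chains = {
    "entering a pilot report": [
        "transformation or translation of information for entry into the system",
        "coding",
        "using symbolic notation to collapse (amortize) complexity",
    ],
    "composing a flight plan amendment": [
        "transformation or translation of information for entry into the system",
        "coding",
        "using symbolic notation to collapse (amortize) complexity",
    ],
}

# Grouping activities by their most abstract category yields the kind of description of work
# that could be mapped onto a yet-to-be-fielded design.
by_top_category = {}
for activity, chain in abstraction_chains.items():
    by_top_category.setdefault(chain[-1], []).append(activity)
print(by_top_category)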
Similarly, "recognizing a symbol for a handoff' (on a flight strip), though
allowing further unpacking (e.g., what do you mean "recognize"?), is an instance
of a tactic that "transforms or translates information received,"
which in turn represents a larger controller competency of "decoding,"
which in its turn is also part of a strategy to use symbolic notation to collapse
or amortize complexity (Ross, 1995, p. 27). From recognizing a symbol
for a hand-off to the collapsing of complexity, there are four steps, each
more abstract and less in domain terms than the one before. Not only do
these steps allow others to assess the analytical work for its worth, but the
destination of such induction is actually a description of work that can be
used for guiding the evaluation of a future system. Inspired by Ross's analysis,
we can surmise that controllers rely on flight strips for:
• Amortizing or collapsing complexity (what symbolic notation conveys).
• Supporting coordination (who gets which flight strip next from
whom).
• Anticipating dynamics (how much is to come, from where, when, in
what order).
These (no longer so large) jumps to the highest level of abstraction can
now be made—identifying the role the flight strip has in making sense of
workplace and task complexity. Although not so much a leap of faith any
longer (because there are various layers of abstraction in between), the final
step, up to the highest level conceptual description, still appears to hold
a certain amount of creative magic. Ross (1995) revealed little of the mechanisms
that actually drive his analysis. There is no extensive record that
tracks the transformation of survey data into conceptual understandings of
work. Perhaps these transformations are taken for granted too: The mystery
is left unpacked because it is assumed to be no mystery. The very process by
which the researcher manages to migrate from user-language descriptions
of daily activities to conceptual languages less anchored in the present, remains
largely hidden from view. No ethnographic literature guides specifically
the kinds of inferences that can be drawn up to the highest level of
conceptual understanding. At this point, a lot of leeway is given to (and reliance
placed on) the researcher and his or her (keenly) developed insight
into what activities in the field really mean or do for people who carry them
out. The problems of this final step are known and acknowledged in the
qualitative research community. Vaughan (1996) and other sociologists referred
to it as making the macro-micro connection: locating general meaning
systems (e.g., symbolic notation, off-loading) in local contexts (placing
a circle around a set of digits on the flight strip). Geertz (1973) noted how
inferences that try to make the macro-micro connection often resemble
"perfected impressionism" in which "much has been touched but little
grasped" (p. 312). Such inferences tend to be evocative, resting on suggestion
and insinuation more than on analysis (Vaughan, 1996).
In qualitative research, lower levels of analysis or understanding always
underconstrain the inferences that can be drawn further on the way to
higher levels (see Hollan et al., 2000). At each step, alternative interpretations
are possible. Qualitative work does not arrive at a finite description of
the system or phenomenon studied (nor does quantitative research, really).
But qualitative work does not even aim or pretend to do so (Batteau, 2001).
Results are forever open to further interpretation, forever subject to increased
problematization. The main criterion, therefore, to which we
should hold the inferences drawn is not accuracy (Golden-Biddle & Locke,
1993), but plausibility: Does the conceptual description make sense—especially
to the informants, to the people who actually do the work? This also
motivates the continuous, circular nature of qualitative analysis: reinterpreting
results that have been interpreted once already, gradually developing
a theory—a theory of why flight strips help controllers know what is
going on that is anchored in the researcher's continually evolving understanding
of the informants' work and their world.
Closing the Gap to the Future
The three high-level categories of controller (flight-strip) work tell certifiers
that air-traffic controllers have developed strategies for dealing with
the communication of complexity to other controllers, for predicting workload
and planning future work. Flight strips play a central, but not necessarily
exclusive, role. The research account is written up in such a way that the
status quo does not get the prerogative: Tools other than flight strips could
conceivably help controllers deal with complexity, dynamics, and coordination
issues. Complexity and dynamics, as well as coordination, are critical to
what makes air-traffic control what it is, including difficult. Whatever certifiers
will want to brand as safe to use, they would do well to take into account
that controllers use their artifact(s) to help them deal with complexity,
to help them anticipate dynamic futures, and to support their
coordination with other controllers. This resembles a set of human factors
requirements that could provide a certifier with meaningful input.
CERTIFYING UNDER UNCERTAINTY
One role of human factors is to help developers and certifiers judge
whether a technology is safe for future use. But quantitative and qualitative
human factors communities both risk taking the authority of their findings
for granted and regarding the translation to the future, and claims about the
future being either safe or unsafe, as essentially nonproblematic. At the least,
both literatures are relatively silent on this fundamental issue.
Yet neither the legitimacy of findings nor the translation to claims
about the future is in fact easily achieved, or should be taken for granted.
More work needs to be done to produce findings that make sense for those
who have to certify a system as safe to use. Experimental human factors research
can claim empirical legitimacy by virtue of the authority vested in
the laboratory researcher and the control over the method used to get data.
Such research can speak meaningfully to future use because it tests microversions
of a future system. Researchers, however, should explicitly indicate
where the versions of the future they tested are impoverished, and what
subtle effects of context on their experimental settings could produce findings
that diverge from what future users will encounter.
Qualitative research in human factors can claim legitimacy, and relevance
to those who need to certify the next system, because of its authentic
encounters with the field where people actually carry out the work. Validation
emerges from the literature (what others have said about the same and
similar contexts) and from interpretation (how theory and evidence make
sense of this particular context). Such research can speak meaningfully to
certification issues because it allows users to express their preferences,
choices and apprehensions. Qualitative human factors research, however,
must not stop at recording and replaying informant statements. It must
deconfound informant understandings with understandings informed by
concepts, theory, analysis and literature.
Human factors work, of whatever kind, can help bridge the gap from research
findings to future systems. Research accounts need to be both convincing
as science and cast in a language that allows a certifier to look ahead
to the future: looking ahead to work and a co-evolution of people and technology
in a system that does not yet exist.
Chapter 10
Should We Hold People
Accountable for Their Mistakes?
Transportation human factors has made enormous progress over the past
decades. It would be easy to claim that transportation systems have become
safer in part through human factors efforts. As a result of such work, progress
on safety has become synonymous with:
• Taking a systems perspective: Accidents are not caused by failures of individuals,
but emerge from the conflux or alignment of multiple contributory
system factors, each necessary and only jointly sufficient. The
source of accidents is the system, not its component parts.
• Moving beyond blame: Blame focuses on the supposed defects of individual
operators and denies the import of systemic contributions. In
addition, blame has all kinds of negative side effects. It typically leads
to defensive posturing, obfuscation of information, protectionism, polarization,
and mute reporting systems.
Progress on safety coincides with learning from failure. This makes punishment
and learning two mutually exclusive activities: Organizations can
either learn from an accident or punish the individuals involved in it, but
hardly do both at the same time. The reason is that punishment of individuals
can protect false beliefs about basically safe systems, where humans are
the least reliable components. Learning challenges and potentially changes
the belief about what creates safety. Moreover, punishment emphasizes that
failures are deviant, that they do not naturally belong in the organization.
Learning means that failures are seen as normal, as resulting from the inherent
pursuit of success in resource-constrained, uncertain environments.
Punishment turns the culprits into unique and necessary ingredients for
the failure to happen. Punishment, rather than helping people avoid or
better manage conditions that are conducive to error, actually conditions
people not to get caught when errors do occur. This stifles learning. Finally,
punishment is about the search for closure, about moving beyond and away
from the adverse event. Learning is about continuous improvement, about
closely integrating the event in what the system knows about itself.
Making these ideas stick, however, is not proving as easy as it was to develop
them. In the aftermath of several recent accidents and incidents,
the operators involved (pilots or air-traffic controllers in these cases) were
charged with criminal offenses (e.g., professional negligence, manslaughter).
In some accidents even organizational management has been held
criminally liable. Criminal charges differ from civil lawsuits in many respects.
Most obviously, the target is not an organization, but individuals
(air-traffic controllers, flight crew, maintenance technicians). Punishment
consists of possible incarceration or some putatively rehabilitative alternative
—not (just) financial compensation. Unlike organizations covered
against civil suits, few operators or managers themselves have insurance to
pay for legal defense against criminal charges that arise from doing their
jobs.
Some maintain that criminally pursuing operators or managers for erring
on the job is morally unproblematic. The greater good befalls the
greater number of people (i.e., all potential passengers) by protecting
them from unreliable operators. A lot of people win, only a few outcasts
lose. To human factors, however, this may be utilitarianism inverted. Everybody
loses when human error gets criminalized: Upon the threat of criminal
charges, operators stop sending in safety-related information; incident
reporting grinds to a halt. Criminal charges against individual operators
also polarize industrial relations. If the organization wants to limit civil liability,
then official blame on the operator could deflect attention from upstream
organizational issues related to training, management, supervision,
and design decisions. Blaming such organizational issues, in contrast, can
be a powerful ingredient in an individual operator's criminal defense—certainly
when the organization has already rendered the operator expendable
by euphemism (standby, ground duty, administrative leave) and without
legitimate hope of meaningful re-employment. In both cases, industrial
relations are destabilized. Intra-organizational battles become even more
complex when individual managers get criminally pursued; defensive maneuvering
by these managers typically aims to off-load the burden of blame
onto other departments or parts of the organization. This easily leads to
poisonous relations and a crippling of organizational functioning. Finally,
incarceration or alternative punishment of operators or managers has no
demonstrable rehabilitative effect (perhaps because there is nothing to rehabilitate).
It does not make an operator or manager any safer, nor is there
evidence of vicarious learning (learning by example and fear of punishment).
Instead, punishment or its threat merely leads to counterproductive
responses, to people ducking the debris.
The transportation industry itself shows ambiguity with regard to the
criminalization of error. Responding to the 1996 Valujet accident, where
mechanics loaded oxygen generators into the cargo hold of a DC-9, which
subsequently caught fire, the editor of Aviation Week and Space Technology
"strongly believed the failure of SabreTech employees to put caps on oxygen
generators constituted willful negligence that led to the killing of 110
passengers and crew. Prosecutors were right to bring charges. There has to
be some fear that not doing one's job correctly could lead to prosecution"
(North, 2000, p. 66). Rescinding this 2 years later, however, North (2002)
opined that learning from accidents and criminal prosecution go together
like "oil and water, cats and dogs," that "criminal probes do not mix well
with aviation accident inquiries" (p. 70). Most other cases reveal similar instability
with regard to prosecuting operators for error. Culpability in aviation
does not appear to be a fixed notion, connected unequivocally to features
of some incident or accident. Rather, culpability is a highly flexible
category. Culpability is negotiable, subject to national and professional interpretations,
influenced by political imperatives and organizational pressures,
and part of personal or institutional histories.
As psychologists point out, culpability is also about assumptions we make
about the amount of control people had when carrying out their (now)
controversial acts. The problem here is that hindsight deeply confounds
such judgments of control. In hindsight, it may seem obvious that people
had all the necessary data available to them (and thus the potential for control
and safe outcomes). Yet they may have willingly ignored this data in order
to get home faster, or because they were complacent. Retrospect and
the knowledge of outcome deeply affect our ability to judge human performance,
and a reliance on folk models of phenomena like complacency,
situation awareness, and stress does not help. All too quickly we come to the
conclusion that people could have better controlled the outcome of a situation,
if only they had invested a little more effort.
ACCOUNTABILITY
What is accountability, and what does it actually mean to hold people accountable
for their mistakes? Social cognition research shows that accountability
or holding people accountable is not that simple. Accountability is
fundamental to any social relation. There is always an implicit or explicit expectation
that we may be called on to justify our beliefs and actions to others.
The social-functionalist argument for accountability is that this expectation
is mutual: As social beings we are locked into reciprocating relationships.
Accountability, however, is not a unitary concept—even if this is what
many stakeholders may think when aiming to improve people's performance
under the banner of holding them accountable. There are as many
types of accountability as there are distinct relationships among people,
and between people and organizations, and only highly specialized subtypes
of accountability actually compel people to expend more cognitive effort.
Expending greater effort, moreover, does not necessarily mean better
task performance, as operators may become concerned more with limiting
exposure and liability than with performing well (Lerner & Tetlock, 1999),
something that can be observed in the decline of incident reporting with
threats of prosecution (North, 2002). What is more, if accounting is perceived
as illegitimate, for example, intrusive, insulting, or ignorant of real
work, then any beneficial effects of accountability will vanish or backfire. Effects
that have been experimentally demonstrated include a decline in motivation,
excessive stress, and attitude polarization, and the same effects can
be seen in recent cases where pilots and air-traffic controllers were held accountable
by courts and other parties ignorant of the real trade-offs and dilemmas
that make up actual operational work.
The research base on social cognition, then, tells us that accountability,
even if inherent in human relationships, is not unambiguous or unproblematic.
The good side of this is that, if accountability can take many forms,
then alternative, perhaps more productive avenues of holding people accountable
are possible. Giving an account, after all, does not have to mean
exposing oneself to liability, but rather, telling one's story so that others can
learn vicariously. Many sources, even within human factors, point to the
value of storytelling in preparing operators for complex, dynamic situations
in which not everything can be anticipated. Stories are easily remembered,
scenario-based plots with actors, intentions, clues, and outcomes that in
one way or another can be mapped onto current difficult situations and
matched for possible ways out. Incident-reporting systems can capitalize on
this possibility, whereas more incriminating forms of accountability actually
retard this very quality by robbing from people the incentive to tell stories
in the first place.
ANTHROPOLOGICAL UNDERSTANDINGS OF BLAME
The anthropologist is not intrigued by flaws in people's reasoning processes
that produce, for example, the hindsight bias, but wants to know something
about casting blame. Why is blame a meaningful response for those doing
the blaming? Why do we turn error into crime? Mary Douglas (1992) described
how peoples are organized in part by the way in which they explain
misfortune and subsequently pursue retribution or dispense justice. Societies
tend to rely on one dominant model of possible cause from which they
construct a plausible explanation. In the moralistic model, for example,
misfortune is seen as the result of offending ancestors, of sinning, or of
breaking some taboo. The inflated, exaggerated role that procedure violations
(one type of sinning or taboo breaking) are given in retrospective accounts
of failure represents one such use of moralistic models of breakdown
and blame. The moralistic explanation (you broke the rule, then you
had an accident) is followed by a fixed repertoire of obligatory actions that
follow on that choice. If taboos have been broken, then rehabilitation can
be demanded through expiatory actions. Garnering forgiveness through
some purification ritual is one example. Forcing operators to publicly offer
their apologies is a purification ritual seen in the wake of some accidents.
Moreover, the rest of the community is reminded to not sin, to not break
the taboos, lest the same fate befall them. How many reminders are there in
the transportation industry imploring operators to "follow the rules," to
"follow the procedures"? These are moralistic appeals with little demonstrable
effect on practice, but they may make industry participants feel better
about their systems; they may make them feel more in control.
In the extrogenous model, external enemies of the system are to blame
for misfortune, a response that can be observed even today in the demotion
or exile of failed operators: pilots or controllers or technicians. These people
are ex post facto relegated to a kind of underclass that no longer represents
the professional corps. Firing them is one option, and is used relatively
often. But there are more subtle expressions of the extrogenous
model too. The ritualistic expropriation of badges, certificates, stripes, licenses,
uniforms, or other identity and status markings in the wake of an accident
delegitimizes the errant operator as a member of the operational
community. A part of such derogation, of course, is psychological defense
on the part of (former) colleagues who would need to distance themselves
from a realization of equal vulnerability to similar failures. Yet such delegitimization
also makes criminalization easier by beginning the incremental
process of dehumanizing the operator in question. Wilkinson (1994) presented
an excellent example of such demonizing in the consequences that
befell a Boeing 747 pilot after allegedly narrowly missing a hotel at London
Heathrow airport in thick fog. Demonizing there was incremental in the
sense that it made criminal pursuit not only possible in the first place, but
subsequently necessary. It fed on itself: Demons such as this pilot would
need to be punished, demoted, exorcised. The press had a large share in
dramatizing the case, promoting the captain's dehumanization to the point
where his suicide was the only way out.
Failure and Fear
Today, almost every misfortune is followed by questions centering on
"whose fault?" and "what damages, compensation?" Every death must be
chargeable to somebody's account. Such responses approximate the primitives'
resistance to the idea of natural death remarkably well (Douglas,
1992). Death, even today, is not considered natural—it has to arise from
some type of identifiable cause. Such resistance to the notion that deaths
actually can be accidental is obvious in responses to recent mishaps. For example,
Snook (2000) commented on his own disbelief, his struggle, in analyzing
the friendly shoot-down of two U.S. Black Hawk helicopters by U.S.
fighter jets over Northern Iraq in 1994:
This journey played with my emotions. When I first examined the data, I went
in puzzled, angry, and disappointed—puzzled how two highly trained Air
Force pilots could make such a deadly mistake; angry at how an entire crew of
AWACS controllers could sit by and watch a tragedy develop without taking
action; and disappointed at how dysfunctional Task Force OPC must have
been to have not better integrated helicopters into its air operations. Each
time I went in hot and suspicious. Each time I came out sympathetic and unnerved.
... If no one did anything wrong; if there were no unexplainable surprises
at any level of analysis; if nothing was abnormal from a behavioral and
organizational perspective; then what have we learned? (p. 203)
Snook (2000) confronted the question of whether learning, or any kind
of progress on safety, is possible at all if we can find no wrongdoing, no surprises,
if we cannot find some kind of deviance. If everything was normal,
then how could the system fail? Indeed, this must be among the greater
fears that define Western society today. Investigations that do not turn up a
"Eureka part," as the label became in the TWA800 probe, are feared not because
they are bad investigations, but because they are scary. Philosophers
like Nietzsche pointed out that the need for finding a cause is fundamental
to human nature. Not being able to find a cause is profoundly distressing; it
creates anxiety because it implies a loss of control. The desire to find a
cause is driven by fear. So what do we do if there is no Eureka part, no fault
nucleus, no seed of destruction? Is it possible to acknowledge that failure
results from normal people doing business as usual in normal organizations?
Not even many accident investigations succeed at this. As Galison
(2000) noted:
If there is no seed, if the bramble of cause, agency, and procedure does not issue
from a fault nucleus, but is rather unstably perched between scales, between
human and non-human, and between protocol and judgment, then
the world is a more disordered and dangerous place. Accident reports, and
much of the history we write, struggle, incompletely and unstably, to hold that
nightmare at bay. (p. 32)
Galison's (2000) remarks remind us of this fear (this nightmare) of not
being in control over the systems we design, build, and operate. We dread
the possibility that failures emerge from the intertwined complexity of normal
everyday systems interactions. We would rather see failures emanate
from a traceable, controllable single seed or nucleus. In assigning cause, or
in identifying our imagined core of failure, accuracy does not seem to matter.
Being afraid is worse than being wrong. Selecting a scapegoat to carry
the interpretive load of an accident or incident is the easy price we pay for
our illusion that we actually have control over our risky technologies. This
price is the inevitable side effect of the centuries-old pursuit of Baconian
control and technological domination over nature. Sending controllers, or
pilots, or maintenance technicians to jail may be morally wrenching (but
not unequivocally so—remember North, 2000), but it is preferable to its
scary alternative: acknowledging that we do not enjoy control over the risky
technologies we build and consume. The alternative would force us to really
admit that failure is an emergent property, that "mistake, mishap and
disaster are socially organized and systematically produced by social structures,"
that these mistakes are normal, to be expected because they are "embedded
in the banality of organizational life" (Vaughan, 1996, p. xiv). It
would force us to acknowledge the relentless inevitability of mistake in organizations,
to see that harmful outcomes can occur in the organizations
constructed to prevent them, that harmful consequences can occur even
when everybody follows the rules.
Preferring to be wrong over being afraid in the identification of cause
overlaps with the common reflex toward individual responsibility in the
West. Various transportation modes (particularly aviation) have exported
this bias to less individually oriented cultures as well. In the Western intellectual
tradition since the Scientific Revolution, it has seemed self-evident
to evaluate ourselves as individuals, bordered by the limits of our minds and
bodies, and evaluated in terms of our own personal achievements. From
the Renaissance onward, the individual became a central focus, fueled in
part by Descartes' psychology that created "self-contained individuals"
(Heft, 2001). The rugged individualism developed on the back of mass European
immigration into North America in the late 19th and early 20th
centuries accelerated the image of independent, free heroes accomplishing
greatness against all odds, and antiheroes responsible for disproportionate
evildoing (e.g., Al Capone). Lone antiheroes still play the lead roles
in our stories of failure. The notion that it takes teamwork, or an entire organization,
an entire industry (think about Alaska 261) to break a system is
just too eccentric relative to this cultural prejudice.
There are earlier bases for the dominance of individualism in Western
traditions as well. Saint Augustine, the deeply influential moral thinker for
Judeo-Christian societies, saw human suffering as occurring not only because
of individual human fault (Pagels, 1988), but because of human
choice, the conscious, deliberate, rational choice to err. The idea of a rational
choice to err is so pervasive in Western thinking that it goes virtually
unnoticed, unquestioned, because it makes such common sense. The idea,
for example, is that pilots have a choice to take the correct runway but fail
to take it. Instead, they make the wrong choice because of attentional deficiencies
or motivational shortcomings, despite the cues that were available
and the time they had to evaluate those cues. Air-traffic controllers have a
choice to see a looming conflict, but elect to pay no attention to it because
they think their priorities should be elsewhere. After the fact, it often seems
as if people chose to err, despite all available evidence indicating they had it
wrong.
The story of Adam's original sin, and especially what Saint Augustine
made of it, reveals the same space for conscious negotiation that we retrospectively
invoke on behalf of people carrying out safety-critical work in real
conditions. Eve had a deliberative conversation with the snake on whether
to sin or not to sin, on whether to err or not to err. The allegory emphasizes
the same conscious presence of cues and incentives to not err, of warnings
to follow rules and not sin, and yet Adam and Eve elected to err anyway.
The prototypical story of error and violation and its consequences in Judeo-
Christian tradition tells of people who were equipped with the requisite intellect,
who had received the appropriate indoctrination (don't eat that
fruit), who displayed capacity for reflective judgment, and who actually had
the time to choose between a right and a wrong alternative. They then proceeded
to pick the wrong alternative, a choice that would make a big difference
for their lives and the lives of others. It is likely that, rather than causing
the fall into continued error, as Saint Augustine would have it, Adam's
original sin portrays how we think about error, and how we have thought
about it for ages. The idea of free will permeates our moral thinking, and
most probably influences how we look at human performance to this day.
MISMATCH BETWEEN AUTHORITY
AND RESPONSIBILITY
Of course this illusion of free will, though dominant in post hoc analyses of
error, is at odds with the real conditions under which people perform work:
where resource limitations and uncertainty severely constrain the choices
open to them. Van den Hoven (2001) called this "the pressure condition."
Operators such as pilots and air-traffic controllers are "narrowly embedded";
they are "configured in an environment and assigned a place which
will provide them with observational or derived knowledge of relevant facts
and states of affairs" (p. 3). Such environments are exceedingly hostile to
the kind of reflection necessary to meet the regulative ideal of individual
moral responsibility. Yet this is exactly the kind of reflective idyll we read in
the story of Adam and Eve and the kind we retrospectively presume on behalf
of operators in difficult situations that led to a mishap.
Human factors refers to this as an authority-responsibility double bind:
A mismatch occurs between the responsibility expected of people to do the
right thing, and the authority given or available to them to live up to that responsibility.
Society expresses its confidence in operators' responsibility
through payments, status, symbols, and the like. Yet operators' authority
may fall short of that responsibility in many important ways. Operators typically
do not have the degrees of freedom assumed by their professional responsibility
for a variety of reasons: Practice is driven by multiple
goals that may be incompatible (simultaneously having to achieve maximum
capacity utilization, economic aims, customer service, and safety). As
Wilkinson (1994, p. 87) remarked: "A lot of lip service is paid to the myth of
command residing in the cockpit, to the fantasy of the captain as ultimate
decision-maker. But today the commander must first consult with the accountant."
Error, then, must be understood as the result of constraints that
the world imposes on people's goal-directed behavior. As the local rationality
principle dictates, people want to do the right thing, yet features of their
work environment limit their authority to act, limit their ability to live up to
the responsibility for doing the right thing. This moved Claus Jensen
(1996) to say:
there is no longer any point in appealing to the individual worker's own sense
of responsibility, morality or decency, when almost all of us are working
within extremely large and complex systems . . . According to this perspective,
there is no point in expecting or demanding individual engineers or managers
to be moral heroes; far better to put all of one's efforts into reinforcing
safety procedures and creating structures and processes conducive to ethical
behavior. (p. xiii)
Individual authority, in other words, is constrained to the point where
moral appeals to individual responsibility are becoming useless. And authority
is not only restricted because of the larger structures that people are
only small parts of. Authority to assess, decide, and act can be limited
simply because of the nature of the situation. Time and other resources for
making sense of a situation are lacking; information may not be at hand or
may be ambiguous; there may be all kinds of subtle organizational pressures
to prefer certain actions over others; and there may be no neutral or
additional expertise to draw on. Even Eve was initially alone with the snake.
Where was Adam, the only other human available in paradise during those
critical moments of seduction into error? Only recent additions to the human
factors literature (e.g., naturalistic decision making, ecological task
analyses) explicitly took these and other constraints on people's practice
into consideration in the design and understanding of work. Free will is a
logical impossibility in cases where there is a mismatch between responsibility
and authority, which is to say that free will is always a logical impossibility
in real settings where real safety-critical work is carried out.
This should invert the culpability criterion when operators or others are
being held accountable for their errors. Today it is typically the defendant
who has to explain that he or she was constrained in ways that did not allow
adequate control over the situation. But such defenses are often hopeless.
Outside observers are influenced by hindsight when they look back on
available data and choice moments. As a consequence, they consistently
overestimate both the clarity of the situation and the ability to control the
outcome. So rather than the defendant having to show that insufficient
data and control made the outcome inevitable, it should be up to the claimants,
or prosecution, to prove that adequate control was in fact available.
Did people have enough authority to live up to their responsibility?
Such a proposal, however, amounts to only a marginal adjustment of what
may still be dysfunctional and counterproductive accountability relationships.
What different models of responsibility could possibly replace current
accountability relationships, and do they have any chance? In the adversarial
confrontations and defensive posturing that the criminalization of error generates
today, truth becomes fragmented across multiple versions that advocate
particular agendas (staying out of jail, limiting corporate liability). This
makes learning from the mishap almost impossible. Even making safety improvements
in the wake of an accident can get construed as an admission of
liability. This robs systems of their most concrete demonstration that they
have learned something from the mishap: an actual implementation of lessons
learned. Indeed, lessons are not learned before organizations have actually
made the changes that those lessons prescribe.
BLAME-FREE CULTURES?
Ideally, there should be accountability without invoking defense mechanisms.
Blame-free cultures, for example, though free from blame and associated
protective plotting, are not without member responsibility. But blame-free
cultures are extremely rare. Examples have been found among Sherpas
in Nepal (Douglas, 1992), who pressure each other to settle quarrels peacefully
and reduce rivalries with strong informal procedures for reconciliation.
Laying blame accurately is considered much less important than a generous
treatment of the victim. Sherpas irrigate their social system with a lavish flow
of gifts, taxing themselves collectively to ensure nobody goes neglected, and
victims are not left exposed to impoverishment or discrimination (Douglas).
This mirrors the propensity of Scandinavian cultures for collective taxation
to support dense webs of social security. Prosecution of individuals or especially
civil lawsuits in the wake of accidents are rare. U.S. responses stand in
stark contrast (although criminal prosecution of operators is rare there). Despite
a plenitude of litigation (which inflates and occasionally exceeds the
compensatory expectations of a few), victims as a group are typically
undercompensated. Blame-free cultures may hinge more on consistently
generous treatment of victims than on denying that professional accountability
exists. They also hinge on finding other expressions of responsibility, of
what it means to be a responsible member of that culture.
Holding people accountable can be consistent with being blame-free if
transportation industries think in novel ways about accountability. This
would involve innovations in relationships among the various stakeholders.
Indeed, in order to continue making progress on safety, transportation industries
should reconsider and reconstruct accountability relationships
among their stakeholders (organizations, regulators, litigators, operators,
passengers). In a new form of accountability relationships, operators or
managers involved in mishaps could be held accountable by inviting them
to tell their story (their account). Such accounts can then be systematized
and distributed, and used to propagate vicarious learning for all. Microversions
of such accountability relationships have been implemented in
many incident-reporting systems, and perhaps their examples could move
industries in the direction of as yet elusive blame-free cultures.
The odds, however, may be stacked against attempts to make such progress.
The Judeo-Christian ethic of individual responsibility is not just animated
by a basic Nietzschean anxiety of losing control. Macrostructural
forces are probably at work too. There is evidence that episodes of renewed
enlightenment, such as the Scientific Revolution, are accompanied by violent
regressions toward supernaturalism and witch hunting. Prima facie,
this would be an inconsistency. How can an increasingly illuminated society
simultaneously regress into superstition and scapegoating? One answer may
lie in the uncertainties and anxieties brought on by the technological advances
and depersonalization that inevitably seem to come with such progress.
New, large, complex, and widely extended technological systems (e.g.,
global aviation that took just a few decades to expand into what it is today)
create displacement, diffusion, and causal uncertainty. A reliance on individual
culpability may be the only sure way of recapturing an illusion of control.
In contrast, less technologically or industrially developed societies
(take the Sherpas as example again) appear to rely on more benign models
of failure and blame, and more on collective responsibility.
In addition, those who do safety-critical work often tie culpability conventions
to aspects of their personal biographies. Physician Atul Gawande
(2002, p. 73), for example, commented on a recent surgical incident and
observed that terms such as systems problems are part of a "dry language of
structures, not people . . . something in me, too, demands an acknowledgement
of my autonomy, which is also to say my ultimate culpability ... although
the odds were against me, it wasn't as if I had no chance of succeeding.
Good doctoring is all about making the most of the hand you're dealt,
and I failed to do so."
The expectation of being held accountable if things go wrong (and, conversely,
being responsible if things go right) appears intricately connected
to issues of self-identity, where accountability is the other side of professional
autonomy and a desire for control. This expectation can engender
considerable pride and can make even routine operational work deeply
meaningful. But although good doctoring (or any kind of practice) may be
making the most of the hand one is dealt, human factors has always been
about providing that hand more and better opportunities to do the right
thing. Merely leaving the hand with what it is dealt and banking on personal
motivation to do the rest takes us back to prehistoric times, when behaviorism
reigned and human factors had yet to make its entry in system
safety thinking.
Accountability and culpability are deeply complex concepts. Disentangling
their prerational influences in order to promote systems thinking,
and to create an objectively fairer, blame-free culture, may be an uphill
struggle. They are, in any case, topics worthy of more research.
References
Aeronautica Civil. (1996). Aircraft accident report: Controlled flight into terrain, American Airlines
flight 965, Boeing 757-223, N651AA near Cali, Colombia, December 20, 1995. Bogota, Colombia:
Author.
Airliner World. (2001, November). Excel, pp. 77-80.
Air Transport Association of America. (1989, April). National plan to enhance aviation safety
through human factors improvements. Washington, DC: Author.
Albright, C. A., Truitt, T. R., Barile, A. B., Vortac, O. U., & Manning, C. A. (1996). How controllers
compensate for the lack of flight progress strips (Final Rep. No. DOT/FAA/AM-96/5).
Arlington, VA: National Technical Information Service.
Amalberti, R. (2001). The paradoxes of almost totally safe transportation systems. Safety Science,
37, 109-126.
Angell, I. O., & Straub, B. (1999). Rain-dancing with pseudo-science. Cognition, Technology and
Work, 1, 179-196.
Baiada, R. M. (1995). ATC biggest drag on airline productivity. Aviation Week and Space Technology,
31, 51-53.
Bainbridge, L. (1987). Ironies of automation. In J. Rasmussen, K. Duncan, & J. Leplat (Eds.),
New technology and human error (pp. 271-283). Chichester, England: Wiley.
Batteau, A. W. (2001). The anthropology of aviation and flight safety. Human Organization,
60(3), 201-210.
Beyer, H., & Holtzblatt, K. (1998). Contextual design: Defining customer-centered systems. San
Diego, CA: Academic Press.
Billings, C. E. (1996). Situation awareness measurement and analysis: A commentary. In D. J.
Garland & M. R. Endsley (Eds.), Experimental analysis and measurement of situation awareness
(pp. 1-5). Daytona Beach, FL: Embry-Riddle Aeronautical University Press.
Billings, C. E. (1997). Aviation automation: The search for a human-centered approach. Mahwah, NJ:
Lawrence Erlbaum Associates.
Bjorklund, C., Alfredsson, J., & Dekker, S. W. A. (2003). Shared mode awareness of the FMA in
commercial aviation: An eye-point of gaze and communication data analysis in a high-fidelity
simulator. In E. Hollnagel (Ed.), Proceedings of EAM 2003, The 22nd European Conference
on Human Decision Making and Manual Control (pp. 119-126). Linkoping, Sweden: Cognitive
Systems Engineering Laboratory, Linkoping University.
Boeing Commercial Airplane Group. (1996). Boeing submission to the American Airlines Flight 965
Accident Investigation Board. Seattle, WA: Author.
Bruner, J. (1990). Acts of meaning. Cambridge, MA: Harvard University Press.
Campbell, R. D., & Bagshaw, M. (1991). Human performance and limitations in aviation. Oxford,
England: Blackwell Science.
Capra, F. (1982). The turning point. New York: Simon & Schuster.
Carley, W. M. (1999, January 21). Swissair pilots differed on how to avoid crash. The Wall Street
Journal.
Columbia Accident Investigation Board. (2003). Report Volume 1, August 2003. Washington,
DC: U.S. Government Printing Office.
Cordesman, A. H., & Wagner, A. R. (1996). The lessons of modern war: Vol. 4. The Gulf War. Boulder,
CO: Westview Press.
Croft, J. (2001, July 16). Researchers perfect new ways to monitor pilot performance. Aviation
Week and Space Technology, pp. 76-77.
Dawkins, R. (1986). The blind watchmaker. London: Penguin.
Degani, A., Heymann, M., & Shafto, M. (1999). Formal aspects of procedures: The problem of
sequential correctness. In Proceedings of the 43rd Annual Meeting of the Human Factors and Ergonomics
Society. Houston, TX: Human Factors Society.
Dekker, S. W. A. (2002). The field guide to human error investigations. Bedford, England:
Cranfield University Press.
Dekker, S. W. A., & Woods, D. D. (1999). To intervene or not to intervene: The dilemma of
management by exception. Cognition, Technology and Work, 1, 86-96.
Delia Rocco, P. S., Manning, C. A., & Wing, H. (1990). Selection of air traffic controllers for automated
systems: Applications from current research (DOT/FAA/AM-90/13). Arlington, VA: National
Technical Information Service.
Dorner, D. (1989). The logic of failure: Recognizing and avoiding error in complex situations. Cambridge,
MA: Perseus Books.
Douglas, M. (1992). Risk and blame: Essays in cultural theory. London: Routledge.
Endsley, M. R., Mogford, M., Allendoerfer, K., & Stein, E. (1997). Effect of free flight conditions on
controller performance, workload and situation awareness: A preliminary investigation of changes in
locus of control using existing technologies. Lubbock, TX: Texas Technical University.
Feyerabend, P. (1993). Against method (3rd ed.). London: Verso.
Feynman, R. P. (1988). "What do you care what other people think?": Further adventures of a curious
character. New York: Norton.
Fischoff, B. (1975). Hindsight is not foresight: The effect of outcome knowledge on judgement
under uncertainty. Journal of Experimental Psychology: Human Perception and Performance,
1(3), 288-299.
Fitts, P. M. (1951). Human engineering for an effective air navigation and traffic control system. Washington,
DC: National Research Council.
Fitts, P. M., & Jones, R. E. (1947). Analysis of factors contributing to 460 "pilot error" experiences in
operating aircraft controls (Memorandum Rep. No. TSEAA-694-12). Dayton, OH: Aero Medical
Laboratory, Air Material Command, Wright-Patterson Air Force Base.
Flores, F., Graves, M., Hartfield, B., & Winograd, T. (1988). Computer systems and the design
of organizational interaction. ACM Transactions on Office Information Systems, 6, 153-172.
Galison, P. (2000). An accident of history. In P. Galison & A. Roland (Eds.), Atmospheric flight in
the twentieth century (pp. 3-44). Dordrecht, The Netherlands: Kluwer Academic.
Galster, S. M., Duley, J. A., Masolanis, A. J., & Parasuraman, R. (1999). Effects of aircraft self-separation
on controller conflict detection and workload in mature Free Flight. In M. W.
Scerbo & M. Mouloua (Eds.), Automation technology and human performance: Current research
and trends (pp. 96-101). Mahwah, NJ: Lawrence Erlbaum Associates.
Gawande, A. (2002). Complications: A surgeon's notes on an imperfect science. New York: Picador.
Geertz, C. (1973). The interpretation of cultures. New York: Basic Books.
Golden-Biddle, K., & Locke, K. (1993). Appealing work: An investigation of how ethnographic
texts convince. Organization Science, 4, 595-616.
Heft, H. (2001). Ecological psychology in context: James Gibson, Roger Barker, and the legacy of William
James's radical empiricism. Mahwah, NJ: Lawrence Erlbaum Associates.
Helmreich, R. L. (2000). On error management: Lessons from aviation. British Medical Journal,
320, 745-753.
Helmreich, R. L., Klinect, J. R., & Wilhelm, J. A. (1999). Models of threat, error and response
in flight operations. In R. S. Jensen (Ed.), Proceedings of the 10th International Symposium on
Aviation Psychology. Columbus: The Ohio State University.
Hollan, J., Hutchins, E., & Kirsh, D. (2000). Distributed cognition: Toward a new foundation
for human-computer interaction research. ACM Transactions on Computer-Human Interaction,
7(2), 174-196.
Hollnagel, E. (1999). From function allocation to function congruence. In S. W. A. Dekker &
E. Hollnagel (Eds.), Coping with computers in the cockpit (pp. 29-53). Aldershot, England:
Ashgate.
Hollnagel, E. (Ed.). (2003). Handbook of cognitive task design. Mahwah, NJ: Lawrence Erlbaum
Associates.
Hollnagel, E., & Amalberti, R. (2001). The emperor's new clothes: Or whatever happened to
"human error"? In S. W. A. Dekker (Ed.), Proceedings of the 4th International Workshop on Human
Error, Safety and Systems Development (pp. 1-18). Linkoping, Sweden: Linkoping University.
Hollnagel, E., & Woods, D. D. (1983). Cognitive systems engineering: New wine in new bottles.
International Journal of Man-Machine Studies, 18, 583-600.
Hughes, J. A., Randall, D., & Shapiro, D. (1993). From ethnographic record to system design:
Some experiences from the field. Computer Supported Collaborative Work, 1, 123-141.
International Civil Aviation Organization. (1998). Human factors training manual (ICAO Doc.
No. 9683-AN/950). Montreal, Quebec: Author.
Jensen, C. (1996). No downlink: A dramatic narrative about the Challenger accident and our time.
New York: Farrar, Strauss, Giroux.
Joint Aviation Authorities. (2001). Human factors in maintenance working group report.
Hoofddorp, The Netherlands: Author.
Joint Aviation Authorities. (2003). Advisory Circular Joint ACJ 25.1329: Flight guidance system, Attachment
1 to NPA (Notice of Proposed Amendment) 25F-344. Hoofddorp, The Netherlands: Author.
Kern, T. (1998). Flight discipline. New York: McGraw-Hill.
Klein, G. A. (1998). Sources of power: How people make decisions. Cambridge, MA: MIT Press.
Kohn, L. T., Corrigan, J. M., & Donaldson, M. (Eds.). (1999). To err is human: Building a safer
health system. Washington, DC: Institute of Medicine.
Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.
Langewiesche, W. (1998). Inside the sky: A meditation on flight. New York: Pantheon Books.
Lanir, Z. (1986). Fundamental surprise. Eugene, OR: Decision Research.
Lautman, L., & Gallimore, P. L. (1987). Control of the crew caused accident: Results of a 12operator
survey. Boeing Airliner, April-June, 1-6.
Lerner, J. S., & Tetlock, P. E. (1999). Accounting for the effects of accountability. Psychological
Bulletin, 125, 255-275.
Leveson, N. (2002). A new approach to system safety engineering. Cambridge, MA: Aeronautics and
Astronautics, Massachusetts Institute of Technology.
Mackay, W. E. (2000). Is paper safer? The role of paper flight strips in air traffic control. ACM
Transactions on Computer-Human Interaction, 6, 311-340.
McDonald, N., Corrigan, S., & Ward, M. (2002, June). Well-intentioned people in dysfunctional systems.
Keynote paper presented at the 5th Workshop on Human Error, Safety and Systems
Development, Newcastle, Australia.
Meister, D. (2003). The editor's comments. Human Factors Ergonomics Society COTG Digest, 5,
2-6.
Metzger, U., & Parasuraman, R. (1999). Free Flight and the air traffic controller: Active control
versus passive monitoring. In Proceedings of the Human Factors and Ergonomics Society 43rd
annual meeting. Houston, TX: Human Factors Society.
Mumaw, R. J., Sarter, N. B., & Wickens, C. D. (2001). Analysis of pilots' monitoring and performance
on an automated flight deck. In Proceedings of 11th International Symposium in Aviation
Psychology. Columbus: Ohio State University.
National Aeronautics and Space Administration. (2000, March). Report on project management in
NASA, by the Mars Climate Orbiter Mishap Investigation Board. Washington, DC: Author.
National Transportation Safety Board. (1974). Delta Air Lines Douglas DC-9-31, Boston, MA,
7/31/73 (NTSB Rep. No. AAR-74/03). Washington, DC: Author.
National Transportation Safety Board. (1995). Aircraft accident report: Flight into terrain during
missed approach, USAir flight 1016, DC-9-31, N954VJ, Charlotte Douglas International Airport,
Charlotte, North Carolina, July 2, 1994 (NTSB Rep. No. AAR-95/03). Washington, DC: Author.
National Transportation Safety Board. (1997). Grounding of the Panamanian passenger ship Royal
Majesty on Rose and Crown shoal near Nantucket, Massachusetts, June 10, 1995 (NTSB Rep. No.
MAR-97/01). Washington, DC: Author.
National Transportation Safety Board. (2002). Loss of control and impact with Pacific Ocean,
Alaska Airlines Flight 261 McDonnell Douglas MD-83, N963AS, about 2.7 miles north of Anacapa
Island, California, January 31, 2000 (NTSB Rep. No. AAR-02/01). Washington, DC: Author.
Neisser, U. (1976). Cognition and reality: Principles and implications of cognitive psychology. San
Francisco: Freeman Press.
North, D. M. (2000, May 15). Let judicial system run its course in crash cases. Aviation Week and
Space Technology, p. 66.
North, D. M. (2002, February 4). Oil and water, cats and dogs. Aviation Week and Space Technology,
p. 70.
O'Hare, D., & Roscoe, S. (1990). Flightdeck performance: The human factor. Ames: Iowa State University
Press.
Orasanu, J. M. (2001). The role of risk assessment in flight safety: Strategies for enhancing pilot
decision making. In Proceedings of the 4th International Workshop on Human Error, Safety and
Systems Development (pp. 83-94). Linkoping, Sweden: Linkoping University.
Orasanu, J. M., & Connolly, T. (1993). The reinvention of decision making. In G. A. Klein, J.
Orasanu, R. Calderwood, & C. E. Zsambok (Eds.), Decision making in action: Models and methods
(pp. 3-20). Norwood, NJ: Ablex.
Pagels, E. (1988). Adam, Eve and the serpent. London: Weidenfeld & Nicolson.
Parasuraman, R., Molly, R., & Singh, I. (1993). Performance consequences of automation-induced
complacency. The International Journal of Aviation Psychology, 3(1), 1-23.
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human
interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics—Part A:
Systems and Humans, 30, 286-297.
Perrow, C. (1984). Normal accidents: Living with high-risk technologies. New York: Basic Books.
Rasmussen, J., & Svedung, I. (2000). Proactive risk management in a dynamic society. Karlstad, Sweden:
Swedish Rescue Services Agency.
Reason, J. T. (1990). Human error. Cambridge, England: Cambridge University Press.
Reason, J. T., & Hobbs, A. (2003). Managing maintenance error: A practical guide. Aldershot, England:
Ashgate.
Rochlin, G. I. (1999). Safe operation as a social construct. Ergonomics, 42, 1549-1560.
Rochlin, G. I., LaPorte, T. R., & Roberts, K. H. (1987). The self-designing high-reliability organization:
Aircraft carrier flight operations at sea. Naval War College Review, Autumn 1987.
Ross, G. (1995). Flight strip survey report. Canberra, Australia: TAAATS TOI.
Sacks, O. (1998). The man who mistook his wife for a hat. New York: Touchstone.
Sanders, M. S., & McCormick, E. J. (1997). Human factors in engineering and design (7th ed.).
New York: McGraw-Hill.
Sarter, N. B., & Woods, D. D. (1997). Teamplay with a powerful and independent agent: A corpus
of operational experiences and automation surprises on the Airbus A320. Human Factors,
39, 553-569.
Shappell, S. A., & Wiegmann, D. A. (2001). Applying reason: The human factors analysis and
classification system (HFACS). Human Factors and Aerospace Safety, 1, 59-86.
Singer, G., & Dekker, S. W. A. (2000). Pilot performance during multiple failures: An empirical
study of different warning systems. Journal of Transportation Human Factors, 2, 63-76.
Smith, K. (2001). Incompatible goals, uncertain information and conflicting incentives: The
dispatch dilemma. Human Factors and Aerospace Safety, 1, 361-380.
Snook, S. A. (2000). Friendly fire: The accidental shootdown of US Black Hawks over Northern Iraq.
Princeton, NJ: Princeton University Press.
Starbuck, W. H., & Milliken, F. J. (1988). Challenger: Fine-tuning the odds until something
breaks. Journal of Management Studies, 25, 319-340.
Statens Haverikommision [Swedish Accident Investigation Board]. (2000). Tillbud vid
landning med flygplanet LN-RLF den 23/6 på Växjö/Kronoberg flygplats, G län (Rapport RL
2000:38) [Incident during landing with aircraft LN-RLF on June 23 at Vaxjo/Kronoberg
airport]. Stockholm, Sweden: Author.
Statens Haverikommision [Swedish Accident Investigation Board]. (2003). Tillbud mellan
flygplanet LN-RPL och en bogsertraktor på Stockholm/Arlanda flygplats, AB län, den 27 oktober
2002 (Rapport RL 2003:47) [Incident between aircraft LN-RPL and a tow-truck at Stockholm/
Arlanda airport, October 27, 2002]. Stockholm, Sweden: Author.
Suchman, L. A. (1987). Plans and situated actions: The problem of human-machine communication.
Cambridge, England: Cambridge University Press.
Tuchman, B. W. (1981). Practicing history: Selected essays. New York: Norton.
Turner, B. (1978). Man-made disasters. London: Wykeham.
Varela, F. J., Thompson, E., & Rosch, E. (1991). The embodied mind: Cognitive science and human
experience. Cambridge, MA: MIT Press.
Vaughan, D. (1996). The Challenger launch decision: Risky technology, culture and deviance at NASA.
Chicago: University of Chicago Press.
Vaughan, D. (1999). The dark side of organizations: Mistake, misconduct, and disaster. Annual
Review of Sociology, 25, 271-305.
van den Hoven, M. J. (2001). Moral responsibility and information technology. Rotterdam, The
Netherlands: Erasmus University Center for Philosophy of ICT.
Vicente, K. (1999). Cognitive work analysis: Toward safe, productive, and healthy computer-based
work. Mahwah, NJ: Lawrence Erlbaum Associates.
Weick, K. E. (1993). The collapse of sensemaking in organizations. Administrative Science Quarterly,
38, 628-652.
Weick, K. E. (1995). Sensemaking in organizations. London: Sage.
Weingart, P. (1991). Large technical systems, real life experiments, and the legitimation trap
of technology assessment: The contribution of science and technology to constituting risk
perception. In T. R. LaPorte (Ed.), Social responses to large technical systems: Control or anticipation
(pp. 8-9). Amsterdam: Kluwer.
Wiener, E. L. (1988). Cockpit automation. In E. L. Wiener & D. C. Nagel (Eds.), Human factors
in aviation (pp. 433-462). San Diego, CA: Academic Press.
Wilkinson, S. (1994, February-March). The Oscar November incident. Air & Space, 80-87.
Woods, D. D. (1993). Process-tracing methods for the study of cognition outside of the experimental
laboratory. In G. A. Klein, J. Orasanu, R. Calderwood, & C. E. Zsambok (Eds.), Decision
making in action: Models and methods (pp. 228-251). Norwood, NJ: Ablex.
Woods, D. D. (2003, October 29). Creating foresight: How resilience engineering can transform
NASA's approach to risky decision making. Hearing before the U.S. Senate Committee on
Commerce, Science and Transportation, John McCain, chair, Washington, DC.
Woods, D. D., & Dekker, S. W. A. (2001). Anticipating the effects of technology change: A new
era of dynamics for Human Factors. Theoretical Issues in Ergonomics Science, 1, 272-282.
Woods, D. D., Johannesen, L. J., Cook, R. I., & Sarter, N. B. (1994). Behind human error: Cognitive
systems, computers and hindsight. Dayton, OH: CSERIAC.
Woods, D. D., Patterson, E. S., & Roth, E. M. (2002). Can we ever escape from data overload? A
cognitive systems diagnosis. Cognition, Technology, and Work, 4, 22-36.
Woods, D. D., & Shattuck, L. G. (2000). Distant supervision: Local action given the potential
for surprise. Cognition Technology and Work, 2(4), 242-245.
Wright, P. C., & McCarthy, J. (2003). Analysis of procedure following as concerned work. In E.
Hollnagel (Ed.), Handbook of cognitive task design (pp. 679-700). Mahwah, NJ: Lawrence
Erlbaum Associates.
Wynne, B. (1988). Unruly technology: Practical rules, impractical discourses, and public understanding.
Social Studies of Sciences, 18, 147-167.
Xiao, Y., & Vicente, K. J. (2000). A framework for epistemological analysis in empirical (laboratory
and field) studies. Human Factors, 42, 87-101.
Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habitformation.
Journal of Comparative and Neurological Psychology, 18, 459—482.
Author Index
A
Aeronautica Civil, 68, 69, 72, 129, 131
Airliner World, 146
Air Transport Association of America (ATA), 151
Albright, C. A., 174, 178-184
Alfredsson, J., 158, 159, 160, 161
Allendorfer, K., 165
Amalberti, R., 17, 28, 52, 53, 54, 61
Angell, I. O., 50, 60, 61, 85
B
Bagshaw, M., 127
Baiada, R. M., 167
Bainbridge, L., 162, 165
Barile, A. B., 174, 178-184
Batteau, A. W., 179, 190
Beyer, H., 188
Billings, C. E., 125, 155
Bjorklund, C., 158, 159, 160, 161
Boeing Commercial Airplane Group, 68
Bruner, J., 105, 106
C
Campbell, R. D., 127
Capra, F., 10, 29, 51
Carley, W. M., 136, 139
Columbia Accident Investigation Board (CAIB), 40, 41, 42
Connolly, T., 79
Cordesman, A. H., 163
Corrigan, J. M., 46
Corrigan, S., 132, 135, 137, 141, 143, 147, 148
Croft, J., 47, 50, 53, 59
D
Dawkins, R., 187, 188
Degani, A., 11
Dekker, S. W. A., 14, 87, 154, 155, 158, 159, 160, 161, 163, 165, 166, 167
Della Rocco, P. S., 188, 189
Dodson, J. D., 130
Donaldson, M., 46
Dorner, D., 143-144, 146
Douglas, M., 197, 198, 202-203
Duley, J. A., 165
E
Endsley, M. R., 165
F
Feyerabend, P., 54-55, 56, 58, 59, 65
Feynman, R. P., 35, 39, 41
Fischoff, B., 68, 82
Fitts, P. M., 161
Flores, F., 163
G
Galison, P., 198-199
Gallimore, P. L., 132, 140
Galster, S. M., 165
Gawande, A., 204
Geertz, C., 190
Golden-Riddle, K., 180, 184, 190
Graves, M., 163
H
Hartfield, B., 163
Heft, H., 34, 108, 110, 111, 199
Helmreich, R. L., 49
Heymann, M., 11
Hobbs, A., 58
Hollan, J., 189, 190
Hollnagel, E., 52, 53, 54, 61, 157, 161, 162
Holtzblatt, K., 188
Hughes, J. A., 174, 185, 186, 187, 188
Hutchins, E., 189, 190
I
International Civil Aviation Organization
(ICAO), 54
J
Jensen, C., 36, 201
Joint Aviation Authorities, 134, 160
K
Kern, T., 125, 127
Kirsh, D., 189, 190
Klein, G. A., 79, 169
Kohn, L. T., 46
Kuhn, T. S., 47, 50, 56
L
Langewiesche, W., 26
Lanir, Z., 86
LaPorte, T. R., 135
Lautman, L., 132, 140
Lerner,J. S., 196
Leveson, N., 4, 16, 24, 34, 35, 36
Locke, K., 180, 184, 190
M
Mackay, W. E., 183
Manning, C. A., 174, 178-184, 188, 189
Masolanis, A. J., 165
McCarthy, J., 136, 138, 150
McCormick, E. J., 175
McDonald, N., 132, 135, 137, 141, 143,
147, 148
Metzger, U., 165
Milliken, F.J., 25, 36, 81, 144, 149
Mogford, M., 165
Molloy, R., 127
Mumaw, R.J., 160
N
National Aeronautics and Space Administration
(NASA), 144
National Transportation Safety Board
(NTSB), 22, 25, 32, 38, 44, 70, 72,
91, 118, 130, 136, 137, 173
Neisser, U., 79, 111
North, D. M., 195, 196, 199
O
O'Hare, D., 127
Orasanu,J. M., 61, 79
P
Pagels, E., 200
Parasuraman, R., 127, 161, 162, 165
Patterson, E. S., 108, 152, 156
Perrow, C., 14, 24, 36, 88
R
Randall, D., 174, 185, 186, 187, 188
Rasmussen,J., 24, 28, 38
Reason, J. T., 58, 68, 77
Roberts, K. H., 135
Rochlin, G. I., 61, 62, 63, 135, 141
Rosch, E., 51
Roscoe, S., 127
Ross, G., 188, 189, 190
Roth, E. M., 108, 152, 156
S
Sacks, O., 87
Sanders, M. S., 175
Sarter, N. B., 160, 169
Shafto, M., 11
Shapiro, D., 174, 185, 186, 187, 188
Shappell, S. A., 49
Shattuck, L. G., 140
Sheridan, T. B., 161, 162
Singer, G., 154, 155
Singh, L, 127
Smith, K., 144
Snook, S. A., 10, 24, 60, 80-81, 83, 84, 133,
149, 198
Starbuck, W. H., 25, 36, 81, 144, 149
Statens Haverikommision, 6, 96, 97, 98
Stein, E., 165
Straub, B., 50, 60, 61, 85
Suchman, L. A., 136, 138
Svedung, I., 24, 28, 38
T
Tetlock, P. E., 196
Thompson, E., 51
Truitt, T. R., 174, 178-184
Tuchman, B. W., 74
Turner, B., 23
V
Van den Hoven, M. J., 41, 200
Varela, F. J., 51
Vaughan, D., 24, 30-31, 36, 39, 44, 54, 75,
79, 86, 147, 148, 149, 179, 190, 199
Vicente, K.J., 75, 157, 176, 180, 186, 188
Vortac, O. U., 174, 178-184
W
Wagner, A. R., 163
Ward, M., 132, 135, 137, 141, 143, 147, 148
Weick, K. E., 24, 26, 37, 68, 89, 94, 110,
111, 121-122, 136
Weingart, P., 33
Wickens, C. D., 160, 161, 162
Wiegmann, D. A., 49
Wiener, E. L., 126
Wilkinson, S., 197, 201
Wing, H., 188, 189
Winograd, T., 163
Woods, D. D., 23, 62, 63, 86, 108, 140, 152,
156, 162, 163, 165, 166, 167, 169,
176, 186, 189
Wright, P. C., 136, 138, 150
Wynne, B., 33
X
Xiao, Y., 176, 180, 186, 188
Y
Yerkes, R. M., 130
Subject Index
Note: Page numbers in italics refer to figures.
A
Accident rates
in doctors versus gun-owners, 46
in transportation systems, 17
Accountability, 195-196, 202-204
Adaptation, to automation, 157-161,
163-164, 172
Airbus test flight crash, 155-156
Aircraft accidents and incidents
Airbus test flight crash, 155-156
Alaska 261, 18-24, 32-33, 37, 136-137
Cali, Colombia, crash, 68-70
Halifax crash, 139-140
Heathrow airport incident, 197
helicopter crashes, post Gulf War, 83-84,
198
Logan Airport crash, 129-130
runway overrun, 5—7, 10-14
taxiway incursion, 95-99, 96, 97, 98
TWA 800, 2-3
Valujet, 63-64, 195
Washington National crash, 131
Aircraft maintenance
goal conflicts in, 148
procedure versus practice in, 134—138
Air-traffic control
error study in, 52-54
exception management in, 165-168
flight-progress strips in
certifying, 171-172, 172, 174
studies of, 178-191
Air Transportation Oversight System
(ATOS), 24-25
Alaska 261 crash
drift into failure in, 18-24, 21, 23
hindsight in, 37
procedure versus practice in, 136-137
production and operation gap in, 32-33
Argument from personal incredulity, 187
Association, 100
ATOS, 24-25
Authority-responsibility double bind,
200-202
Automation. See also Technology
adaptation to, 157-161, 163-164, 172
data overload in, 152-157
as error solution, 151-152
exception management and, 164-168
strengths and weaknesses in, 161—163
as team player, 169-170
Autonomy principle, 55, 94
B
Banality-of-accidents thesis, 24-25, 27-28,
30-31
Behaviorism, 104—105
Belief, 36, 145
Berkeley, George, 101-102
Black Hawk helicopter crashes, 83-84, 198
Blame, 196-200
Blame-free cultures, 202-204
Bulletized presentations, 38-43
C
Cali, Colombia crash, 68-70
Choice, in error, 200, 202
Cockpit voice recording, 73-74
Cognitive revolution, 105
Common-cause hypothesis, 28
Competition, 24-25, 42-43
Complacency, as folk model, 126. See also
Folk models
Context, in error classification, 60-61
Control models, 34-35
Correspondence, in situational awareness,
93-95
Cost effectiveness, procedure violations
and, 149
Counterfactual reasoning, 70-71
Counterinstances, 56
Crew Resource Management (CRM), 6-7,
131
Criminalization of errors, 194-195
CRM, 6-7, 131
D
Data overload
as clutter problem, 155-156
as workload bottleneck, 152-155
Decision making
local rationality in, 38-43, 77-79
versus sensemaking, 79—81
Deconstruction, 2-3
Descartes, Rene, 7-10
Distancing through differencing, 63
Doctors, deaths caused by, 46
Drift into failure
in Alaska 261 accident, 18-24, 21, 23
described, 18
local rationality in, 38-43
modeling, 33-35
organizational resilience and, 43-45
reasons for, 24-30
systems thinking in, 31-33, 35-37
Dualism
development of, 7—10
in human factors vocabulary, 3
in mechanical failure versus human error,
5-7
in runway overrun case, 10-14
E
Efficiency, procedure violations and, 149
Elements, in situational awareness, 100-104
Emic versus etic perspective, 74—77
Empiricism, 100—104. See also Radical empiricism
End-play check drift, 22-23, 23
Error counts and classification
inadequacies of, 85-86
observer influence in, 53-54
as safety measurement, 64
safety progress through, 60-61
types of, 47
underlying theories of
alternative theory, 56-59
ontological relativism, 52-56, 65-66,
76
postmodernism, 50-51
realism, 48-50, 54, 58
Errors
automation and, 141-152 (See also Automation)
definitions of, in classification, 49
as ex post facto constructs, 67—70
free will and, 200, 202
mechanical versus human, 5-7
old and new views of, 14-16
punishment and criminalization of,
193-195
Excel airlines, 146
Exception management, 164—168
Ex post facto constructs, 67
F
Failure, as cause of failure, 5-7, 17. See also
Drift into failure
Falsification, 128-130
FDRs, 74
Fine-tuning, in drift into failure, 25-26
Fitness for duty, 3
Flight-data recorders (FDRs), 74
Flight-mode annunciator (FMA) study,
158-161, 159
Flight-progress strips
certifying, 171-172, 172, 174
qualitative study of, 184-191
quantitative study of, 178-184
FMA study, 158-161, 159
Folk models
complacency as, 126
defined by substitution, 126-128
immunity to falsification, 128—130
versus new models, 129—130
overgeneralization of, 130-131
situational awareness as, 124—126
Free will, 200, 202
Function allocation by substitution, 162
Fundamental surprise error, 86-89
Future incident studies, 164-168
Future predictions, in certification,
173-174, 191-192
G
Gestalt movement, 107-108
Goal conflicts, and procedural violations,
143-148
Gun-owner errors, 46
H
Halifax crash, 139-140
Heathrow airport incident, 197
Helicopter crashes, post Gulf War, 83-84,
198
Hindsight bias
and errors as ex post facto, 67-70
as forward looking, 82—86
versus insight, 36-37
in situational awareness, 90-93, 112
Horizontal stabilizer, on MD-80, 19, 19. See
also Jackscrew, in MD-80
Human error. See Error
I
Immunization, defined, 128
Incident reporting, 27-28, 194
Incrementalism, in drift into failure, 26-27
Individualism, 199-200
Informal work systems, versus procedures,
134-135, 143
Information environments
in accident analysis, 73-74
in decision-making, 39—43
Information processing, 104-110
Insight
versus hindsight, 36-37
labels mistaken for, 124
Inspectors, procedure violations and,
141-143
Introspection, in situational awareness,
102-104
J
Jackscrew, in MD-80
drift into failure of, 18-24, 21, 23
procedure versus practice in, 136-137
production and operation gap in, 32-33
safety certification of, 173-174
James, William, 109-110
Judgmentalism, 71-73
K
Kepler, Johannes, 100
L
Labels, mistaken for insight, 124
Lancaster University project, 185-191
Language. See Vocabulary, in human factors
Learning
in drift into failure, 25-26
in hindsight bias, 84
versus punishment, 193-194, 202
Local rationality
context in, 61
in decision making, 38-43, 77-79
Logan Airport crash, 129-130
Lubrication schedule drift, 20-21, 21
M
MABA-MABA, 161-163
Mechanical failure, versus human error,
5-7
Medical errors, 46
N
NASA
goal conflicts in, 144-145
information environments in, 39-43
resource scarcity and, 24
Naturalistic decision making (NDM), 79-80
Navigational incidents, 90-91, 91, 112-122
NDM, 79-80
Neisser, Ulric, 108
Newall's catch, 164
Normativism, 92-93, 92
O
Observers
in error classification, 53-54
insider versus outsider, 74—77
Ontological relativism, 52-56, 65-66, 76
Organizational resilience, 43—45
Overgeneralization, 130-131
P
Positivism, 48
Postmodernism, 50-51
PowerPoint presentations, 38-43
Pressure, internalization of, 146-148
Pressure condition, 200
Procedures
application of
models of, 133-134, 139
versus practice, 71-73, 134-138, 150
regulator's role in, 141-143
in unusual situations, 139-141
errors or violations of
as accident causes, 132-133
goal conflicts and, 143-148
normalization of, 149-150
versus proficiency errors, 47-48
Proficiency errors, versus procedure errors,
47-48
Psychologist's fallacy, 73
Punishment, for errors, 193-195
Q
Qualitative research
flight-strip study example, 184-191
versus quantitative, 174—178
Quality management, versus safety management,
44-45
Quantitative research
flight-strip study example, 178-184
versus qualitative, 174—178
R
Radical empiricism, 110-112
Rationalism, 77. See also Local rationality
Realism, 48-50, 54, 58
Regulators, procedure violations and,
141-143
Relativism, 65—66
Research methods
in error classification, 49—50
qualitative example, 184-191
quantitative example, 178-184
quantitative versus qualitative, 174-178
Res extensa, 8, 14-16, 107
Resource scarcity, 24-25, 42-43
Responsibility-authority double bind,
200-202
Reverse engineering, 2-3
Royal Majesty cruise ship, 90-91, 91,
112-122
Rules. See Procedures
Runway overrun case, 5-7, 10-14
S
Safety
erosion of, 35—37
error classification and, 60-61
measuring, 62-64
progress on, 193
versus quality management, 44—45
as reflexive, 61—62
success as, 62-63, 149
Safety certification
future predictions in, 173-174 191-192
limits of, 172-174
research methods in
qualitative example, 184-191
quantitative example, 178-184
quantitative versus qualitative, 174-178
Scientific revolution, 7-10
Seawall crash, at Logan Airport, 129-130
Sensemaking
versus decision making, 79-81
in situational awareness, 112-113,
121-122
Sherpa culture, 202-203
Situational awareness
case studies in
navigational incident, 90-91, 91,
112-122
taxiway incursion, 95-99, 96, 97, 98
as folk model, 124-126 (See also Folk
models)
hindsight and, 90-93, 112
normative characterization of, 92—93, 92
theories of
correspondence, 93-95
empiricism, 100-104
information processing, 104-110
radical empiricism, 110-112
Space Shuttle accidents
goal conflicts and, 144-145
information environments and, 38-43,
40
Spoilers, in runway overrun case, 5—7,
10-14
Structuralism, 4, 15-16, 29-30
Substitution
in folk models, 126-128
of machines for people, 162, 169 (See
also Automation)
Success, as safety, 62-63, 149
Systems thinking, 31-33, 35-37
T
Tailplane, on MD-80, 19, 19. See also
Jackscrew, in MD-80
Taxiway incursion case, 95—99, 96, 97, 98
Technology, 73-74, 85, 203. See also Automation
Threshold-crossing alarms, 167-168
Transportation accident rates, 17
Trim system, in MD-80, 19-20, 20. See also
Jackscrew, in MD-80
TWA 800 crash, 2-3
V
Valujet accident, 63-64, 195
Vocabulary, in human factors
characteristics of, 2-5
limitations of, 1-2, 10
Voice trace, 73-74
W
Warning systems, and data overload,
153-156
Washington National crash, 131
Watson, John, 104-105
Wear, in certification, 172-174
Wertheimer, Max, 107-108
Wundt, Wilhelm, 102-104
Wurzburg school, 103
 libro error-humano-dekker
TEN QUESTIONS
ABOUT HUMAN ERROR: A new view of human factors and system safety
Contenidos
Reconocimientos
Prefacio
Introducción de la serie
Nota del autor
1 ¿Fue Falla Mecánica o Error Humano?
2 ¿Por qué Fallan Sistemas Seguros?
3 ¿Por qué los Doctores son más Peligrosos que los Propietarios de Armas?
4 ¿No Existen los Errores?
5 Si Ud. Pierde la Conciencia Situacional, ¿Qué la Reemplaza?
6 ¿Por qué los Operadores se vuelven Complacientes?
7 ¿Por qué no siguen ellos los Procedimientos?
8 ¿Podemos Automatizar los Errores Humanos Fuera del Sistema?
9 ¿Va a ser Seguro el Sistema?
10 ¿Debemos Hacer a la Gente Responsable por sus Errores?
Referencias
Índice del Autor
Índice por Objetivo
Reconocimientos.
Tal como los errores, las ideas vienen de algún lado. Las ideas en este libro
fueron desarrolladas en un período de años en que las discusiones con las
siguientes personas fueron particularmente constructivas: David Woods, Erik
Hollnagel, Nancy Leveson, James Nyce, John Flach, Gary Klein, Diane
Vaughan, y Charles Billings. Jens Rasmussen ha estado siempre delante en el
juego en ciertas formas:
Algunas de las preguntas sobre el error humano ya fueron tomadas por él en
décadas pasadas. Erik Hollnagel fue instrumental en contribuir a moldear las
ideas en el capítulo 6, y Jim Nyce ha tenido una influencia significativa en el
capítulo 9.
Quiero agradecer también a mis estudiantes, particularmente a Arthur Dijkstra y
Margareta Lutzhoft, por sus comentarios en borradores previos y sus útiles
sugerencias. Margareta merece especial gratitud por su ayuda en decodificar el
caso estudiado en el capítulo 5, y Arthur, por su habilidad para señalar
“Ansiedad Cartesiana”, donde yo no la reconocí.
Agradecimiento especial al editor de series Barry Kantowitz y al editor Bill
Webber, por su confianza en el proyecto. El trabajo para este libro fue apoyado
por una subvención del Swedish Flight Safety Directorate.
Prefacio
Los factores humanos en el transporte siempre han sido relacionados con el
error humano. De hecho, como un campo de investigación científica, debe su
inclusión a investigaciones de error de piloto y a la subsiguiente insatisfacción
de los investigadores con la etiqueta.
En 1947, Paul Fitts y Richard Jones, construyendo sobre trabajo pionero por
Alphonse Chapanis, demostraron como características de las cabinas de
aviones de la II Guerra Mundial influenciaban sistemáticamente la forma en que
pilotos cometían errores. Por ejemplo, pilotos confundían las palancas de flaps
y tren de aterrizaje porque éstas a menudo se veían y se sentían igual, y
estaban ubicadas próximas una de otra (switches idénticos, o palancas muy
similares). En el incidente típico, un piloto podía levantar el tren de aterrizaje en
vez de los flaps, luego de un aterrizaje, con las previsibles consecuencias para
hélices, motores y estructura. Como un inmediato arreglo de guerra, una rueda
de goma fue adherida al control del tren de aterrizaje y un pequeño terminal
con forma de cuña, al control del flap. Esto básicamente solucionó el problema
y el arreglo de diseño eventualmente llegó a ser un requerimiento de
certificación.
Los pilotos podían también mezclar los controles de acelerador, mezcla y
hélice, ya que sus ubicaciones cambiaban en diferentes cabinas. Tales errores
no fueron sorprendentes, degradaciones aleatorias de desempeño humano. De
preferencia, ellos fueron acciones y cálculos que tuvieron sentido una vez que
los investigadores comprendieron las características del mundo en que las
personas trabajaban, una vez que ellos hubieron analizado la situación que
rodeaba al operador. Los errores humanos están sistemáticamente conectados
a características de las herramientas y tareas de las personas. Puede ser difícil
predecir cuándo o qué tan a menudo ocurrirán los errores (a pesar de que las técnicas de fiabilidad humana ciertamente lo han intentado). Con un examen crítico del sistema en que las personas trabajan, sin embargo, no
es tan difícil anticipar dónde ocurrirán los errores. Factores Humanos ha
utilizado esta premisa desde siempre: La noción de diseñar sistemas
resistentes y tolerantes al error se basa en ello.
Factores Humanos fue precedido por una era de hielo mental de
comportamientismo, en que cualquier estudio de la mente era visto como
ilegítimo y no científico. El comportamientismo en sí fue una psicología de protesta, acuñada en agudo contraste con la introspección experimental de Wundt que lo precedió. Si el comportamientismo fue una psicología de protesta, entonces los factores humanos fueron una psicología pragmática. La Segunda Guerra Mundial trajo un ritmo tan furioso de desarrollo tecnológico que el comportamientismo quedó claramente en evidencia. Surgieron problemas
prácticos en la vigilancia y toma de decisiones del operador que fueron
totalmente inmunes al repertorio de exhortaciones motivacionales del
comportamientismo de Watson. Hasta ese punto, la psicología había asumido ampliamente que el mundo era fijo, y que los humanos tenían que adaptarse a sus demandas a través de la selección y el entrenamiento. Factores Humanos mostró que el mundo no era fijo: cambios en el ambiente podían fácilmente llevar a incrementos en el desempeño no alcanzables mediante intervenciones comportamientistas. En el comportamientismo, el desempeño tenía que adaptarse a las características del mundo. En factores humanos, las características del mundo se adaptaban a los límites y capacidades del desempeño humano.
Como una psicología de lo pragmático, factores humanos adoptó la visión de
ciencia y método científico Cartesiano-Newtoniano (tal como Wundt y Watson
habían hecho). Descartes y Newton fueron jugadores dominantes en la
revolución científica del siglo XVII. Esta transformación total en el pensamiento
instaló una creencia en la absoluta certeza del conocimiento científico,
especialmente en la cultura occidental. El ánimo de la ciencia fue de alcanzar el
control al derivar leyes de la naturaleza generales e idealmente matemáticas
(tal como nosotros intentamos hacer para el desempeño de la persona y el
sistema). Una herencia de esto puede ser vista todavía en factores humanos,
particularmente en el predominio de los experimentos, en la inclinación nomotética más que ideográfica de su investigación y en una fuerte fe en el realismo de
los hechos observados. También puede ser reconocida en las estrategias
reductivas con que se relacionan los factores humanos y la seguridad
operacional de sistemas para lidiar con la complejidad. La solución de
problemas Cartesiano-Newtoniana es analítica. Consiste en descomponer los pensamientos y problemas en piezas y en disponerlas en algún orden lógico. El
fenómeno necesita ser descompuesto en partes más básicas y su totalidad
puede ser explicada exhaustivamente haciendo referencia a sus componentes
constituyentes y sus interacciones. En factores humanos y seguridad
operacional de sistemas, se entiende la mente como una construcción tipo caja, con un intercambio mecánico de representaciones internas; el trabajo está
separado en pasos procedimentales a través de análisis de tareas jerárquicos;
las organizaciones no son orgánicas o dinámicas, sino que están constituidas
por estratos estáticos y compartimientos y lazos; y seguridad operacional es
una propiedad estructural que puede ser entendida en términos de sus
mecanismos de orden más bajo (sistemas de reporte, tasas de error y
auditorías, la función de la administración de seguridad operacional en el
diagrama organizacional, y sistemas de calidad).
Estas visiones están con nosotros hoy. Dominan el pensamiento en factores
humanos y seguridad operacional de sistemas. El problema es que las extensiones lineales de estas mismas nociones no pueden llevarnos hacia el futuro. Las otrora pragmáticas ideas de factores humanos y seguridad de sistemas están quedando rezagadas frente a los problemas prácticos que han comenzado a surgir en el mundo de hoy. Podríamos estar ante una repetición de los
cambios que vinieron con los desarrollos tecnológicos de la II Guerra Mundial,
donde el comportamientismo mostró quedar corto. Esta vez podría ser el caso
de factores humanos y seguridad operacional de sistemas. Los desarrollos
contemporáneos, sin embargo, no son sólo técnicos. Son sociotécnicos: La
comprensión sobre qué hace a los sistemas seguros o frágiles requiere más
que conocimiento sobre la interfase hombre-máquina. Como David Meister
señaló recientemente (y él lleva tiempo en este campo), factores humanos no ha progresado mucho desde 1950. “Hemos tenido 50 años de investigación”, se pregunta retóricamente, “¿pero cuánto más de lo que sabíamos en un principio sabemos ahora?” (Meister, 2003, p. 5). No es que las
propuestas tomadas por factores humanos y seguridad operacional de
sistemas ya no sean útiles, sino que su utilidad sólo puede ser apreciada
realmente cuando vemos sus límites. Este libro no es sino un capítulo en una
transformación más larga que ha comenzado a identificar las profundamente
enraizadas restricciones y los nuevos puntos de influencia en nuestras visiones
de factores humanos y seguridad operacional de sistemas. Las 10 preguntas
acerca del error humano no son solo preguntas sobre el error humano como un
fenómeno, si es que lo son (y si el error humano es algo en y por sí mismo, en
primer lugar). En realidad son preguntas acerca de factores humanos y
seguridad operacional de sistemas como disciplinas, y en qué lugar se
encuentran hoy. En formular estas preguntas acerca del error, y en trazar las
respuestas a ellas, este libro intenta mostrar dónde nuestro pensamiento
corriente está limitado; dónde nuestro vocabulario, nuestros modelos y
nuestras ideas están limitando el progreso. En cada capítulo, el libro intenta
entregar indicaciones para nuevas ideas y modelos que tal vez se las puedan
arreglar mejor con la complejidad de los problemas que nos encaran ahora.
Uno de esos problemas es que sistemas aparentemente seguros pueden
desviarse y fallar. Desviarse en dirección a los márgenes de seguridad
operacional ocurre bajo presiones de escasez y competencia. Está relacionado
con la opacidad de sistemas socio técnicos grandes y complejos, y los patrones
de información en que los integrantes basan sus decisiones y tratos. Derivar en fallas está asociado con los procesos organizacionales normales de adaptación. Las fallas organizacionales en sistemas seguros no están precedidas por fallas, ni por el quiebre o la falta de calidad de componentes aislados. De hecho, la falla organizacional en sistemas seguros está precedida por trabajo normal, por personas normales haciendo trabajo normal en organizaciones aparentemente normales. Esto encaja mal con la definición de un incidente, y puede minar el valor de
reportar incidentes como una herramienta para aprender más allá de un cierto
nivel de seguridad operacional. El margen entre el trabajo normal y el incidente
es claramente elástico y sujeto a revisión incremental. Con cada pequeño paso
fuera de las normas previas, el éxito pasado puede ser tomado como una
garantía de seguridad operacional futura.
El incrementalismo acerca el sistema completo, muesca a muesca, a la línea de derrumbe, pero sin indicaciones empíricas poderosas de que está encaminado de esa forma.
Modelos corrientes de factores humanos y seguridad operacional de sistemas
no pueden lidiar con la derivación hacia fallas. Ellos requieren fallas como un
prerrequisito para las fallas. Ellos aún están orientados hacia el encuentro de
fallas (por ejemplo, errores humanos, hoyos en las capas de defensa,
problemas latentes, deficiencias organizacionales y patógenos residentes), y se
relacionan con niveles de trabajo y estructura dictados externamente, por sobre
tomar las interpretaciones internas (sobre qué es una falla vs. trabajo normal) como canónicas. Los procesos de toma de sentido, de creación de racionalidad local por quienes de verdad realizan los miles de pequeños y grandes tratos que
transportan un sistema a lo largo de su curso de deriva, yacen fuera del léxico
actual de factores humanos. Los modelos corrientes típicamente ven a las
organizaciones como máquinas Newtonianas-cartesianas con componentes y
nexos entre ellas. Los contratiempos son modelados como una secuencia de
eventos (acciones y reacciones) entre un disparador y un resultado. Tales
modelos no pueden pronunciarse acerca de la construcción de fallas latentes,
ni sobre la gradual, incremental soltura o pérdida de control.
Los procesos de erosión de las restricciones, de detrimentos de la seguridad
operacional, de desviación hacia los márgenes no pueden ser capturados
porque los enfoques estructurales son metáforas estáticas para formas
resultantes, no modelos dinámicos orientados hacia procesos de formación.
Newton y Descartes, con sus particulares estudios en ciencias naturales, tienen
una firme atadura en factores humanos, seguridad operacional de sistemas y
también en otras áreas. El paradigma de procesamiento de información, por
ejemplo, tan útil para explicar tempranamente los problemas de transferencia
de información entre radar y radio operadores en la II Guerra Mundial, solo ha
colonizado la investigación de factores humanos. Aún es una fuerza dominante,
reforzado por los experimentos del Spartan laboratory, que parecen confirmar
su utilidad y validez. El paradigma tiene mente mecanizada, partida en
componentes separados (por ejemplo, memoria de trabajo, memoria de corto
plazo y memoria de largo plazo) con nexos entre medio. Newton habría amado
su mecánica. A Descartes también le habría gustado: Una separación clara
entre mente y mundo solucionaba (o más bien circunvalaba) una serie de problemas asociados con las transacciones entre ambos. Un modelo mecánico tal como el procesamiento de información, claro, mantiene un apego especial para la ingeniería y otros consumidores de los resultados de la investigación de factores humanos. Lo pragmático dicta salvar las diferencias entre práctica y ciencia, y tener un modelo cognitivo similar a un aparato técnico familiar para la gente aplicada es una forma poderosa de lograr justamente eso. Pero no existe razón
empírica para restringir nuestra comprensión de actitudes, memorias o
heurísticos, como disposiciones codificadas mentalmente, como ciertos
contenidos de conciencia con determinadas fechas de vencimiento. De hecho,
tal modelo restringe severamente nuestra habilidad para comprender cómo las
personas utilizan el habla y la acción para construir un orden perceptual y
social; cómo, a través del discurso y la acción, las personas crean los
ambientes que, a su vez, determinan la acción posterior y las evaluaciones posibles,
y que restringen lo que, en consecuencia, será visto como discurso aceptable o
decisiones racionales. No podemos comenzar a entender la deriva en fallas, sin
comprender cómo grupos de personas, a través de cálculo y acción,
ensamblan versiones del mundo en las que ellos calculan y actúan.
El procesamiento de la información cabe dentro de una perspectiva metateórica
mayor y dominante, que toma al individuo como su foco central (Heft, 2001).
Esta visión, también, es una herencia de la Revolución Científica, la que ha
popularizado crecientemente la idea humanista de un “individuo auto
contenido”. Para la mayoría de la psicología, esto ha significado que todos los
procesos dignos de estudio toman lugar dentro de los márgenes del cuerpo (o
mente), algo epitomizado por el enfoque mentalista del procesamiento de
información. En su incapacidad para tratar significativamente la deriva hacia la
falla, que interconecta factores individuales, institucionales, sociales y técnicos,
los factores humanos y la seguridad operacional de sistemas están
actualmente pagando por su exclusión teórica de los procesos sociales y
transaccionales, entre los individuos y el mundo. El componencialismo y la
fragmentación de la investigación de factores humanos aún es un obstáculo al
progreso en este sentido. Una ampliación de la unidad de análisis (como la hecha en las ideas de ingeniería de sistemas cognitivos y cognición distribuida), y un llamado a poner la acción en el centro de la comprensión del pensamiento, han sido formas de lidiar con los nuevos desarrollos prácticos para los que los factores humanos y la seguridad de sistemas no estaban
preparados.
El énfasis individualista del protestantismo y la Ilustración también rebosa de
ideas sobre control y culpa. ¿Debemos culpar a las personas por sus errores?
Los sistemas sociotécnicos han crecido en complejidad y tamaño, moviendo a
algunos a decir que no tiene sentido esperar o demandar de los integrantes
(ingenieros, administradores, operadores), que giren en torno a algún ideal
moral reflectivo. Presiones de escasez y competencia, han logrado convertirse
insidiosamente en mandatos organizacionales e individuales, los que a cambio,
restringen severamente la racionalidad y opciones (y por ende autonomía), de
todos los actores en el interior. Aun así, los antihéroes solitarios continúan teniendo roles protagónicos en nuestras historias de fallas. El individualismo aún es crucial para la propia identidad en la modernidad. La idea de que se requiere el trabajo de un equipo, de una organización entera, o de toda una industria para quebrar un sistema (como se ilustra mediante los casos de deriva hacia la falla) va en contra de nuestras preconcepciones culturales heredadas. Incluso antes que
llegáramos a episodios complejos de acción y responsabilidad, podemos
reconocer la prominencia de la de-construcción y componentalismo
Newtoniano-Cartesianos, en mucha investigación de factores humanos. Por
ejemplo: Las nociones empíricas de una percepción de elementos que
gradualmente se fueron convirtiendo en significado, a través de etapas de
procesamiento mental, son nociones teóricas legítimas hoy. El empirismo fue
otrora una fuerza en la historia de la psicología. Apuntalados por el paradigma de procesamiento de información, sus principios centrales han reaparecido, por ejemplo, en las teorías de conciencia situacional. Al adoptar un
modelo cultural como tal, desde una comunidad aplicada y sometiéndolo a
escrutinio científico putativo, por supuesto que los factores humanos
encuentran su ideal pragmático. Los modelos culturales abarcan los problemas
de factores humanos como una disciplina aplicada. Pocas teorías pueden
cubrir el abismo entre investigador y practicantes mejor que aquellas que
aplican y disectan los vernáculos practicantes para estudio científico. Pero los
modelos culturales vienen con un precio epistemológico. La investigación que se propone indagar un fenómeno (digamos, conciencia situacional dividida, o complacencia), pero que no define ese fenómeno (porque, como modelo cultural, se supone que todos saben lo que significa), no puede ser falseada por contacto con la realidad empírica. Ello deja a tal investigador de factores humanos sin el mayor mecanismo de control científico desde Karl Popper.
Conectado al procesamiento de información, y al enfoque experimental a
muchos problemas de factores humanos, hay un prejuicio cuantitativo, defendido por primera vez en la psicología por Wilhelm Wundt en su laboratorio de Leipzig. A
pesar de que Wundt rápidamente tuvo que admitir que una cronometría de la
mente era una meta muy audaz de la investigación, los proyectos
experimentales de investigación sobre factores humanos aún pueden reflejar
versiones pálidas de su ambición. Contar, medir, categorizar y analizar
estadísticamente, son las herramientas dominantes del oficio, mientras que las
investigaciones cualitativas son a menudo desechadas por subjetivas y no
científicas. Los factores humanos tienen una orientación realista, creyendo que
los hechos empíricos son aspectos estables y objetivos de la realidad que
existe independiente del observador o su teoría. Nada de esto hace menos
reales los hechos generados mediante experimentos, para aquellos que
observan, publican o leen acerca de ellos. Sin embargo, siguiendo a Thomas
Kuhn (1962), esta realidad debe ser vista por lo que es: un acuerdo negociado
implícitamente entre investigadores de pensamientos similares, más que un
común denominador accesible a todos.
No hay árbitro final aquí. Quizá un enfoque experimental y componencial pueda disfrutar de un privilegio epistemológico, pero ello también significa que no hay un imperativo automático para sostenerlo como la única investigación legítima, como se ve a veces en la corriente principal de factores humanos. Las formas de obtener acceso a la realidad empírica son infinitamente negociables, y su aceptación es una función de qué tan bien se conforman a la visión de mundo de aquellos a quienes apela el investigador.
La persistente supremacía cuantitativista (particularmente en los factores
humanos norteamericanos) está cargada de este tipo de autoridad
consensuada (debe ser bueno porque todos lo están haciendo). Tal histéresis
metodológica podría tener que ver más con los miedos primarios de ser
marcado “no científico” (los miedos compartidos por Wundt y Watson) que con
un retorno estable de incrementos significativos de conocimiento generados por
la investigación.
El cambio tecnológico dio impulso a los pensamientos de factores humanos y
seguridad de sistemas. Las demandas prácticas puestas por los cambios
tecnológicos envolvieron a los factores humanos y la seguridad de sistemas
con el espíritu pragmático que hasta hoy tienen. Pero lo pragmático deja de ser pragmático si no encaja con las demandas creadas por aquello que está sucediendo ahora a nuestro alrededor. El ritmo del cambio sociotecnológico no parece que vaya a desacelerarse pronto. Si creemos que la II Guerra Mundial generó
una gran cantidad de cambios interesantes, dando a luz a los factores
humanos como una disciplina, entonces podríamos estar viviendo en tiempos
incluso más excitantes hoy. Si nosotros nos mantenemos haciendo lo que
hemos estado realizando en factores humanos y seguridad de sistemas,
simplemente porque nos ha funcionado en el pasado, podríamos llegar a ser
uno de esos sistemas que derivan hacia la falla. Lo pragmático requiere que
nosotros nos adaptemos también, para arreglárnoslas mejor con la complejidad
del mundo que nos enfrenta hoy. Nuestros éxitos pasados no son garantía de
continuar logros futuros.
Prólogo de la serie.
Barry H. Kantowitz
Battelle Human Factors Transportation Center
El rubro del transporte es importante, por razones tanto prácticas como
teóricas. Todos nosotros somos usuarios de sistemas de transporte como
operadores, pasajeros y consumidores. Desde un punto de vista científico, el
rubro del transporte ofrece una oportunidad de crear y probar modelos
sofisticados de comportamiento y cognición humanos. Esta serie cubre los
aspectos práctico y teórico de los factores humanos en el transporte, con un
énfasis en su interacción.
La serie está pensada como un foro para investigadores e ingenieros interesados en cómo funcionan las personas dentro de sistemas de transporte. Todos los modos de transporte son relevantes, y todos los esfuerzos en factores humanos y ergonomía que tienen implicancias explícitas para los sistemas de transporte caen dentro del ámbito de la serie. Esfuerzos
analíticos son importantes para relacionar teoría y datos. El nivel de análisis
puede ser tan pequeño como una persona, o de espectro internacional. Los
datos empíricos pueden provenir de un amplio rango de metodologías,
incluyendo investigación de laboratorio, estudios en simuladores, pruebas en pista, pruebas operacionales, trabajo de campo, revisiones de diseño o peritajes. Este amplio espectro está pensado para maximizar la utilidad de la
serie para lectores con trasfondos distintos.
Espero que la serie sea útil para profesionales en las disciplinas de factores
humanos, ergonomía, ingeniería de transportes, psicología experimental,
ciencia cognitiva, sociología e ingeniería de seguridad operacional. Está
orientada a la apreciación de especialistas de transporte en la industria,
gobierno, o académicos, así como también, al investigador en busca de una
base de pruebas para nuevas ideas acerca de la interfase entre las personas y
sistemas complejos.
Este libro, si bien se enfoca en el error humano, ofrece una visión de sistemas particularmente bienvenida en los factores humanos del transporte. Una meta mayor de esta serie de libros es relacionar la teoría y la práctica de factores humanos. Hay que reconocerle al autor el formular preguntas que no sólo relacionan teoría y práctica, sino que fuerzan al lector a evaluar las clases de teoría que se aplican a los factores humanos. Los enfoques de
información tradicionales, derivados del modelo de canal limitado que ha
formado las bases originales para el trabajo teórico en factores humanos, son
escrutados. Enfoques más nuevos, tales como la conciencia situacional, que surgieron de las deficiencias del modelo de teoría de la información, son
criticados por tratarse solo de modelos culturales carentes de rigor científico.
Espero que este libro engendre un vigoroso debate sobre qué clases de teoría
sirven mejor a la ciencia de factores humanos. Si bien, las diez preguntas
ofrecidas aquí forman una base para debate, existen más de diez respuestas
posibles.
Los libros posteriores en esta serie, continuarán buscando estas respuestas
mediante la entrega de perspectivas prácticas y teóricas en los factores
humanos en el transporte.
Nota del Autor.
Sidney Dekker es profesor de Factores Humanos en la Universidad Lund,
Suecia. Recibió un M.A. en psicología organizacional de la University of Nijmegen y un M.A. en psicología experimental de la Leiden University, ambas en los Países Bajos. Obtuvo su Ph.D. en Ingeniería de Sistemas Cognitivos de la Ohio
State University.
Ha trabajado previamente para la Public Transport Corporation en Melbourne,
Australia; la Massey University School of Aviation, Nueva Zelanda; y la British
Aerospace. Sus especialidades e intereses investigativos son el error humano,
investigación de accidentes, estudios de campo, diseño representativo y
automatización.
Ha tenido alguna experiencia como piloto, entrenado en material DC-9 y Airbus
A340. Sus libros previos incluyen The Field Guide to Human Error
Investigations (2002).
Capítulo 1.
¿Fue Falla Mecánica o Error Humano?
Estos son tiempos excitantes y competitivos para factores humanos y seguridad operacional de sistemas. Y existen indicaciones de que no estamos completamente bien equipados para ellos. Hay un reconocimiento creciente de que los accidentes (un accidente de avión comercial, un desastre de un
transbordador espacial) están intrincadamente ligados al funcionamiento de
organizaciones e instituciones aledañas. La operación de aviones de
aerolíneas comerciales o transbordadores espaciales o traslados de pasajeros,
engendra vastas redes de organizaciones de apoyo, de mejoramiento y
avance, de control y regulación. Tecnologías complejas no pueden existir sin
estas organizaciones e instituciones – transportadores, reguladores, agencias
de gobierno, fabricantes, subcontratistas, instalaciones de mantenimiento,
grupos de entrenamiento – que, en principio, están diseñadas para proteger y
dar seguridad a su operación. Su mandato real se orienta a no tener
accidentes. Desde el accidente nuclear en Three Mile Island, en 1979, sin
embargo, las personas se percatan en mayor medida que las mismas
organizaciones destinadas a mantener una tecnología segura y estable
(operadores humanos, reguladores, la administración, el mantenimiento), están
en realidad entre los mayores contribuyentes al quiebre. Las fallas socio-
tecnológicas son imposibles sin tales contribuciones.
A pesar de este reconocimiento creciente, factores humanos y seguridad
operacional de sistemas dependen de un vocabulario basado en una
concepción particular de las ciencias naturales, derivada de sus raíces en la
ingeniería y en la psicología experimental. Este vocabulario, el uso sutil de
metáforas, imágenes e ideas, está cada vez más reñido con las demandas interpretativas impuestas por los accidentes organizacionales modernos. El
vocabulario expresa una visión mundial (tal vez), apropiada para las fallas
técnicas, pero incapaz de abrazar y penetrar las áreas relevantes de fallas
socio-técnicas – esas fallas que incorporan los efectos interconectados de la
tecnología y de la complejidad social organizada que circunda su uso; es decir, la mayor parte de las fallas de hoy.
Cualquier lenguaje, y la visión mundial que lo acompaña, impone limitaciones
en nuestro entendimiento de la falla. Sin embargo, estas limitaciones ahora
están volviéndose incrementadamente evidentes y presionantes. Con el
crecimiento en el tamaño y la complejidad del sistema, la naturaleza de los
accidentes está cambiando (accidentes de sistemas, fallas sociotécnicas). La
escasez y competitividad por recursos significa que los sistemas presionan
incrementadamente sus operaciones hacia los bordes de sus coberturas de
seguridad. Ellos tienen que hacerlo para permanecer exitosos en sus
ambientes dinámicos. Los retornos comerciales al estar en los límites son
mayores, pero las diferencias entre tener y no tener un accidente están
caóticamente superando los márgenes disponibles. Los sistemas abiertos son
remolcados continuamente hacia dentro de sus áreas de seguridad
operacional, y los procesos que impulsan tal migración no son sencillos de
reconocer o controlar, como tampoco la ubicación exacta de los márgenes. Los
sistemas grandes, complejos, se ven capaces de adquirir una histéresis, una
oscura voluntad propia, en la que derivan hacia mayor elasticidad o hacia los
bordes de la falla. Al mismo tiempo, el veloz avance de los cambios
tecnológicos crea nuevos tipos de peligros, especialmente aquellos que vienen
con mayor dependencia en la tecnología computacional. Ambos sistemas,
el social y el de ingeniería (y su interrelación), dependen de un volumen cada vez mayor de tecnología de información. A pesar de nuestra velocidad
computacional y de que el acceso a la información pudiera parecer una ventaja
de seguridad operacional en principio, nuestra habilidad de tomar conciencia de
la información no está manteniendo el paso con nuestra habilidad para
recolectarla y generarla. Al conocer más, puede que en realidad conozcamos
mucho menos. Administrar la seguridad operacional en base a números
(incidentes, conteos de error, amenazas a la seguridad operacional), como si la
seguridad operacional fuera sólo otro indicador de un modelo de negocios de
Harvard, puede crear una falsa impresión de racionalidad y control
administrativo. Puede ignorar variables de orden más alto que pueden
develar la verdadera naturaleza y dirección de la deriva del sistema. Podría venir, además, al costo de comprensiones más profundas del funcionamiento
socio-técnico real.
DECONSTRUCCIÓN, DUALISMO Y ESTRUCTURALISMO.
¿Entonces qué es este idioma y la visión mundial técnica obsoleta que
representa? Las características que lo definen son la deconstrucción, el
dualismo y el estructuralismo. Deconstrucción significa que el funcionamiento
de un sistema puede ser comprendido exhaustivamente al estudiar la
distribución y la interacción de sus partes constituyentes. Científicos e
ingenieros típicamente miran al mundo de esta forma. Las investigaciones de
accidentes también deconstruyen. Para determinar la falla mecánica, o para localizar las partes dañadas, los investigadores de accidentes hablan de
“ingeniería reversa”. Ellos recuperan partes de los restos y las reconstruyen en
un todo nuevamente, a menudo literalmente. Pensemos en el TWA800 Boeing
747 que explotó en el aire luego del despegue desde el aeropuerto Kennedy de
Nueva York, en 1996. Fue recuperado desde el fondo del Océano Atlántico y
dolorosamente rearmado en un hangar. Con el rompecabezas lo más completo posible, las partes dañadas debían eventualmente quedar
expuestas, permitiendo a los investigadores identificar la fuente de la explosión.
Pero el todo continúa desafiando al sentido, continúa siendo un rompecabezas, cuando el funcionamiento (o mal funcionamiento) de sus partes no logra explicar el todo. La parte que causó la explosión, que la inició, nunca fue identificada en verdad. Esto es lo que hace escalofriante la investigación del TWA800. A pesar de una de las reconstrucciones más caras de la historia, las partes reconstruidas se negaron a dar cuenta del comportamiento del todo. En un caso como éste, una comprensión atemorizante e incierta recorre a los cuerpos de investigación y a la industria: un todo falló sin una parte fallada. Un accidente ocurrió sin una causa; no hay causa, nada que reparar, y podría suceder nuevamente mañana, u hoy.
La segunda característica definitoria es el dualismo. Dualismo significa que
existe una separación distintiva entre causa humana y material, entre el error humano y la falla mecánica. Para ser un buen dualista usted, por supuesto, tiene que deconstruir: usted tiene que desconectar las contribuciones humanas
de las contribuciones mecánicas. Las reglas de la Organización de Aviación
Civil Internacional, que gobiernan a los investigadores de accidentes aéreos, lo determinan expresamente. Ellas fuerzan a los investigadores de accidentes a
separar las contribuciones humanas de las mecánicas. Parámetros específicos
en los reportes de accidentes están reservados para el seguimiento de los
componentes humanos potencialmente dañados. Los investigadores exploran
el historial de las 24 y 72 horas previas de los humanos que más tarde se
verían involucrados en un accidente. ¿Hubo alcohol? ¿Hubo estrés? ¿Hubo
fatiga? ¿Hubo falta de eficiencia o experiencia? ¿Hubo problemas previos en
los registros de entrenamiento u operacionales de estas personas? ¿Cuántas
horas de vuelo tenía verdaderamente el piloto? ¿Hubo otras distracciones o
problemas? Este requisito investigativo refleja una interpretación primitiva de
los factores humanos, una tradición aeromédica en que el error humano está
reducido a la noción de “estar en forma para el servicio”. Esta noción ha sido
sobrepasada hace tiempo por los desarrollos en factores humanos hacia el
estudio de personas normales realizando trabajos normales en lugares de
trabajo normales (más que en individuos deficientes mental o fisiológicamente),
pero el modelo aeromédico sobreextendido se retiene como una especie de
práctica conformista positivista, dualista y deconstructiva. En el paradigma de
estar en forma para el servicio, las fuentes de error humano debieron ser
buscadas en las horas, días o años previos al accidente, cuando el
componente humano estaba torcido, debilitado y listo para el quiebre.
Encuentre la parte del humano que estaba perdida o deficiente, la “parte
desajustada”, y la parte humana acarreará la carga interpretativa del accidente.
Indague en la historia reciente, encuentre las piezas deficientes y arme el
rompecabezas: deconstrucción, reconstrucción y dualismo.
La tercera característica definitoria de la visión mundial técnica que aún
gobierna nuestro entendimiento del éxito y las fallas en sistemas complejos es el estructuralismo. El idioma que utilizamos para describir el funcionamiento interno del éxito y las fallas de los sistemas es un idioma de estructuras. Hablamos de
capas de defensa, de agujeros en estas capas. Identificamos los “bordes
suaves” y los “bordes agudos” de las organizaciones e intentamos capturar cómo uno tiene efectos sobre el otro. Incluso la cultura de seguridad es tratada como una estructura edificada con bloques. Cuánta cultura de seguridad tenga una organización depende de las rutinas y componentes que tenga para
el reporte de incidentes (esto es mesurable), de hasta qué punto es justa con
los operadores que cometen errores (esto es más difícil de medir, pero todavía
posible), y de la relación que existe entre sus funciones de seguridad y otras
estructuras institucionales. Una realidad social profundamente compleja está
por ende, reducida a un limitado número de componentes mesurables. Por
ejemplo ¿tiene el departamento de seguridad una ruta directa a la
administración más alta? ¿Cómo es esta tasa de reportes comparada a otras
compañías?
Nuestro idioma de fallas también es un idioma de mecánica. Describimos
trayectorias de accidentes, buscamos causas y efectos, e interacciones.
Buscamos fallas iniciadoras, o eventos gatilladores, y seguimos el colapso del
sistema estilo dominó, que le sigue. Esta visión mundial ve a los sistemas
socio-técnicos como máquinas con partes en una distribución particular (bordes
agudos vs. suaves, capas de defensa), con interacciones particulares
(trayectorias, efectos dominó, gatillos, iniciadores), y una mezcla de variables
independientes o intervinientes (cultura de la culpa vs. cultura de seguridad).
Esta es la visión mundial heredada de Descartes y Newton, la visión mundial
que ha impulsado exitosamente el desarrollo tecnológico desde la revolución
científica hace medio milenio. La visión mundial, y el idioma que produce, está
basada en nociones particulares de ciencias naturales, y ejerce una sutil pero
muy poderosa influencia en nuestra comprensión del éxito y falla socio
tecnológicos hoy.
Así como ocurre con mucha de la ciencia y pensamiento occidentales, perdura
y dirige la orientación de factores humanos y seguridad de sistemas.
Incluso el idioma, si se utiliza irreflexivamente, se vuelve fácilmente
aprisionante. El idioma expresa, pero también determina qué podemos ver y
cómo lo vemos.
El idioma constriñe cómo construimos la realidad. Si nuestras metáforas nos animan a modelar los accidentes como cadenas de eventos, entonces comenzaremos nuestra investigación buscando eventos que encajen en esa cadena. ¿Pero qué
eventos deben ir adentro? ¿Dónde debemos comenzar? Como Nancy Leveson
(2002) señaló, la elección de cuáles eventos poner dentro es arbitraria, así
como la extensión, el punto de partida y el nivel de detalle de la cadena de
eventos. ¿Qué, preguntó ella, justifica asumir que los eventos iniciales son mutuamente excluyentes, excepto que ello simplifica las matemáticas del modelo
de la falla? Estos aspectos de la tecnología y de su operación plantean
preguntas sobre lo apropiado del modelo dualista, deconstructivo,
estructuralista que domina factores humanos y seguridad de sistemas. En su
lugar, podríamos buscar una visión de sistemas real, que no sólo apunte a las deficiencias estructurales detrás de los errores humanos individuales (puede hacerlo si se necesita), sino que aprecie la adaptabilidad orgánica,
ecológica, de sistemas sociotécnicos complejos.
Buscando fallas para explicar fallas.
Nuestras creencias y credos más arraigados a menudo permanecen
encerrados en las preguntas más simples. La pregunta acerca de si fue el error humano o la falla mecánica es una de ellas. ¿Fue el accidente causado por
falla mecánica o por error humano? Es una pregunta existencial para las
repercusiones posteriores de un accidente.
Más aún, se ve como una pregunta tan simple e inocente. Para muchos es una
consulta normal de preguntar: Si has tenido un accidente, tiene sentido
averiguar qué falló. La pregunta, sin embargo, envuelve una comprensión particular de cómo ocurren los accidentes, y arriesga confinar nuestro análisis causal a esa comprensión. Nos encierra en un repertorio interpretativo fijo. Escapar de este repertorio puede ser difícil. Fija las preguntas que hacemos, determina las pistas que perseguimos y los indicios que examinamos, y determina las conclusiones que eventualmente
esbozaremos. ¿Qué componentes estaban dañados? ¿Fue algo de la máquina o algo humano? ¿Por cuánto tiempo había estado torcido o, de otra forma, deficiente el componente? ¿Por qué se quebró eventualmente? ¿Cuáles fueron los factores latentes que conspiraron en su contra? ¿Qué defensas se habían erosionado?
Estos son los tipos de preguntas que dominan las investigaciones en factores
humanos y seguridad de sistemas hoy en día. Organizamos reportes de
accidentes y nuestro discurso sobre accidentes alrededor de la lucha por
respuestas a ellos. Las investigaciones sacan a la luz componentes mecánicos dañados (un tornillo dañado en el trim del estabilizador horizontal de un MD-80 de Alaska Airlines, losetas térmicas perforadas en el transbordador espacial Columbia), componentes humanos de bajo desempeño (por ejemplo, quiebres en CRM, un piloto que tiene un accidentado historial de entrenamiento), y grietas en las organizaciones responsables de operar el sistema (por
ejemplo, cadenas de decisión organizacional débiles). El buscar fallas –
humanas, mecánicas, u organizacionales – para explicar fallas es tan de
sentido común que la mayoría de los investigadores nunca se detiene a pensar
si éstas son en realidad las pistas correctas a perseguir. Que la falla sea causada por fallas es algo prerracional: ya no lo consideramos conscientemente como una pregunta en las decisiones que tomamos acerca de dónde mirar y qué concluir.
Aquí hay un ejemplo. Un bimotor Douglas DC-9-82 aterrizó en un aeropuerto
regional en las Tierras Altas del Sur de Suecia en el verano de 1999.
Chubascos de lluvia habían pasado a través del área más temprano, y la pista
estaba aún húmeda. Durante la aproximación a la pista, la aeronave recibió un
ligero viento de cola, y después del toque a tierra, la tripulación tuvo problemas
para disminuir la velocidad. A pesar de los esfuerzos de la tripulación por
frenar, el jet recorrió la pista y terminó en un campo a unos pocos cientos de
pies del umbral. Los 119 pasajeros y la tripulación a bordo resultaron ilesos. Luego de que la aeronave se detuviera, uno de los pilotos salió a chequear los
frenos. Estaban fríos. No había ocurrido ninguna acción de freno. ¿Cómo pudo
haber ocurrido esto? Los investigadores no encontraron fallas mecánicas en la
aeronave. Los sistemas de freno estaban bien.
En vez de ello, a medida que la secuencia de eventos fue rebobinada en el
tiempo, los investigadores se percataron de que la tripulación no había armado los
ground spoilers de la aeronave antes del aterrizaje. Los ground spoilers ayudan
a un jet a frenar durante la carrera, pero requieren ser armados antes de que
puedan hacer su trabajo. Armarlos es trabajo de los pilotos, y es un ítem de la
lista de chequeo before-landing y parte de los procedimientos en que ambos
miembros de la tripulación están involucrados. En este caso, los pilotos olvidaron
armar los spoilers. “Error de piloto”, concluyó la investigación.
O, en realidad, lo llamaron “desmoronamiento en CRM (Crew Resource Management)” (Statens Haverikommision, 2000, p. 12), una forma más moderna y más eufemística de decir “error de piloto”. Los pilotos no coordinaron lo que debían hacer; por alguna razón fallaron en comunicar la configuración
requerida de su aeronave. Además, después del aterrizaje, uno de los
miembros de la tripulación no había dicho “¡Spoilers!”, como lo dicta el
procedimiento. Esto pudo o debió alertar a la tripulación sobre la situación, pero
ello no ocurrió. Los errores humanos habían sido encontrados. La investigación
estaba concluida.
“Error humano” es nuestra elección por defecto cuando no encontramos fallas
mecánicas. Es una elección forzada, inevitable, que se calza suficientemente
bien en una ecuación, donde el error humano es el inverso al monto de la falla
mecánica. La ecuación 1 muestra cómo determinamos la proporción de
responsabilidad causal:
Error humano = f(1 – falla mecánica) (1)
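A modo de ilustración, y sólo como un esbozo mínimo en notación LaTeX (el texto no define la forma exacta de la función f; que en el caso extremo se comporte como la identidad es un supuesto de presentación, no algo afirmado por el autor), la ecuación 1 y su aplicación al caso sin falla mecánica pueden escribirse así:

% Ecuación (1) del texto: el error humano como complemento de la falla mecánica encontrada.
\[ \text{Error humano} = f\,(1 - \text{Falla mecánica}) \tag{1} \]
% Caso descrito a continuación: no se encontró falla mecánica alguna, de modo que
\[ \text{Error humano} = f\,(1 - 0) = 1, \]
% es decir, toda la responsabilidad causal recae, por defecto, en los humanos.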
Si no existe falla mecánica, entonces sabemos qué comenzar a buscar en
reemplazo.
En este caso, no hubo falla mecánica. La ecuación 1 se reduce a una función
de 1 menos 0. La contribución humana fue 1. Fue error humano, un quiebre de
CRM. Los investigadores encontraron que los dos pilotos a bordo del MD-80
eran ambos capitanes, y no un capitán y un copiloto, como es usual. Fue una simple coincidencia de planificación, no del todo inusual, la que los asignó a volar juntos esa aeronave esa mañana. Con dos capitanes en un barco, las responsabilidades corren el riesgo de dividirse de manera inestable e incoherente.
La división de responsabilidades fácilmente conduce a su abdicación. Si es
función del copiloto verificar que los spoilers estén armados, y no hay copiloto,
el riesgo es obvio. La tripulación estaba en algún sentido “desajustada”, o a lo
menos, propensa al desmoronamiento. Así fue (hubo un “desmoronamiento de
CRM”). ¿Pero qué explica esto? Estos son procesos que requieren a su vez
una explicación, y que pueden resultar, de todas formas, pistas frías.
Tal vez hay una realidad mucho más profunda acechando bajo las acciones
superficiales particulares de un accidente como este, una realidad en la que las
causas humanas y mecánicas están interconectadas de forma mucho más
profunda de lo que permiten comprender nuestros enfoques formulaicos de
investigación. Para vislumbrar mejor esta realidad, primero tenemos que
volvernos hacia el dualismo. Es el dualismo el que descansa en el corazón de la
elección entre error humano y falla mecánica. Echemos un breve vistazo a su
pasado y confrontémoslo luego con el encuentro empírico inestable e incierto de un
caso de spoilers desarmados.
La miseria del dualismo.
La urgencia de separar la causa humana de la causa mecánica es algo que
debe haber intrigado incluso a los pioneros de los factores humanos. Pensemos en
el enredo con las cabinas de la II Guerra Mundial, que tenían switches de
control idénticos para una diversidad de funciones. ¿Pudieron una aleta
con forma de flap en el control del flap y una perilla con forma de rueda en la
palanca del tren evitar la confusión típica entre ambos? En ambos casos, el sentido
común y la experiencia dicen “sí”. Al cambiar algo en el mundo, los ingenieros en
factores humanos (suponiendo que ya existían) cambiaron algo en el
humano. Al intervenir el hardware con que las personas trabajaban, ellos
cambiaron el potencial de las acciones correctas e incorrectas, pero sólo el
potencial. Porque incluso con palancas de control de formas funcionales,
algunos pilotos, en algunos casos, todavía las confundían. Al mismo tiempo, los
pilotos no siempre confundían los switches idénticos. Similarmente, no todas las
tripulaciones compuestas por dos capitanes fallan en armar los spoilers antes del
aterrizaje. El error humano, en otras palabras, está suspendido, inestable, en
algún lugar entre las interfaces humanas y mecánicas. El error no es
completamente humano, ni completamente mecánico. Al mismo tiempo, las “fallas”
mecánicas (proveer switches idénticos ubicados próximos uno del otro) tienen
que expresarse a sí mismas en la acción humana. Así que, si ocurre una
confusión entre flaps y tren, ¿cuál es entonces la causa? ¿Error humano o falla
mecánica? Se necesitan ambos para el éxito; se necesitan ambos para la falla.
Dónde termina uno y comienza el otro ya no está claro. Una idea del trabajo
temprano en factores humanos era que el componente mecánico y la acción
humana están interconectados en formas que resisten el desenredo dualista,
deconstruido y eficiente, preferido aún hoy por los investigadores (y sus
consumidores).
DUALISMO Y REVOLUCIÓN CIENTÍFICA.
La elección entre causa humana y causa material no es un simple producto de
la investigación de accidentes o la ingeniería en factores humanos recientes.
La elección se encuentra firmemente arraigada a la visión mundial
Newtoniana-Cartesiana que gobierna mucho de nuestro pensamiento hoy en
día, particularmente en profesiones dominadas por la tecnología como la
ingeniería de factores humanos y la investigación de accidentes.
Isaac Newton y René Descartes fueron dos de las figuras cumbre de la
Revolución Científica, entre 1500 y 1700 d.C., la cual produjo un cambio
dramático en la visión mundial, así como cambios profundos en el
conocimiento y en las ideas sobre cómo adquirirlo y ponerlo a prueba.
Descartes propuso una aguda distinción entre lo que llamó res cogitans, el
dominio de la mente, y res extensa, el dominio de la materia. Aunque
Descartes admitió alguna interacción entre los dos, insistió en que los fenómenos
mentales y físicos no pueden ser entendidos haciendo referencia el uno al otro. Los
problemas que ocurren en uno u otro dominio requieren enfoques
completamente separados y conceptos diferentes para resolverlos. La noción
de mundos mental y material separados llegó a conocerse como
dualismo, y sus implicancias pueden reconocerse en mucho de lo que
pensamos y hacemos hoy en día. De acuerdo con Descartes, la mente está fuera
del orden físico de la materia y en ninguna forma deriva de él. La elección
entre error humano y falla mecánica es una elección dualista de ese tipo: de acuerdo
con la lógica cartesiana, el error humano no puede derivar de las cosas materiales.
Como veremos, esta lógica no se sostiene bien; de hecho, en una inspección
más cercana, todo el campo de factores humanos se funda en la refutación de esta
afirmación.
Separar el cuerpo del alma, y subordinar el cuerpo al alma, no sólo mantuvo a
Descartes fuera de problemas con la Iglesia. Su dualismo, su división entre
mente y materia, también abordó un importante problema filosófico que tenía el
potencial de frenar el progreso científico, tecnológico y social: ¿Cuál es el
nexo entre mente y materia, entre el alma y el mundo material? ¿Cómo
podríamos, como humanos, tomar el control y rehacer nuestro mundo físico si
éste estuviera aleado indivisiblemente con un alma irreductible y eterna, o incluso
fuera sinónimo de ella? Una de las mayores aspiraciones durante la
Revolución Científica de los siglos XVI y XVII fue ver y comprender (y llegar
a tener la capacidad de manipular) el mundo material como una máquina
controlable, predictible, programable. Esto exigía que fuera visto como nada
más que una máquina: sin vida, sin espíritu, sin alma, sin eternidad, sin
inmaterialidad, sin impredictibilidad. La res extensa de Descartes, o mundo
material, respondía justamente a esa inquietud. La res extensa fue descrita como
algo que funciona como una máquina, que sigue reglas mecánicas y que admite
explicaciones en términos del arreglo y el movimiento de sus partes constituyentes.
El progreso científico se volvió más fácil a causa de lo que quedó excluido. Lo que la
Revolución Científica requería fue provisto por la división de Descartes. La naturaleza
se volvió una máquina perfecta, gobernada por leyes matemáticas que quedaban
cada vez más al alcance de la comprensión y el control humanos, y cada vez más
lejos de las cosas que los seres humanos no podían controlar.
Newton, por supuesto, es el padre de muchas de las leyes que aún gobiernan
nuestro entendimiento del universo hoy en día. Su tercera ley del movimiento, por
ejemplo, descansa en la base de nuestras presunciones sobre causa y
efecto, y sobre las causas de los accidentes: para cada acción existe una reacción
igual y opuesta. En otras palabras, para cada causa existe un efecto equivalente o,
más bien, para cada efecto tiene que haber una causa equivalente. Una ley
como esa, si bien es aplicable a la liberación y transferencia de energía en
sistemas mecánicos, resulta mal dirigida al ser aplicada a fallas
sociotécnicas, en las que las pequeñas banalidades y sutilezas del trabajo normal,
hecho por gente normal en organizaciones normales, pueden degenerar
lentamente en desastres enormes, en liberaciones de energía
desproporcionadamente altas. La equivalencia causa-consecuencia dictada
por la tercera ley del movimiento de Newton es bastante inapropiada como
modelo de los accidentes organizacionales.
Adquirir control sobre el mundo material fue de crítica importancia para las
personas de hace quinientos años. El terreno fértil para las ideas de Descartes y
Newton puede entenderse a la luz del contexto de su época. Europa estaba
emergiendo de la Edad Media, tiempos de temor y fe, en que las vidas eran
segadas tempranamente por guerras, enfermedades y epidemias. No deberíamos
subestimar la ansiedad y la aprensión respecto de la capacidad humana para
oponer sus esfuerzos a fuerzas de esa magnitud. Luego de la Peste, a la
población de Inglaterra, tierra natal de Newton, por ejemplo, le tomó hasta 1650
recuperar el nivel que tenía en 1300. La gente estaba a merced de fuerzas apenas
controlables y comprendidas, como las enfermedades. En el milenio precedente, la
piedad, la oración y la penitencia estaban entre los principales mecanismos
mediante los cuales la gente podía alcanzar alguna clase de dominio sobre el mal
y el desastre.
El creciente conocimiento producido por la Revolución Científica comenzó
lentamente a ofrecer una alternativa, con éxito empíricamente mensurable. La
Revolución Científica entregó nuevos medios para controlar el mundo natural.
Los telescopios y microscopios le dieron a la gente nuevas formas de estudiar
componentes que hasta entonces habían sido demasiado pequeños o habían
estado demasiado distantes para ser vistos a simple vista, abriendo de pronto una
visión del universo completamente nueva y revelando, por primera vez, causas
de fenómenos hasta entonces mal comprendidos. La naturaleza ya no era
un monolito atemorizante e inexpugnable, y las personas dejaron de estar
sólo en el extremo receptor de sus caprichos victimizadores. Al estudiarla de
nuevas formas, con nuevos instrumentos, la naturaleza podía ser descompuesta,
partida en trozos más pequeños, medida y, a través de todo eso, comprendida
mejor y eventualmente controlada.
Los avances en las matemáticas (geometría, álgebra, cálculo) generaron
modelos capaces de dar cuenta de fenómenos recientemente descubiertos, y de
predecirlos, en campos como la medicina y la astronomía. Al descubrir algunos de
los cimientos del universo y de la vida, y al desarrollar matemáticas que imitan su
funcionamiento, la Revolución Científica reintrodujo un sentido de predictibilidad
y control que había permanecido dormido durante la Edad Media. Los seres
humanos podían alcanzar el dominio y la preeminencia sobre las vicisitudes e
imprevisibilidades de la naturaleza. La ruta hacia tal progreso vendría de
medir, desarmar (conocido hoy, variadamente, como reducir, descomponer o
deconstruir) y modelar matemáticamente el mundo a nuestro alrededor, para
seguidamente reconstruirlo en nuestros propios términos.
La mesurabilidad y el control son temas que animaron a la Revolución
Científica, y resuenan fuertemente hoy en día. Incluso las nociones de
dualismo (los mundos material y mental se encuentran separados) y la
deconstrucción (los “todos” pueden ser explicados por el arreglo y la interacción
de sus partes constituyentes a bajo nivel) han sobrevivido largamente a sus
iniciadores. La influencia de Descartes se juzga tan grande, en parte, debido a
que escribió en su lengua materna, más que en latín, lo que presumiblemente
amplió el acceso y la exposición popular a sus pensamientos. La
mecanización de la naturaleza engendrada por su dualismo, junto con los enormes
avances matemáticos de Newton y otros, condujo a siglos de progreso científico
sin precedentes, de crecimiento económico y de éxito de la ingeniería. Como señalara
Fritjof Capra (1982), la NASA no habría podido poner un hombre en la Luna sin
René Descartes.
La herencia, sin embargo, es definitivamente una bendición a medias. Los
factores humanos y la seguridad de sistemas están atascados con un
lenguaje, con metáforas e imágenes, que enfatizan estructura, componentes,
mecánica, partes e interacciones, causa y efecto. Si bien estos nos dan una
dirección inicial para construir sistemas seguros y para descifrar qué salió
mal cuando resulta que no lo eran, hay límites a la utilidad de
este vocabulario heredado. Regresemos a ese día de verano de 1999 y a la
salida de pista del MD-80. En buena tradición newtoniano-cartesiana,
podemos comenzar abriendo el avión un poco más, separando los diversos
componentes y procedimientos para ver cómo interactúan, segundo a segundo.
Inicialmente nos sorprenderá un éxito empíricamente resonante, como
de hecho les ocurrió a menudo a Descartes y Newton. Pero cuando queremos
recrear el todo a partir de las partes que encontramos, salta a la vista una realidad
más problemática: ya no todo calza. La exacta, matemáticamente
placentera separación entre causa humana y mecánica, entre episodios
sociales y estructurales, se ha derrumbado. El todo ya no se ve como una
función lineal de la suma de sus partes. Como explicara Scott Snook (2000),
los dos pasos clásicos occidentales de reducción analítica (el todo en partes) y
síntesis inductiva (las partes de vuelta en el todo) parecen
funcionar, pero el simple hecho de juntar las partes que encontramos no captura la
rica complejidad oculta dentro y alrededor del incidente. Lo que se necesita es
una integración orgánica, holística. Lo que tal vez se necesita es una nueva
forma de análisis y síntesis, sensible a la situación total de la actividad
sociotécnica organizada. Pero examinemos primero la historia analítica,
componencial.
SPOILERS, PROCEDIMIENTOS Y SISTEMAS HIDRÁULICOS
Los spoilers son esas placas que se levantan hacia el flujo de aire en la parte superior
de las alas, luego de que la aeronave ha tocado tierra. No sólo contribuyen a
frenar la aeronave al obstruir la corriente de aire, sino que además hacen que
el ala pierda la capacidad de crear sustentación, forzando el peso de la
aeronave sobre las ruedas. La extensión de los ground spoilers acciona además el
sistema de frenado automático de las ruedas. Mientras más peso llevan las
ruedas, más efectivo se vuelve su frenado. Antes de aterrizar, los pilotos
seleccionan el ajuste que desean en el sistema de frenado de ruedas
automático (mínimo, medio o máximo), dependiendo del largo y las condiciones de
la pista. Luego del aterrizaje, el sistema automático de frenado de ruedas
disminuirá la velocidad de la aeronave sin que el piloto tenga que hacer nada, y
sin dejar que las ruedas deslicen o pierdan tracción. Como tercer mecanismo
para disminuir la velocidad, la mayoría de los aviones jet tiene reversores de
empuje, que direccionan el flujo saliente de los motores en contra de la
corriente de aire, en vez de hacerlo salir hacia atrás.
En este caso, no salieron los spoilers y, como consecuencia, no se accionó el
sistema de frenado automático de ruedas. Al correr por la pista, los pilotos
verificaron el ajuste del sistema de frenado automático en múltiples
oportunidades para asegurarse de que se encontraba armado, e incluso
cambiaron su ajuste a máximo al ver acercarse el final de la pista. Pero nunca
llegó a engancharse. El único mecanismo remanente para disminuir la velocidad de la
aeronave era el empuje reverso. Los reversores, sin embargo, son más
efectivos a altas velocidades. Para el momento en que los pilotos se percataron
de que no iban a lograr detenerse antes del final de la pista, la velocidad era ya bastante
baja (terminaron saliendo al campo a 10-20 nudos) y los reversores ya no
tenían un efecto inmediato. A medida que el jet salía por el borde de
la pista, el capitán cerró los reversores y desplazó la aeronave algo a la
derecha para evitar obstáculos.
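Como referencia, un pequeño bosquejo en Python (hipotético y muy simplificado; la función y sus nombres son inventados) resume la dependencia que el texto acaba de describir: el frenado automático de ruedas sólo se dispara si los ground spoilers se extienden, de modo que, sin spoilers, la aeronave queda dependiendo únicamente de los reversores de empuje.

```python
def mecanismos_de_frenado_activos(spoilers_extendidos: bool,
                                  autobrake_armado: bool,
                                  reversores_desplegados: bool) -> list[str]:
    """Devuelve los mecanismos de desaceleración disponibles, según la
    dependencia descrita en el texto: el frenado automático de ruedas
    se dispara con la extensión de los ground spoilers."""
    activos = []
    if spoilers_extendidos:
        activos.append("ground spoilers")
        if autobrake_armado:
            activos.append("frenado automático de ruedas")
    if reversores_desplegados:
        activos.append("reversores de empuje")
    return activos

# Situación del caso: spoilers sin extender, autobrake armado pero nunca disparado.
print(mecanismos_de_frenado_activos(False, True, True))  # ['reversores de empuje']
```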
¿Cómo se arman los spoilers? En el pedestal central, entre los dos pilotos, hay
una cantidad de palancas. Algunas son para los motores y los reversores de
empuje, una es para los flaps y una para los spoilers. Para armar los ground
spoilers, uno de los pilotos debe levantar esa palanca. La palanca sube
aproximadamente una pulgada y permanece allí, armada, hasta el toque a
tierra. Cuando el sistema detecta que la aeronave está en tierra (lo que hace en
parte mediante switches en el tren de aterrizaje), la palanca regresa
automáticamente y los spoilers salen. Asaf Degani, quien ha estudiado
extensamente este tipo de problemas procedimentales, ha descrito el episodio de los
spoilers no como uno de error humano, sino como uno de sincronización (timing)
(por ejemplo, Degani, Heymann & Shafto, 1999). En esta aeronave, como en
muchas otras, los spoilers no deben ser armados antes de que se haya
seleccionado el tren de aterrizaje abajo y éste se encuentre completamente en
posición. Esto tiene que ver con los switches que pueden indicar cuándo la
aeronave se encuentra en tierra. Estos son switches que se comprimen a
medida que el peso de la aeronave se asienta en las ruedas, pero no sólo en
esas circunstancias. Existe el riesgo, en este tipo de aeronave, de que el switch del
tren de nariz se comprima incluso mientras el tren de aterrizaje está saliendo
de su alojamiento. Ello puede ocurrir debido a que el tren de nariz se
despliega en contra de la corriente de aire. Mientras el tren de aterrizaje
está saliendo y la aeronave se desliza en el aire a 180 nudos, la pura fuerza del
viento puede comprimir el tren de nariz, activar el switch y, con ello,
arriesgar la extensión de los ground spoilers (si se encontrasen armados). No es
una buena idea: la aeronave tendría problemas para volar con los ground
spoilers extendidos. De ahí el requerimiento: el tren de aterrizaje necesita haber
completado todo su recorrido hacia fuera y estar abajo y asegurado. Sólo cuando ya
no existe riesgo de compresión aerodinámica del switch pueden armarse los
spoilers. Este es el orden de los procedimientos before-landing:
Gear down and locked.
Spoilers armed.
Flaps FULL.
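Las condiciones recién descritas pueden resumirse en un pequeño bosquejo en Python (hipotético; no corresponde a la lógica real del MD-80 ni a código del libro): el armado sólo procede con el tren abajo y asegurado, y la extensión automática requiere spoilers armados más el sensado de “en tierra” vía la compresión de un switch del tren.

```python
from dataclasses import dataclass

@dataclass
class EstadoAeronave:
    tren_abajo_y_asegurado: bool        # "gear down and locked"
    spoilers_armados: bool
    switch_tren_nariz_comprimido: bool  # puede activarse por carga aerodinámica

def puede_armar_spoilers(e: EstadoAeronave) -> bool:
    # Restricción procedimental: armar sólo con el tren abajo y asegurado,
    # porque antes de eso el switch de nariz podría comprimirse en el aire.
    return e.tren_abajo_y_asegurado

def spoilers_se_extienden(e: EstadoAeronave) -> bool:
    # Lógica simplificada de extensión automática: armados + sensado "en tierra".
    return e.spoilers_armados and e.switch_tren_nariz_comprimido

# Escenario de riesgo que motiva el orden del procedimiento: tren aún saliendo,
# viento de 180 nudos comprime el switch de nariz; si los spoilers ya estuvieran
# armados, se extenderían en pleno vuelo.
en_vuelo = EstadoAeronave(tren_abajo_y_asegurado=False,
                          spoilers_armados=True,
                          switch_tren_nariz_comprimido=True)
print(spoilers_se_extienden(en_vuelo))  # True: por eso se arma sólo después del tren
```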
En una aproximación típica, los pilotos seleccionan abajo la palanca del tren
de aterrizaje cuando el llamado glide slope se encuentra vivo: cuando la
aeronave ha entrado en el rango de la señal electrónica que la guiará en el
descenso hacia la pista. Una vez que el tren de aterrizaje se encuentra abajo, los
spoilers deben ser armados. Entonces, una vez que la aeronave captura ese
glide slope (es decir, está exactamente sobre la marcación electrónica) y
comienza a descender en la aproximación a la pista, los flaps necesitan ser
ajustados a FULL (típicamente 40º). Los flaps son otros dispositivos que se
extienden desde el ala, cambiando su forma y tamaño. Permiten a la
aeronave volar más lento para el aterrizaje. Esto condiciona los
procedimientos al contexto. Ahora se ven así:
Gear down and locked (cuando el glide slope esté vivo).
Spoilers armed (cuando el tren esté abajo y asegurado).
Flaps FULL (cuando el glide slope esté capturado).
¿Pero cuánto toma pasar desde “glide slope vivo” a “glide slope capturado”? En
una aproximación típica (dada la velocidad), esto toma alrededor de 15
segundos.
En un simulador, donde tiene lugar el entrenamiento, esto no crea problemas. El
ciclo completo (desde la palanca del tren abajo hasta la indicación “gear down
and locked” en la cabina) toma alrededor de 10 segundos. Eso deja 5
segundos para armar los spoilers antes de que la tripulación necesite seleccionar
flaps FULL (el ítem siguiente en los procedimientos). En el simulador, entonces,
las cosas se ven así:
En t = 0 Gear down and locked (cuando el glide slope esté vivo).
En t + 10 Spoilers armed (cuando el tren esté abajo y asegurado).
En t + 15 Flaps FULL (cuando el glide slope esté capturado).
Pero en una aeronave real, el sistema hidráulico (que, entre otras cosas,
extiende el tren de aterrizaje) no es tan efectivo como en un simulador. El
simulador, desde luego, sólo simula los sistemas hidráulicos de la aeronave,
modelados según cómo se comporta la aeronave con cero horas de vuelo,
cuando está reluciente, recién salida de fábrica. En una aeronave más vieja,
puede tomar hasta medio minuto que el tren complete su ciclo y quede asegurado.
Ello hace que los procedimientos se vean algo así:
En t = 0 Gear down and locked (cuando el glide slope esté vivo).
En t + 30 Spoilers armed (cuando el tren esté abajo y asegurado).
¡Pero! en t + 15 Flaps FULL (cuando el glide slope esté capturado).
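El desfase anterior puede hacerse explícito con un pequeño bosquejo en Python (hipotético; sólo usa los tiempos citados en el texto): si el ciclo del tren toma más que los 15 segundos que separan “glide slope vivo” de “glide slope capturado”, el ítem de los flaps llega antes de que el ítem de los spoilers pueda completarse.

```python
def orden_de_items(ciclo_tren_s: float, vivo_a_capturado_s: float = 15.0):
    """Ordena los ítems del procedimiento según el momento en que quedan
    disponibles: el tren se selecciona en t = 0, los spoilers sólo pueden
    armarse cuando el tren completa su ciclo, y los flaps se piden al
    capturar el glide slope."""
    eventos = [
        ("Gear down and locked", 0.0),
        ("Spoilers armed", ciclo_tren_s),
        ("Flaps FULL", vivo_a_capturado_s),
    ]
    return sorted(eventos, key=lambda e: e[1])

print(orden_de_items(10.0))  # simulador: tren, spoilers (t+10), flaps (t+15)
print(orden_de_items(30.0))  # aeronave desgastada: tren, flaps (t+15), spoilers (t+30)
```

En el segundo caso el ítem “flaps” aparece primero y, una vez atendido, el armado de los spoilers queda fuera de la secuencia practicada, tal como se describe a continuación.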
En efecto, entonces, el ítem “flaps” de los procedimientos se presenta antes que el
ítem “spoilers”. Una vez que el ítem “flaps” está completo y la aeronave
desciende hacia la pista, es fácil continuar con los procedimientos desde allí,
con los ítems siguientes. Los spoilers nunca se arman. Su armado ha caído
por las grietas de un desfase de tiempo. Una declaración exclusiva de
error humano (o de quiebre de CRM) se vuelve más difícil de sostener frente a
este trasfondo. ¿Qué tanto error humano hubo, en verdad? Permanezcamos
dualistas por ahora y revisitemos la Ecuación 1. Ahora apliquemos una
definición más liberal de falla mecánica. El tren de nariz de la aeronave real,
equipado con un switch de compresión, está diseñado de forma tal que se
despliega en contra del viento durante el vuelo. Esto introduce una
vulnerabilidad mecánica sistemática que sólo se mantiene a raya mediante la
secuencia procedimental (una barrera contra la falla notoriamente porosa):
primero el tren, luego los spoilers. En otras palabras, “gear down and locked”
es un prerrequisito mecánico para el armado de los spoilers, pero el ciclo
completo del tren puede tomar más tiempo del previsto en los procedimientos y
en la secuencia de eventos que gobierna su aplicación. El sistema hidráulico de los
jets viejos ya no presuriza tan bien: puede tomar hasta 30 segundos que un tren
de aterrizaje complete su ciclo hacia fuera. El simulador de vuelo, en contraste,
realiza el mismo trabajo en unos 10 segundos, dejando una sutil pero
sustantiva incongruencia: una secuencia de trabajo se introduce y se practica
durante el entrenamiento, mientras que una sutilmente diferente es necesaria para las
operaciones reales. Más aún, esta aeronave tiene un sistema que advierte si
los spoilers no están armados en el despegue, pero no tiene un sistema para
advertir que los spoilers no están armados en la aproximación. Y luego
está la disposición mecánica del cockpit: la palanca con el spoiler armado se
distingue de la palanca sin armar sólo por una pulgada de recorrido y un pequeño
cuadrado rojo en su base. Desde la posición del piloto en el asiento derecho
(quien necesita confirmar el armado), este parche rojo queda oculto detrás de
las palancas de potencia mientras éstas se encuentran en la posición típica de
aproximación. Con tanta contribución mecánica alrededor (diseño del tren de
aterrizaje, sistema hidráulico desgastado, diferencias entre el simulador y la
aeronave real, distribución de las palancas del cockpit, falta de un sistema de
advertencia de spoilers durante la aproximación, la secuencia de los
procedimientos) y una contribución estocástica de la programación (dos capitanes
en este vuelo), una falla mecánica de mucho mayor magnitud podría
ingresarse en la ecuación para rebalancear la contribución humana.
Pero eso todavía es dualista. Al reensamblar las partes que encontramos
(procedimientos, tiempos, erosión mecánica, trade-offs de diseño), podemos
comenzar a preguntarnos dónde terminan realmente las contribuciones mecánicas
y dónde comienzan las contribuciones humanas. La frontera ya no está tan
clara. La carga impuesta por un viento de 180 nudos en la rueda de nariz se
transfiere a un débil procedimiento: primero el tren, luego los spoilers. La rueda
de nariz, desplegándose al viento y equipada con un switch de compresión, es
incapaz de acarrear esa carga y garantizar que los spoilers no se extenderán,
por lo que en su lugar, un procedimiento tiene que llevar la carga. La palanca
del spoiler está ubicada en una forma que hace difícil su verificación, y un
sistema de advertencia para spoilers no armados no se encuentra instalado.
Nuevamente, el error está suspendido, inestable, entre la intención humana y el
hardware de ingeniería – pertenece a ambos y a ninguno únicamente. Y
entonces está esto: El desgaste gradual de un sistema hidráulico no es algo
que haya sido tomado en cuenta durante la certificación del jet. Un MD-80 con
un sistema hidráulico anémico que toma más de medio minuto para llevar todo
el tren fuera, abajo y asegurado, violando el requerimiento de diseño original
por un factor de tres, aún se considera aeronavegable. El sistema hidráulico
desgastado no puede ser considerado una falla mecánica. No deja al jet en
tierra. Ni tampoco lo hace la palanca del spoiler de difícil verificación, ni la falta
de un sistema de advertencia durante la aproximación. El jet fue certificado
como aeronavegable con o sin todo ello. Que no haya falla mecánica, en otras
palabras, no es porque no existan asuntos mecánicos. No existe falla mecánica
porque los sistemas sociales, compuestos por fabricantes, reguladores y
operadores prospectivos (indudablemente movidos por preocupaciones
prácticas y expresados a través de juicios de ingeniería situados, cargados de
incertidumbre sobre el desgaste futuro), decidieron que ahí no podía haber ninguna
(al menos ninguna relacionada con los asuntos ahora identificados en la salida de
pista de un MD-80). ¿Dónde termina la falla mecánica y comienza el error humano?
Con excavar apenas lo suficientemente profundo, la pregunta se vuelve imposible de
responder.

  • 2. TEN QUESTIONSABOUT HUMAN ERROR: A new view of human factors and system safety Contenidos Reconocimientos Prefacio Introducción de la serie Nota del autor 1 ¿Fue Falla Mecánica o Error Humano? 2 ¿Por qué Fallan los Sistemas Seguros? 3 ¿Por qué son más Peligrosos los Doctores que los Propietarios de Armas? 4 ¿No Existen los Errores? 5 Si Ud. Pierde la Conciencia Situacional, ¿Qué la Reemplaza? 6 ¿Por qué los Operadores se vuelven Complacientes? 7 ¿Por qué no siguen ellos los Procedimientos? 8 ¿Podemos Automatizar los Errores Humanos Fuera del Sistema? 9 ¿Va a ser Seguro el Sistema? 10 ¿Debemos Hacer a la Gente Responsable por sus Errores? Referencias Índice del Autor Índice por Objetivo Reconocimientos. Tal como los errores, las ideas vienen de algún lado. Las ideas en este libro fueron desarrolladas en un período de años en que las discusiones con las siguientes personas fueron particularmente constructivas: David Woods, Erik Hollnagel, Nancy Leveson, James Nyce, John Flach, Gary Klein, Diane Vaughan, y Charles Billings. Jens Rasmussen ha estado siempre delante en el juego en ciertas formas: Algunas de las preguntas sobre el error humano ya fueron tomadas por él en décadas pasadas. Erik Hollnagel fue instrumental en contribuir a moldear las ideas en el capítulo 6, y Jim Nyce ha tenido una influencia significativa en el capítulo 9. Quiero agradecer también a mis estudiantes, particularmente a Arthur Dijkstra y Margareta Lutzhoft, por sus comentarios en borradores previos y sus útiles sugerencias. Margareta merece especial gratitud por su ayuda en decodificar el caso estudiado en el capítulo 5, y Arthur, por su habilidad para señalar ―Ansiedad Cartesiana‖, donde yo no la reconocí. Agradecimiento especial al editor de series Barry Kantowitz y al editor Bill Webber, por su confianza en el proyecto. El trabajo para este libro fue apoyado por una subvención del Swedish Flight Safety Directorate. Prefacio Los factores humanos en el transporte siempre han sido relacionados con el error humano. De hecho, como un campo de investigación científica, debe su inclusión a investigaciones de error de piloto y a la subsiguiente insatisfacción de los investigadores con la etiqueta. En 1947, Paul Fitts y Richard Jones, construyendo sobre trabajo pionero por Alphonse Chapanis, demostraron como características de las cabinas de aviones de la II Guerra Mundial influenciaban sistemáticamente la forma en que pilotos cometían errores. Por ejemplo, pilotos confundían las palancas de flaps y tren de aterrizaje porque éstas a menudo se veían y se sentían igual, y estaban ubicadas próximas una de otra (switches idénticos, o palancas muy similares). En el incidente típico, un piloto podía levantar el tren de aterrizaje en vez de los flaps, luego de un aterrizaje, con las previsibles consecuencias para hélices, motores y estructura. Como un inmediato arreglo de guerra, una rueda de goma fue adherida al control del tren de aterrizaje y un pequeño terminal con forma de cuña, al control del flan. Esto básicamente solucionó el problema y el arreglo de diseño eventualmente llegó a ser un requerimiento de certificación. Los pilotos podían también mezclar los controles de acelerador, mezcla y hélice, ya que sus ubicaciones cambiaban en diferentes cabinas. Tales errores no fueron sorprendentes, degradaciones aleatorias de desempeño humano. 
De preferencia, ellos fueron acciones y cálculos que tuvieron sentido una vez que los investigadores comprendieron las características del mundo en que las personas trabajaban, una vez que ellos hubieron analizado la situación que rodeaba al operador.
  • 3. Los errores humanos están sistemáticamente conectados a características de las herramientas y tareas de las personas. Puede ser difícil predecir cuándo o qué tan a menudo ocurrirán los errores (a pesar que las técnicas de fiabilidad humana ciertamente han intentado). Con una examinación crítica del sistema en que las personas trabajan, sin embargo, no es tan difícil anticipar dónde ocurrirán los errores. Factores Humanos ha utilizado esta premisa desde siempre: La noción de diseñar sistemas resistentes y tolerantes al error se basa en ello. Factores Humanos fue precedido por una era de hielo mental de comportamientismo, en que cualquier estudio de la mente era visto como ilegítimo y no científico. El comportamientismo en sí ha sido una psicología de protesta, acuñada en agudo contraste entre la introspección experimental de Wundt que la precedió. Si el comportamientismo fue una psicología de protesta, entonces factores humanos fue una psicología pragmática. La Segunda Guerra Mundial trajo tal furioso paso de desarrollo tecnológico que el comportamientismo fue encontrado manos abajo. Surgieron problemas prácticos en la vigilancia y toma de decisiones del operador que fueron totalmente inmunes al repertorio de exhortaciones motivacionales del comportamiento de Watson. Hasta ese punto, la psicología había asumido ampliamente que el mundo estaba arreglado, y que los humanos tenían que adaptarse a sus demandas a través de la selección y el entrenamiento. Factores Humanos mostró que el mundo no estaba arreglado: Cambios en el ambiente podrían fácilmente llevar a incrementos en el desempeño no alcanzables mediante intervenciones comportamientistas. En el comportamientismo, el rendimiento tenía que ser adaptado luego de las características del mundo. En factores humanos, características del mundo fueron adaptadas luego de los límites y capacidades del desempeño. Como una psicología de lo pragmático, factores humanos adoptó la visión de ciencia y método científico Cartesiano-Newtoniano (tal como Wundt y Watson habían hecho). Descartes y Newton fueron jugadores dominantes en la revolución científica del siglo XVII. Esta transformación total en el pensamiento instaló una creencia en la absoluta certeza del conocimiento científico, especialmente en la cultura occidental. El ánimo de la ciencia fue de alcanzar el control al derivar leyes de la naturaleza generales e idealmente matemáticas (tal como nosotros intentamos hacer para el desempeño de la persona y el sistema). Una herencia de esto puede ser vista todavía en factores humanos, particularmente en la predominación de experimentos, lo nomotético más que la inclinación ideográfica de su investigación y una fuerte fe en el realismo de los hechos observados. También puede ser reconocida en las estrategias reductivas con que se relacionan los factores humanos y la seguridad operacional de sistemas para lidiar con la complejidad. La solución de problemas Cartesiano-Newtoniana es analítica. Consiste en romper con los pensamientos y problemas en piezas y en arreglarlas en algún orden lógico. El fenómeno necesita ser descompuesto en partes más básicas y su totalidad puede ser explicada exhaustivamente haciendo referencia a sus componentes constituyentes y sus interacciones. 
En factores humanos y seguridad operacional de sistemas, se entiende mente como una construcción tipo caja, con un intercambio mecánico en representaciones internas; trabajo está separado en pasos procedimentales a través de análisis de tareas jerárquicos; las organizaciones no son orgánicas o dinámicas, sino que están constituidas por estratos estáticos y compartimientos y lazos; y seguridad operacional es una propiedad estructural que puede ser entendida en términos de sus mecanismos de orden más bajo (sistemas de reporte, tasas de error y auditorías, la función de la administración de seguridad operacional en el diagrama organizacional, y sistemas de calidad). Estas visiones están con nosotros hoy. Dominan el pensamiento en factores humanos y seguridad operacional de sistemas. El problema es que extensiones lineares de estas mismas nociones no pueden trasladarnos dentro del futuro. Las una vez pragmáticas ideas de factores humanos y seguridad de sistemas están cayendo detrás de los problemas prácticos que han comenzado a surgir en el mundo de hoy. Podríamos estar dentro para una repetición de los cambios que vinieron con los desarrollos tecnológicos de la II Guerra Mundial, donde el comportamientismo mostró quedar corto. Esta vez podría ser el caso de factores humanos y seguridad operacional de sistemas. Los desarrollos contemporáneos, sin embargo, no son solo técnicos. Hay sociotécnicos: La comprensión sobre qué hace a los sistemas seguros o frágiles requiere más que conocimiento sobre la interfase hombre-máquina. Como David Meister señaló recientemente (y el ha estado cerca por un tiempo), factores humanos no ha progresado mucho desde 1950. ―Hemos tenido 50 años de investigación‖, él se pregunta retóricamente, ―¿pero cuanto más de lo que sabíamos en un principio, sabemos?‖ (Meister 2003, p. 5). No es que las propuestas tomadas por factores humanos y seguridad operacional de sistemas ya no sean útiles, sino que su utilidad sólo puede ser apreciada realmente cuando vemos sus límites. Este libro no es sino un capítulo en una transformación más larga que ha comenzado a identificar las profundamente enraizadas restricciones y los nuevos puntos de influencia en nuestras visiones de factores humanos y seguridad operacional de sistemas.
  • 4. Las 10 preguntas acerca del error humano no son solo preguntas sobre el error humano como un fenómeno, si es que lo son (y si el error humano es algo en y por sí mismo, en primer lugar). En realidad son preguntas acerca de factores humanos y seguridad operacional de sistemas como disciplinas, y en qué lugar se encuentran hoy. En formular estas preguntas acerca del error, y en trazar las respuestas a ellas, este libro intenta mostrar dónde nuestro pensamiento corriente está limitado; dónde nuestro vocabulario, nuestros modelos y nuestras ideas están limitando el progreso. En cada capítulo, el libro intenta entregar indicaciones para nuevas ideas y modelos que tal vez se las puedan arreglar mejor con la complejidad de los problemas que nos encaran ahora. Uno de esos problemas es que sistemas aparentemente seguros pueden desviarse y fallar. Desviarse en dirección a los márgenes de seguridad operacional ocurre bajo presiones de escasez y competencia. Está relacionado con la opacidad de sistemas socio técnicos grandes y complejos, y los patrones de información en que los integrantes basan sus decisiones y tratos. Derivaren fallas está asociado con los procesos organizacionales normales de adaptación. Las fallas organizacionales en sistemas seguros no están precedidas por fallas; lo están por el quiebre o carencia de calidad de componentes aislados. De hecho, la falla organizacional en sistemas seguros está precedida por trabajo normal, por personas normales haciendo trabajo normal en organizaciones aparentemente normales. Esto aparenta competir severamente con la definición de un incidente, y puede minar el valor de reportar incidentes como una herramienta para aprender más allá de un cierto nivel de seguridad operacional. El margen entre el trabajo normal y el incidente es claramente elástico y sujeto a revisión incremental. Con cada pequeño paso fuera de las normas previas, el éxito pasado puede ser tomado como una garantía de seguridad operacional futura. El incrementalismo mella el sistema completo crece de la línea de derrumbe, pero sin indicaciones empíricas poderosas de que está encaminado de esa forma. Modelos corrientes de factores humanos y seguridad operacional de sistemas no pueden lidiar con la derivación hacia fallas. Ellos requieren fallas como un prerrequisito para las fallas. Ellos aún están orientados hacia el encuentro de fallas (por ejemplo, errores humanos, hoyos en las capas de defensa, problemas latentes, deficiencias organizacionales y patógenos residentes), y se relacionan con niveles de trabajo y estructura dictados externamente, por sobre tomar cuentas internas (sobre qué es una falla vs. Trabajo normal) como cánones. Procesos de toma de sentido, de la creación de racionalidad local por quienes de verdad realizan los miles de pequeños y mayores tratados que transportan un sistema a lo largo de su curso de deriva, yacen fuera del léxico actual de factores humanos. Los modelos corrientes típicamente ven a las organizaciones como máquinas Newtonianas-cartesianas con componentes y nexos entre ellas. Los contratiempos son modelados como una secuencia de eventos (acciones y reacciones) entre un disparador y un resultado. Tales modelos no pueden pronunciarse acerca de la construcción de fallas latentes, ni sobre la gradual, incremental soltura o pérdida de control. 
Los procesos de erosión de las restricciones, de detrimentos de la seguridad operacional, de desviación hacia los márgenes no pueden ser capturados porque los enfoques estructurales son metáforas estáticas para formas resultantes, no modelos dinámicos orientados hacia procesos de formación. Newton y Descartes, con sus particulares estudios en ciencias naturales, tienen una firme atadura en factores humanos, seguridad operacional de sistemas y también en otras áreas. El paradigma de procesamiento de información, por ejemplo, tan útil para explicar tempranamente los problemas de transferencia de información entre radar y radio operadores en la II Guerra Mundial, solo ha colonizado la investigación de factores humanos. Aún es una fuerza dominante, reforzado por los experimentos del Spartan laboratory, que parecen confirmar su utilidad y validez. El paradigma tiene mente mecanizada, partida en componentes separados (por ejemplo, memoria de trabajo, memoria de corto plazo y memoria de largo plazo) con nexos entre medio. Newton habría amado su mecánica. A Descartes también le habría gustado: Una separación clara entre mente y mundo solucionaba (o circunvalada, más bien) una serie de problemas asociados con las transacciones entre ambos. Un modelo mecánico tal como procesamiento de información, claro que mantiene apego especial por la ingeniería y otros consumidores de los resultados de la investigación de factores humanos. Dictado pragmático salvando las diferencias entre práctica y ciencia, y el tener un modelo cognitivo similar a un aparato técnico familiar para gente aplicada, es una forma poderosa de hacer sólo eso. Pero no existe razón empírica para restringir nuestra comprensión de actitudes, memorias o heurísticos, como disposiciones codificadas mentalmente, como ciertos contenidos de conciencia con determinadas fechas de vencimiento. De hecho, tal modelo restringe severamente nuestra habilidad para comprender cómo las personas utilizan el habla y la acción para construir un orden perceptual y social; cómo, a través del discurso y la acción, las personas crean los ambientes que, a cambio, determinan la acción posterior y asesorías posibles, y que restringen lo que, en consecuencia, será visto como discurso aceptable o decisiones racionales.
  • 5. No podemos comenzar a entender la deriva en fallas, sin comprender cómo grupos de personas, a través de cálculo y acción, ensamblan versiones del mundo en las que ellos calculan y actúan. El procesamiento de la información cabe dentro de una perspectiva metateórica mayor y dominante, que toma al individuo como su foco central (Heft, 2001). Esta visión, también, es una herencia de la Revolución Científica, la que ha popularizado incrementadamente la idea humanista de un ―individuo auto contenido‖. Para la mayoría de la psicología, esto ha significado que todos los procesos dignos de estudio toman lugar dentro de los márgenes del cuerpo (o mente), algo epitomizado por el enfoque mentalista del procesamiento de información. En su incapacidad para tratar significativamente la deriva hacia la falla, que interconecta factores individuales, institucionales, sociales y técnicos, los factores humanos y la seguridad operacional de sistemas están actualmente pagando por su exclusión teórica de los procesos sociales y transaccionales, entre los individuos y el mundo. El componencialismo y la fragmentación de la investigación de factores humanos aún es un obstáculo al progreso en este sentido. Un estiramiento de la unidad de análisis (como lo hecho en las ideas de ingeniería de sistemas cognitivos y cognición distribuida), y una llamada actuar centralmente en comprender los cálculos y el pensamiento, han sido formas de lidiar con los nuevos desarrollos prácticos para los que los factores humanos y la seguridad de sistemas, no estaban preparados. El énfasis individualista del protestantismo y la iluminación, también reboza de ideas sobre control y culpa. ¿Debemos culpar a las personas por sus errores? Los sistemas sociotécnicos han crecido en complejidad y tamaño, moviendo a algunos a decir que no tiene sentido esperar o demandar de los integrantes (ingenieros, administradores, operadores), que giren en torno a algún ideal moral reflectivo. Presiones de escasez y competencia, han logrado convertirse insidiosamente en mandatos organizacionales e individuales, los que a cambio, restringen severamente la racionalidad y opciones (y por ende autonomía), de todos los actores en el interior. Ya sólo los antihéroes continúan teniendo roles líderes en nuestras historias de fallas. El individualismo aún es crucial para la propia identidad en la modernidad. La idea que lleva a un trabajo de equipo, a una organización entera, o a toda una industria a quebrar un sistema (como se ilustró mediante casos de deriva en fallas) es muy poco convencional respecto de nuestras preconcepciones culturales heredadas. Incluso antes que llegáramos a episodios complejos de acción y responsabilidad, podemos reconocer la prominencia de la de-construcción y componentalismo Newtoniano-Cartesianos, en mucha investigación de factores humanos. Por ejemplo: Las nociones empíricas de una percepción de elementos que gradualmente se fueron convirtiendo en significado, a través de etapas de procesamiento mental, son nociones teóricas legítimas hoy. El empirismo fue otrora una fuerza en la historia de la psicología. Incluso atascado por el paradigma de procesamiento de información, sus principios centrales han retrocedido, por ejemplo, en teorías de conciencia situacional. En adoptar un modelo cultural como tal, desde una comunidad aplicada y sometiéndolo a escrutinio científico putativo, por supuesto que los factores humanos encuentran su ideal pragmático. 
Los modelos culturales abarcan los problemas de factores humanos como una disciplina aplicada. Pocas teorías pueden cubrir el abismo entre investigador y practicantes mejor que aquellas que aplican y disectan los vernáculos practicantes para estudio científico. Pero los modelos culturales vienen con una etiqueta de precio epistemiológica. La investigación que se adjudica indagar un fenómeno (digamos, conciencia situacional dividida, o complacencia), pero que no definen ese fenómeno (porque, como modelo cultural, se supone que todos saben lo que significa) no puede falsear el contacto con la realidad empírica. Ello deja a tal investigador de factores humanos sin el mecanismo mayor de control científico desde Kart Popper. Conectado al procesamiento de información, y al enfoque experimental a muchos problemas de factores humanos, está un prejuicio cuantitativo, primero competido en la psicología por Wilhelm Wundt, en su laboratorio en Leipzig. A pesar de que Wundt rápidamente tuvo que admitir que una cronometría de la mente era una meta muy audaz de la investigación, los proyectos experimentales de investigación sobre factores humanos aún pueden reflejar versiones pálidas de su ambición. Contar, medir, categorizar y analizar estadísticamente, son herramientas gobernantes del tratado, mientras que las investigaciones cualitativas son a menudo desechadas por subjetivas y no científicas. Los factores humanos tienen una orientación realista, creyendo que los hechos empíricos son aspectos estables y objetivos de la realidad que existe independiente del observador o su teoría. Nada de esto hace menos reales los hechos generados mediante experimentos, para aquellos que observan, publican, o leen acerca de ellos. Sin embargo, al apreciar a Thomas Kuhn (1962), esta realidad debe ser vista por lo que es: un acuerdo negociado implícitamente entre investigadores de pensamientos similares, más que un común denominador accesible a todos. No hay arbitrio final aquí. Es posible que un enfoque experimental, componencial, pueda disfrutar de un privilegio epistemiológico. Pero ello también significa que no hay imperativo automático para únicamente sostenerse por la investigación legítima, como se ve a veces en la corriente principal de factores humanos. Las formas de obtener acceso a la realidad empírica son infinitamente negociables, y su aceptación es una función de qué tan bien ellos dan conformidad a la visión mundial de aquellos a quienes el investigador apela.
  • 6. La persistente supremacía cuantitativista (particularmente en los factores humanos norteamericanos), se ve apesadumbrada con este tipo de autoridad consensuada (debe ser bueno porque todos lo están haciendo). Tal histéresis metodológica podría tener que ver más con los miedos primarios de ser marcado ―no científico‖ (los miedos compartidos por Wundt y Watson) que con un retorno estable de incrementos significativos de conocimiento generados por la investigación. El cambio tecnológico dio impulso a los pensamientos de factores humanos y seguridad de sistemas. Las demandas prácticas puestas por los cambios tecnológicos envolvieron a los factores humanos y la seguridad de sistemas con el espíritu pragmático que hasta hoy tienen. Pero lo pragmático no es más pragmático si no encaja con las demandas creadas por aquello que está sucediendo ahora en nuestro alrededor. El paso del cambio sociotecnológico no tiende a desacelerar pronto. Si creemos que la II Guerra Mundial generó una gran cantidad de cambios interesantes, dando a luz a los factores humanos como una disciplina, entonces podríamos estar viviendo en tiempos incluso más excitantes hoy. Si nosotros nos mantenemos haciendo lo que hemos estado realizando en factores humanos y seguridad de sistemas, simplemente porque nos ha funcionado en el pasado, podríamos llegar a ser uno de esos sistemas que derivan hacia la falla. Lo pragmático requiere que nosotros nos adaptemos también, para arreglárnoslas mejor con la complejidad del mundo que nos enfrenta hoy. Nuestros éxitos pasados no son garantía de continuar logros futuros. Prólogo de la serie. Barry H. Kantowitz Battelle Human Factors Transportation Center El rubro del transporte es importante, por razones tanto prácticas como teóricas. Todos nosotros somos usuarios de sistemas de transporte como operadores, pasajeros y consumidores. Desde un punto de vista científico, el rubro del transporte ofrece una oportunidad de crear y probar modelos sofisticados de comportamiento y cognición humanos. Esta serie cubre los aspectos práctico y teórico de los factores humanos en el transporte, con un énfasis en su interacción. La serie es interpretada como un foro para investigadores e ingenieros interesados en cómo funcionan las personas dentro de sistemas de transporte. Todos los modos de transporte son relevantes, y todos los esfuerzos en factores humanos y ergonomía que tienen implicancias explícitas para los sistemas de transporte caen en una visión pobre en serie. Esfuerzos analíticos son importantes para relacionar teoría y datos. El nivel de análisis puede ser tan pequeño como una persona, o de espectro internacional. Los datos empíricos pueden provenir de un amplio rango de metodologías, incluyendo investigación de laboratorio, estudios simulados, seguimiento de pruebas, pruebas operacionales, trabajo en el campo, revisiones de diseños, o peritajes. Este amplio espectro es interpretado para maximizar la utilidad de la serie para lectores con trasfondos distintos. Espero que la serie sea útil para profesionales en las disciplinas de factores humanos, ergonomía, ingeniería de transportes, psicología experimental, ciencia cognitiva, sociología e ingeniería de seguridad operacional. Está orientada a la apreciación de especialistas de transporte en la industria, gobierno, o académicos, así como también, al investigador en busca de una base de pruebas para nuevas ideas acerca de la interfase entre las personas y sistemas complejos. 
Este libro, mientras se enfoca en el error humano, ofrece una visión de sistema particularmente bienvenida en los factores humanos del transporte. Una meta mayor de esta serie de libros es relacionar la teoría y la práctica de factores humanos. El autor es encomendado para formular preguntas que no sólo relacionan teoría y práctica, sino que fuerzan al lector a evaluar las clases de teoría como las aplicadas a factores humanos. Los enfoques de información tradicionales, derivados del modelo de canal limitado que ha formado las bases originales para el trabajo teórico en factores humanos, son escrutados. Enfoques más nuevos, tales como la conciencia situacional, que procedía de deficiencias en el modelo de teoría de la información, son criticados por tratarse solo de modelos culturales carentes de rigor científico. Espero que este libro engendre un vigoroso debate sobre qué clases de teoría sirven mejor a la ciencia de factores humanos. Si bien, las diez preguntas ofrecidas aquí forman una base para debate, existen más de diez respuestas posibles. Los libros posteriores en esta serie, continuarán buscando estas respuestas mediante la entrega de perspectivas prácticas y teóricas en los factores humanos en el transporte.
  • 7. Nota del Autor. Sidney Dekker es profesor de Factores Humanos en la Universidad Lund, Suecia. El recibió un M.A. en psicología organizacional de la University of Nijmegen y un M.A. en psicología experimental de la Leiden University, ambas en Noruega. El ganó su Ph.D. en Ingeniería de Sistemas Cognitivos de la Ohio State University. Ha trabajado previamente para el Public Transport Cooperation in Melbourne, Australia; la Massey University School of Aviation, Nueva Zelanda; y la British Aerospace. Sus especialidades e intereses investigativos son el error humano, investigación de accidentes, estudios de campo, diseño representativo y, automatización. Ha tenido alguna experiencia como piloto, entrenado en material DC-9 y Airbus A340. Sus libros previos incluyen The Field Guide to Human Error Investigations (2002).
  • 8. Capítulo 1. ¿Fue Falla Mecánica o Error Humano? Estos son tiempos excitantes y competitivos para factores humanos y seguridad operacional de sistemas. Y existen indicios de que no estamos completamente bien equipados para ellos. Hay un reconocimiento creciente de que los accidentes (un accidente de avión comercial, un desastre de un transbordador espacial) están intrincadamente ligados al funcionamiento de organizaciones e instituciones aledañas. La operación de aviones de aerolíneas comerciales o transbordadores espaciales o traslados de pasajeros engendra vastas redes de organizaciones de apoyo, de mejoramiento y avance, de control y regulación. Las tecnologías complejas no pueden existir sin estas organizaciones e instituciones – transportadores, reguladores, agencias de gobierno, fabricantes, subcontratistas, instalaciones de mantenimiento, grupos de entrenamiento – que, en principio, están diseñadas para proteger y dar seguridad a su operación. Su mandato real se orienta a no tener accidentes. Desde el accidente nuclear de Three Mile Island, en 1979, sin embargo, las personas se percatan en mayor medida de que las mismas organizaciones destinadas a mantener una tecnología segura y estable (operadores humanos, reguladores, la administración, el mantenimiento) están en realidad entre los mayores contribuyentes al quiebre. Las fallas socio-tecnológicas son imposibles sin tales contribuciones. A pesar de este reconocimiento creciente, factores humanos y seguridad operacional de sistemas dependen de un vocabulario basado en una concepción particular de las ciencias naturales, derivada de sus raíces en la ingeniería y en la psicología experimental. Este vocabulario, con su uso sutil de metáforas, imágenes e ideas, está cada vez más reñido con las demandas interpretativas planteadas por los accidentes organizacionales modernos. El vocabulario expresa una visión mundial (tal vez) apropiada para las fallas técnicas, pero incapaz de abrazar y penetrar las áreas relevantes de las fallas socio-técnicas – esas fallas que incorporan los efectos interconectados de la tecnología y de la complejidad social organizada que circunda su uso. Es decir, la mayor parte de las fallas de hoy. Cualquier lenguaje, y la visión mundial que lo acompaña, imponen limitaciones a nuestro entendimiento de la falla. Sin embargo, estas limitaciones están volviéndose ahora cada vez más evidentes y apremiantes. Con el crecimiento en el tamaño y la complejidad de los sistemas, la naturaleza de los accidentes está cambiando (accidentes de sistemas, fallas sociotécnicas). La escasez y la competencia por recursos significan que los sistemas presionan cada vez más sus operaciones hacia los bordes de sus envolturas de seguridad. Tienen que hacerlo para permanecer exitosos en sus ambientes dinámicos. Los retornos comerciales de operar en los límites son mayores, pero las diferencias entre tener y no tener un accidente están superando caóticamente los márgenes disponibles. Los sistemas abiertos son empujados continuamente hacia los márgenes de sus envolturas de seguridad operacional, y los procesos que impulsan tal migración no son sencillos de reconocer o controlar, como tampoco lo es la ubicación exacta de los márgenes. Los sistemas grandes y complejos se ven capaces de adquirir una histéresis, una oscura voluntad propia, con la que derivan hacia mayor elasticidad o hacia los bordes de la falla. 
Al mismo tiempo, el veloz avance de los cambios tecnológicos crea nuevos tipos de peligros, especialmente aquellos que vienen con una mayor dependencia de la tecnología computacional. Tanto los sistemas sociales como los de ingeniería (y su interrelación) se apoyan en un volumen cada vez mayor de tecnología de información. A pesar de que nuestra velocidad computacional y el acceso a la información pudieran parecer, en principio, una ventaja de seguridad operacional, nuestra habilidad de tomar conciencia de la información no está manteniendo el paso con nuestra habilidad para recolectarla y generarla. Al conocer más, puede que en realidad conozcamos mucho menos. Administrar la seguridad operacional en base a números (incidentes, conteos de error, amenazas a la seguridad operacional), como si la seguridad operacional fuera sólo otro indicador de un modelo de negocios de Harvard, puede crear una falsa impresión de racionalidad y control administrativo. Puede ignorar variables de orden más alto que podrían develar la verdadera naturaleza y dirección de la deriva del sistema. Podría venir, además, al costo de comprensiones más profundas del funcionamiento socio-técnico real.
  • 9. DECONSTRUCCION, DUALISMO y ESTRUCTURALISMO. ¿Entonces qué es este idioma y la visión mundial técnica obsoleta que representa? Las características que lo definen son la deconstrucción, el dualismo y el estructuralismo. Deconstrucción significa que el funcionamiento de un sistema puede ser comprendido exhaustivamente al estudiar la distribución y la interacción de sus partes constituyentes. Científicos e ingenieros típicamente miran al mundo de esta forma. Las investigaciones de accidentes también deconstruyen. Para determinar la falla mecánica, o para localizar las partes dañadas, los investigadores de accidentes hablan de ―ingeniería reversa‖. Ellos recuperan partes de los restos y las reconstruyen en un todo nuevamente, a menudo literalmente. Pensemos en el Boeing 747 del vuelo TWA800, que explotó en el aire luego del despegue desde el aeropuerto Kennedy de Nueva York, en 1996. Fue recuperado desde el fondo del Océano Atlántico y dolorosamente rearmado en un hangar. Con el rompecabezas lo más completo posible, las partes dañadas debían eventualmente quedar expuestas, permitiendo a los investigadores identificar la fuente de la explosión. Pero el todo continúa desafiando al sentido, continúa siendo un rompecabezas, cuando el funcionamiento (o no funcionamiento) de sus partes no logra explicarlo. La parte que causó la explosión, que la inició, nunca fue identificada en verdad. Esto es lo que hace escalofriante la investigación del TWA800. A pesar de una de las reconstrucciones más caras de la historia, las partes reconstruidas se negaron a dar cuenta del comportamiento del todo. En un caso como tal, una comprensión atemorizante, incierta, da escalofríos a los organismos de investigación y a la industria. Un todo falló sin una parte fallada. Un accidente ocurrió sin una causa; no hay causa – nada que reparar – y podría suceder mañana nuevamente, u hoy. La segunda característica definitoria es el dualismo. Dualismo significa que existe una separación distintiva entre causa humana y material – entre el error humano y la falla mecánica –. Para ser un buen dualista usted, por supuesto, tiene que deconstruir: Usted tiene que desconectar las contribuciones humanas de las contribuciones mecánicas. Las reglas de la Organización de Aviación Civil Internacional, que gobiernan a los investigadores de accidentes aéreos, lo determinan expresamente. Ellas fuerzan a los investigadores de accidentes a separar las contribuciones humanas de las mecánicas. Parámetros específicos en los reportes de accidentes están reservados para el seguimiento de los componentes humanos potencialmente dañados. Los investigadores exploran el historial de las 24 a 72 horas previas de los humanos que más tarde se verían involucrados en un accidente. ¿Hubo alcohol? ¿Hubo estrés? ¿Hubo fatiga? ¿Hubo falta de pericia o experiencia? ¿Hubo problemas previos en los registros de entrenamiento u operacionales de estas personas? ¿Cuántas horas de vuelo tenía verdaderamente el piloto? ¿Hubo otras distracciones o problemas? Este requisito investigativo refleja una interpretación primitiva de los factores humanos, una tradición aeromédica en que el error humano queda reducido a la noción de ―estar en forma para el servicio‖. 
Esta noción ha sido sobrepasada hace tiempo por los desarrollos en factores humanos hacia el estudio de personas normales realizando trabajos normales en lugares de trabajo normales (más que en individuos deficientes mental o fisiológicamente), pero el modelo aeromédico sobre extendido es retenido como una clase de práctica conformista positivista, dualista y deconstructiva. En el paradigma de estar en forma para el servicio, las fuentes de error humano debieron ser buscadas en las horas, días o años previos al accidente, cuando el componente humano estaba torcido, debilitado y listo para el quiebre. Encuentre la parte del humano que estaba perdida o deficiente, la ―parte desajustada‖, y la parte humana acarreará la carga interpretativa del accidente. Indague en la historia reciente, encuentre las piezas deficientes y arme el rompecabezas: deconstrucción, reconstrucción y dualismo. La tercera característica definitoria de la visión mundial técnica que aún gobierna nuestro entendimiento de éxito y fallas en sistemas complejos es el estructuralismo. El idioma que utilizamos para describir los trabajadores internos de sistemas de éxito y fallas es un idioma de estructuras. Hablamos de capas de defensa, de agujeros en estas capas. Identificamos los ―bordes suaves‖ y los ―bordes agudos‖ de organizaciones e intentamos capturar como una tiene efectos sobre la otra. Incluso la cultura de seguridad es tratada como una estructura edificada por otros bloques. Qué tanta cultura de seguridad tenga una organización depende de las partes y componentes que tenga para el reporte de incidentes (esto es mesurable), de hasta qué punto es justa con los operadores que cometen errores (esto es más difícil de medir, pero todavía posible), y de qué relación tiene entre sus funciones de seguridad y otras estructuras institucionales. Una realidad social profundamente compleja está por ende, reducida a un limitado número de componentes mesurables. Por ejemplo ¿tiene el departamento de seguridad una ruta directa a la administración más alta? ¿Cómo es esta tasa de reportes comparada a otras compañías? Nuestro idioma de fallas también es un idioma de mecánica. Describimos trayectorias de accidentes, buscamos causas y efectos, e interacciones. Buscamos fallas iniciadoras, o eventos gatilladores, y seguimos el colapso del sistema estilo dominó, que le sigue.
  • 10. Esta visión mundial ve a los sistemas socio-técnicos como máquinas con partes en una distribución particular (bordes agudos vs. suaves, capas de defensa), con interacciones particulares (trayectorias, efectos dominó, gatillos, iniciadores), y una mezcla de variables independientes o intervinientes (cultura de la culpa vs. cultura de seguridad). Esta es la visión mundial heredada de Descartes y Newton, la visión mundial que ha impulsado exitosamente el desarrollo tecnológico desde la revolución científica, hace medio milenio. Esta visión mundial, y el idioma que produce, está basada en nociones particulares de las ciencias naturales, y ejerce una sutil pero muy poderosa influencia en nuestra comprensión del éxito y la falla sociotecnológicos hoy. Así como ocurre con mucha de la ciencia y el pensamiento occidentales, perdura y dirige la orientación de factores humanos y seguridad de sistemas. Incluso el idioma, si se utiliza irreflexivamente, se vuelve fácilmente aprisionante. El idioma expresa, pero también determina qué podemos ver y cómo lo vemos. El idioma constriñe cómo construimos la realidad. Si nuestras metáforas nos animan a modelar los accidentes como cadenas de eventos, entonces comenzaremos nuestra investigación buscando eventos que encajen en esa cadena. ¿Pero qué eventos deben ir adentro? ¿Dónde debemos comenzar? Como Nancy Leveson (2002) señaló, la elección de cuáles eventos poner dentro es arbitraria, así como la extensión, el punto de partida y el nivel de detalle de la cadena de eventos. ¿Qué, preguntó ella, justifica asumir que los eventos iniciadores son mutuamente excluyentes, excepto que ello simplifica las matemáticas del modelo de la falla? Estos aspectos de la tecnología y de su operación plantean preguntas sobre lo apropiado del modelo dualista, deconstructivo y estructuralista que domina factores humanos y seguridad de sistemas. En su lugar, podríamos buscar una visión de sistemas real, que no sólo apunte a las deficiencias estructurales detrás de los errores humanos individuales (puede hacerlo si se necesita), sino que aprecie la adaptabilidad orgánica, ecológica, de los sistemas sociotécnicos complejos. Buscando fallas para explicar fallas. Nuestras creencias y credos más arraigados a menudo permanecen encerrados en la pregunta más simple. La pregunta acerca de si fue error humano o falla mecánica es una de ellas. ¿Fue el accidente causado por falla mecánica o por error humano? Es una pregunta existencial en las repercusiones posteriores de un accidente. Más aún, parece una pregunta muy simple e inocente. Para muchos es algo normal que preguntar: si se ha tenido un accidente, tiene sentido averiguar qué falló. La pregunta, sin embargo, envuelve una comprensión particular de cómo ocurren los accidentes, y arriesga confinar nuestro análisis causal a esa comprensión. Nos instala en un repertorio interpretativo fijo. Escapar de este repertorio puede ser difícil. Fija las preguntas que hacemos, entrega las pistas que perseguimos y las claves que examinamos, y determina las conclusiones que eventualmente sacaremos. ¿Qué componentes estaban dañados? ¿Fue algo mecánico o algo humano? ¿Por cuánto tiempo había estado el componente torcido o, de alguna otra forma, deficiente? ¿Por qué se quebró eventualmente? ¿Cuáles fueron los factores latentes que conspiraron en su contra? ¿Qué defensas se habían erosionado? Estos son los tipos de preguntas que dominan las investigaciones en factores humanos y seguridad de sistemas hoy en día. 
Organizamos reportes de accidentes y nuestro discurso sobre accidentes alrededor de la lucha por respuestas a ellos. Las investigaciones dan vuelta los componentes mecánicos dañados (un perno dañado en el trim vertical de un MD-80 de Alaska Airlines, azulejos refractantes de calor perforados en el transbordador espacial Columbia), componentes de baja performance humana (por ejemplo, quiebres en C.R.M., un piloto que tiene un accidentado historial de entrenamiento), y grietas en las organizaciones responsables por el rodaje del sistema (por ejemplo, cadenas de decisión organizacional débiles). El buscar fallas – humanas, mecánicas, u organizacionales – para explicar fallas es tan de sentido común que la mayoría de los investigadores nunca se detiene a pensar si estas son en realidad las pistas correctas que perseguir. Que la falla está causada por falla es pre racional – no la consideramos conscientemente más como una pregunta en las decisiones que hacemos acerca de dónde mirar y qué concluir. Aquí hay un ejemplo. Un bimotor Douglas DC-9-82 aterrizó en un aeropuerto regional en las Tierras Altas del Sur de Suecia en el verano de 1999. Chubascos de lluvia habían pasado a través del área más temprano, y la pista estaba aún húmeda. Durante la aproximación a la pista, la aeronave recibió un ligero viento de cola, y después del toque a tierra, la tripulación tuvo problemas disminuyendo la velocidad. A pesar de los esfuerzos de la tripulación por frenar, el jet recorrió la pista y terminó en un campo a unos pocos cientos de pies del umbral. Los 119 pasajeros y la tripulación abordo resultaron ilesos.
  • 11. Luego de que la aeronave se detuviera, uno de los pilotos salió de la aeronave para chequear los frenos. Estaban fríos. No había ocurrido ninguna acción de frenado. ¿Cómo pudo haber ocurrido esto? Los investigadores no encontraron fallas mecánicas en la aeronave. Los sistemas de freno estaban bien. En vez de ello, a medida que la secuencia de eventos fue rebobinada en el tiempo, los investigadores se percataron de que la tripulación no había armado los ground spoilers de la aeronave antes del aterrizaje. Los ground spoilers ayudan a un jet a frenar durante la carrera, pero requieren ser armados antes de que puedan hacer su trabajo. Armarlos es trabajo de los pilotos, y es un ítem de la lista de chequeo before-landing y parte de los procedimientos en que ambos miembros de la tripulación están involucrados. En este caso, los pilotos olvidaron armar los spoilers. ―Error de piloto‖, concluyó la investigación. O, en realidad, lo llamaron ―Desmoronamientos en CRM (Crew Resource Management)‖ (Statens Haverikommision, 2000, p.12), una forma más moderna, más eufemística de decir ―error de piloto‖. Los pilotos no coordinaron lo que debían hacer; por alguna razón fallaron en comunicar la configuración requerida de su aeronave. Además, después del aterrizaje, uno de los miembros de la tripulación no había dicho ―¡Spoilers!‖, como lo dicta el procedimiento. Esto pudo o debió haber alertado a la tripulación sobre la situación, pero ello no ocurrió. Los errores humanos habían sido encontrados. La investigación estaba concluida. ―Error humano‖ es nuestra elección por defecto cuando no encontramos fallas mecánicas. Es una elección forzada, inevitable, que calza suficientemente bien en una ecuación, donde el error humano es el inverso del monto de falla mecánica. La Ecuación 1 muestra cómo determinamos la proporción de responsabilidad causal (ver el bosquejo al final de este pasaje): Error humano = f (1 – falla mecánica) (1) Si no existe falla mecánica, entonces sabemos qué comenzar a buscar en su lugar. En este caso, no hubo falla mecánica. La Ecuación 1 se resuelve como una función de 1 menos 0. La contribución humana fue 1. Fue error humano, un quiebre de CRM. Los investigadores encontraron que los dos pilotos a bordo del MD-80 eran ambos capitanes, y no un capitán y un copiloto, como es usual. Fue una simple coincidencia de planificación, no completamente inusual, la que los llevó a volar juntos a bordo de esa aeronave esa mañana. Con dos capitanes en un barco, las responsabilidades corren el riesgo de quedar divididas de manera inestable e incoherente. La división de responsabilidades fácilmente conduce a su abdicación. Si es función del copiloto verificar que los spoilers estén armados, y no hay copiloto, el riesgo es obvio. La tripulación estaba en algún sentido ―desajustada‖, o a lo menos, propensa al desmoronamiento. Así fue (hubo un ―desmoronamiento de CRM‖). ¿Pero qué explica esto? Estos son procesos que requieren ellos mismos una explicación, y que pueden resultar, de todas formas, pistas frías. Tal vez hay una realidad mucho más profunda acechando tras las acciones particulares de un accidente como éste, una realidad en donde las causas humanas y mecánicas están interconectadas de forma mucho más profunda de lo que nuestros enfoques formulaicos de investigación nos permiten comprender. Para vislumbrar mejor esta realidad, primero tenemos que volvernos hacia el dualismo. Es el dualismo el que descansa en el corazón de la elección entre error humano y falla mecánica. 
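Para ilustrar la lógica de atribución que el texto critica, el siguiente bosquejo mínimo en Python (hipotético; los nombres de función y variables no provienen del original) expresa la Ecuación 1 en código: si la contribución mecánica encontrada es cero, toda la carga causal recae, por defecto, sobre el ―error humano‖.

# Bosquejo mínimo e hipotético de la lógica dualista de la Ecuación 1:
# error_humano = f(1 - falla_mecanica). Si no se encuentra falla mecánica (0),
# la atribución por defecto asigna toda la responsabilidad causal al humano (1).
def atribucion_dualista(falla_mecanica: float) -> float:
    # Devuelve la proporción de "error humano" según la ecuación criticada en el texto.
    if not 0.0 <= falla_mecanica <= 1.0:
        raise ValueError("la contribución mecánica debe expresarse entre 0 y 1")
    return 1.0 - falla_mecanica

# En el incidente del MD-80 la investigación no halló falla mecánica (0),
# de modo que la ecuación "rebalancea" toda la responsabilidad hacia el error humano (1).
print(atribucion_dualista(0.0))  # -> 1.0

El bosquejo sólo hace explícita la elección forzada que la ecuación encierra; no describe ningún método real de investigación.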
Echamos un breve vistazo a su pasado y lo confrontamos con el encuentro empírico inestable, incierto, de un caso de spoilers desarmados. La miseria del dualismo. La urgencia de separar la causa humana de la causa mecánica es algo que debe haber encrucijado incluso a los pioneros en factores humanos. Pensar en el enredo con las cabinas de la II Guerra Mundial, que tenían switches de control idénticos para una diversidad de funciones. ¿Pudieron evitar, una aleta tipo flap en el control del flap y un mecanismo con forma de rueda en el elevador del tren, la confusión típica entre ambos? En ambos casos, el sentido común y la experiencia dice ―sí‖. Al cambiar algo en el mundo, los ingenieros en factores humanos (suponiendo que ellos ya existían) cambiaron algo en el humano. Al jugar con el hardware con que las personas trabajaban, ellos cambiaron el potencial de las acciones correctas e incorrectas, pero sólo el potencial. Porque incluso con palancas de control con formas funcionales, algunos pilotos, en algunos casos, todavía las mezclaban. Al mismo tiempo, los pilotos no siempre mezclaban switches idénticos. Similarmente, no todas las tripulaciones que constan de dos capitanes fallan al armar los spoilers antes del aterrizaje. El error humano, en otras palabras, está suspendido, inestable, en algún lado entre las interfases humanas y mecánicas. El error no es completamente humano, ni completamente mecánico. Al mismo tiempo, ―fallas‖ mecánicas (proveer switches idénticos ubicados próximos uno del otro) tienen que expresarse ellos mismos en la acción humana. Así que, si ocurre una confusión entre flaps y tren, entonces ¿cuál es la causa? ¿Error humano o falla mecánica? Usted necesita ambos para tener éxito; necesita que ambos fallen. Donde termina uno y comienza el otro ya no está claro.
  • 12. Una idea del trabajo temprano en factores humanos era que el componente mecánico y la acción humana estaban interconectados en formas que se resisten al prolijo desenredo dualista y deconstructivo preferido aún hoy por los investigadores (y sus consumidores). DUALISMO Y REVOLUCIÓN CIENTÍFICA. La elección entre causa humana y causa material no es un simple producto de la investigación de accidentes o la ingeniería en factores humanos recientes. La elección se encuentra firmemente arraigada en la visión mundial Newtoniana-Cartesiana que gobierna mucho de nuestro pensamiento hoy en día, particularmente en profesiones dominadas por la tecnología como la ingeniería de factores humanos y la investigación de accidentes. Isaac Newton y Rene Descartes fueron dos de las figuras cúspide de la Revolución Científica, entre 1500 y 1700 D.C., que produjo un cambio dramático en la visión mundial, así como también cambios profundos en el conocimiento y en las ideas de cómo adquirirlo y probarlo. Descartes propuso una aguda distinción entre lo que llamó res cogitans, el dominio de la mente, y res extensa, el dominio de la materia. Aunque Descartes admitió alguna interacción entre los dos, insistió en que los fenómenos mentales y físicos no pueden ser entendidos haciendo referencia el uno al otro. Los problemas que ocurren en cualquiera de los dos dominios requieren enfoques completamente separados y conceptos diferentes para resolverlos. La noción de mundos mental y material separados llegó a ser conocida como dualismo, y sus implicancias pueden ser reconocidas en mucho de lo que pensamos y hacemos hoy en día. De acuerdo a Descartes, la mente está fuera del orden físico de la materia y en ninguna forma deriva de él. La elección entre error humano y falla mecánica es justamente una elección dualista: De acuerdo a la lógica Cartesiana, el error humano no puede derivar de cosas materiales. Como veremos, esta lógica no se sustenta bien – de hecho, en una inspección más cercana, todo el campo de factores humanos está construido sobre el hecho de que esa afirmación no se sostiene. Separar el cuerpo del alma, y subordinar el cuerpo al alma, no sólo mantuvo a Descartes fuera de problemas con la Iglesia. Su dualismo, su división entre mente y materia, agregó un importante problema filosófico que tuvo el potencial de obstaculizar el progreso científico, tecnológico y social: ¿Cuál es el nexo entre mente y materia, entre el alma y el mundo material? ¿Cómo podríamos, como humanos, tomar el control y rehacer nuestro mundo físico, si éste estuviera aliado indivisiblemente con – o incluso fuera sinónimo de – un alma irreductible, eterna? Una de las mayores aspiraciones durante la Revolución Científica de los siglos XVI y XVII fue ver y comprender (y llegar a tener la capacidad de manipular) el mundo material como una máquina controlable, predecible, programable. Esto requirió que fuera visto como nada más que una máquina: Sin vida, sin espíritu, sin alma, sin eternidad, sin inmaterialismo, sin impredictibilidad. La res extensa de Descartes, o mundo material, respondió justamente a esa inquietud. La res extensa fue descrita como algo que trabaja como una máquina, sigue reglas mecánicas y permite explicaciones en términos del arreglo y el movimiento de sus partes constituyentes. El progreso científico llegó a ser más fácil a causa de lo que excluyó. Lo que requirió la Revolución Científica fue provisto por la desunión de Descartes. 
La naturaleza se volvió una máquina perfecta, gobernada por leyes matemáticas que fueron aumentando dentro de la comprensión del entendimiento y control humanos, y lejos de las cosas que los seres humanos no pueden controlar. Newton, por supuesto, es el padre de muchas de las leyes que aún gobiernan nuestro entendimiento y universo hoy en día. Su tercera ley de movimiento, por ejemplo, descansa en las bases de nuestras presunciones sobre causa y efecto, y causas de accidentes: Para cada acción existe una reacción igual y opuesta. En otras palabras, para cada causa existe un efecto equivalente, o más bien, para cada efecto, tiene que haber una causa equivalente. Una ley como tal, si bien sea aplicable a la liberación y transferencia de energía en sistemas mecánicos, está erróneamente enfocada al ser aplicada a fallas sociotécnicas, cuando las pequeñas banalidades y sutilezas del trabajo normal hecho por gente normal en organizaciones normales puede degenerar lentamente en desastres enormes, en liberaciones de energía desproporcionadamente altas. La equivalencia de causa-consecuencia dictada por la tercera ley del movimiento de Newton, es bastante inapropiada como modelo de accidentes organizacionales. Adquirir control sobre un mundo material fue de crítica importancia para las personas hace quinientos años. La tierra de inspiración y fertilidad para las ideas de Descartes y Newton puede ser entenderse en el contraste de su tiempo. Europa estaba emergiendo de la Edad Media –tiempos de temor y fe, donde los lapsos de vida eran segados tempranamente por guerras, enfermedad y epidemias. No deberíamos subestimar la ansiedad y aprensión sobre la habilidad humana de enfocar sus esfuerzos contra estas míticas profecías. Luego de la Plaga, a los habitantes de la Inglaterra nativa de Newton, por ejemplo, les tomó hasta 1650 recuperar el nivel de 1300. La gente estaba a merced de fuerzas no apenas controlables y comprendidas como enfermedades.
  • 13. En el milenio precedente, la piedad, la oración y la penitencia estaban entre los mecanismos directivos mediante los cuales la gente podía alcanzar alguna clase de dominio sobre el mal y el desastre. El crecimiento de la perspicacia producido por la Revolución Científica, lentamente comenzó a entregar una alternativa, con éxito mensurable empíricamente. La Revolución Científica entregó nuevos medios para controlar el mundo natural. Los telescopios y microscopios le dieron a la gente nuevas formas de estudiar componentes que hasta entonces habían sido muy pequeños o habían estado muy distantes para ser vistos por el ojo desnudo, abriendo de pronto una visión del universo completamente nueva y por primera vez, revelando causas de los fenómenos hasta entonces malamente comprendidos. La naturaleza no fue un monolito atemorizante, inexpugnable, y las personas dejaron de estar sólo en el final de sus victimarios caprichos. Al estudiarla de nuevas formas, con nuevos instrumentos, la naturaleza podría ser descompuesta, partida en trozos más pequeños, medida y, a través de todo eso, comprendida mejor y eventualmente controlada. Los avances en las matemáticas (geometría, álgebra, cálculo), generaron modelos que pudieron contar para y predecir fenómenos recientemente descubiertos en, por ejemplo, medicina y astronomía. Al descubrir algunos de los cimientos del universo y la vida, y al desarrollar matemáticas que imitan su funcionamiento, la Revolución Científica reintrodujo un sentido de predictibilidad y control que hacía tiempo yacía durmiendo durante la Edad Media. Los seres humanos pudieron alcanzar el dominio y la preeminencia sobre las vicisitudes e imprevisibilidades de la naturaleza. La ruta hacia tal progreso debería venir de medir, derribar (conocido variadamente hoy como reducir, descomponer o deconstruir) y modelar matemáticamente el mundo a nuestro alrededor – para seguidamente reconstruirlo en nuestros términos. La mesurabilidad y el control son temas que animaron a la Revolución Científica, y resuenan fuertemente hoy en día. Incluso las nociones de dualismo (los mundos material y mental se encuentran separados) y la deconstrucción (los ―todos‖ pueden ser explicados por el arreglo y la interacción de sus partes constituyentes a bajo nivel) han sobrevivido largamente a sus iniciadores. La influencia de Descartes se juzga tan grande en parte debido a que él la escribió en su lengua nativa, más que en latín, presumiéndose por lo tanto que amplió el acceso y la exposición popular a sus pensamientos. La mecanización de la naturaleza desparramada por su dualismo, y los enormes avances matemáticos de Newton y otros, lideraron siglos de progreso científico sin precedentes, crecimiento económico y éxito de ingeniería. Como señalara Fritjof Capra (1982), la NASA no habría tenido la posibilidad de poner un hombre en la Luna sin Rene Descartes. La herencia, sin embargo, es definitivamente una bendición mezclada. Los Factores Humanos y la Seguridad de Sistemas están estancados con un lenguaje, con metáforas e imágenes que enfatizan la estructura, componentes, mecánicas, partes e interacciones, causa y efecto. Mientras nos dan la dirección inicial para construir sistemas seguros y para figurarnos lo que estuvo mal, cuando cambia no lo hacemos nosotros, hay límites para la utilidad de este vocabulario heredado. Regresemos a ese día de verano de 1999 y a la carrera en pista del MD-80. 
En buena tradición Newtoniana-Cartesiana, podemos comenzar abriendo el avión un poco más, separando los diversos componentes y procedimientos para ver cómo interactúan, segundo a segundo. Inicialmente nos encontraremos con un éxito empíricamente resonante – como de hecho les ocurrió frecuentemente a Descartes y Newton. Pero cuando queremos recrear el todo en base a las partes que encontramos, una realidad más problemática salta a la vista: Ya no todo va bien. La exacta, matemáticamente placentera separación entre causa humana y mecánica, entre episodios sociales y estructurales, se derrumba. El todo ya no se ve más como una función lineal de la suma de sus partes. Como explicara Scott Snook (2000), los dos pasos clásicos occidentales de reducción analítica (el todo en partes) y síntesis inductiva (las partes de vuelta en el todo nuevamente) parecen funcionar, pero el simple hecho de juntar las partes que encontramos no captura la rica complejidad oculta dentro y alrededor del incidente. Lo que se necesita es una integración orgánica, holística. Tal vez sea necesaria una nueva forma de análisis y síntesis, sensible a la situación total de la actividad sociotécnica organizada. Pero primero examinemos la historia analítica, componencial. SPOILERS, PROCEDIMIENTOS Y SISTEMAS HIDRÁULICOS. Los spoilers son esas superficies que se levantan contra el flujo de aire en la parte superior de las alas, luego de que la aeronave ha tocado tierra. No sólo contribuyen a frenar la aeronave al obstruir la corriente de aire, sino que además causan que el ala pierda la capacidad de crear sustentación, forzando el peso de la aeronave sobre las ruedas. La extensión de los ground spoilers acciona además el sistema de frenado automático en las ruedas. Mientras más peso llevan las ruedas, más efectivo se vuelve su frenado. Antes de aterrizar, los pilotos seleccionan el ajuste que desean en el sistema de frenado automático de ruedas (mínimo, medio o máximo), dependiendo del largo y las condiciones de la pista.
  • 14. Luego del aterrizaje, el sistema automático de frenado de ruedas disminuirá la velocidad de la aeronave sin que el piloto tenga que hacer algo, y sin dejar que las ruedas deslicen o pierdan tracción. Como tercer mecanismo para disminuir la velocidad, la mayoría de los aviones jet tiene reversores de impulso, que direccional el flujo saliente de los motores jet en contra de la corriente de aire, en vez de hacerlo salir hacia atrás. En este caso, no salieron los spoilers, y como consecuencia, no se accionó el sistema de frenado automático de ruedas. Al correr por la pista, los pilotos verificaron el ajuste del sistema de frenado automático en múltiples oportunidades, para asegurarse que se encontraba armado e incluso cambiando su ajuste a máximo, al ver acercarse el final de la pista. Pero nunca engancharía. El único mecanismo remanente para disminuir la velocidad de la aeronave era el empuje reverso. Los reversores, sin embargo, son más efectivos a altas velocidades. Para el momento en que los pilotos se percataron que no iban a lograrlo antes del final de la pista, la velocidad era ya bastante baja (ellos terminaron saliendo al campo a 10-20 nudos) y los reversores no tenían entonces un efecto inmediato. A medida que el jet salía por el borde de la pista, el capitán cerraba los reversores y desplazaba la aeronave algo a la derecha para evitar obstáculos. ¿Cómo se arman los spoilers? En el pedestal central, entre los dos pilotos, hay una cantidad de palancas. Algunas son para los motores y reversores de impulso, una es para los flaps, y una para los spoilers. Para armar los ground spoilers, uno de los pilotos debe elevar la palanca. La palanca sube aproximadamente una pulgada y permanece allí, armada hasta el toque a tierra. Cuando el sistema censa que la aeronave está en tierra (lo que hace en parte mediante switches en el tren de aterrizaje), la palanca regresa automáticamente y los spoilers salen. Asaf Degani, quien estudió tales problemas procedimentales en forma extensa, ha llamado el episodio del spoiler no como uno de error humano, sino uno de tiempo (timing) (ejemplo, Degani, Heymann & Shafto, 1999). En esta aeronave, como en muchas otras, los spoilers no deberían ser armados antes que se haya seleccionado el tren de aterrizaje abajo y se encuentre completamente en posición. Esto tiene que ver con los switches que pueden indicar cuando la aeronave se encuentra en tierra. Estos son switches que se comprimen a medida que el peso de la aeronave se asienta en las ruedas, pero no sólo en esas circunstancias. Existe un riesgo en este tipo de aeronave, que el switch en el tren de nariz se comprima incluso mientras el tren de aterrizaje está saliendo de su asentamiento. Ello puede ocurrir debido a que el tren de nariz se despliega en la corriente de aire de impacto. A medida que el tren de aterrizaje está saliendo y la aeronave se desliza en el aire a 180 nudos, la pura fuerza del viento puede comprimir el tren de nariz, activar el switch y seguidamente arriesgar extendiendo los ground spoilers (si se encontrasen armados). No es una buena idea: La aeronave podría tener problemas volando con los ground spoilers fuera. Por lo tanto, el requerimiento: El tren de aterrizaje necesita estar durante todo su recorrido hacia fuera, apuntando abajo. Sólo cuando no exista más riesgo de compresión del switch aerodinámica, los spoilers pueden ser armados. Este es el orden de los procedimientos before-landing: Gear down and locked. Spoilers armed. Flaps FULL. 
En una aproximación típica, los pilotos seleccionan abajo la manivela del tren de aterrizaje cuando el llamado glide slope se encuentra vivo: cuando la aeronave ha entrado en el rango de la señal electrónica que la guiará hacia abajo a la pista. Una vez que el tren de aterrizaje se encuentra abajo, los spoilers deben ser armados. Entonces, una vez que la aeronave captura ese glide slope (por ejemplo, está exactamente en la marcación electrónica) y comienza a descender en la aproximación a la pista, los flaps necesitan ser ajustados a FULL (típicamente 40º). Los flaps son otros aparatos que se extienden desde el ala, cambiando su forma y tamaño. Ellos permiten a la aeronave volar más lento para un aterrizaje. Esto condiciona los procedimientos al contexto. Ahora se ve así: Gear down and locked (cuando el glide slope esté vivo). Spoilers armed (cuando el tren esté abajo y asegurado). Flaps FULL (cuando el glide slope esté capturado). ¿Pero cuánto toma pasar desde ―glide slope vivo‖ a ―glide slope capturado‖? En una típica aproximación (dada la velocidad) esto toma alrededor de 15 segundos. En un simulador, donde toma lugar el entrenamiento, esto no crea problema. El ciclo completo (desde la palanca del tren abajo hasta la indicación ―gear down and locked‖ en la cabina), toma alrededor de 10 segundos.
  • 15. Eso deja 5 segundos para armar los spoilers, antes de que la tripulación necesite seleccionar flaps FULL (el ítem siguiente de los procedimientos). En el simulador, entonces, las cosas se ven así:
En t = 0 Gear down and locked (cuando el glide slope esté vivo).
En t + 10 Spoilers armed (cuando el tren esté abajo y asegurado).
En t + 15 Flaps FULL (cuando el glide slope esté capturado).
Pero en una aeronave real, el sistema hidráulico (que, entre otras cosas, extiende el tren de aterrizaje) no es tan efectivo como en un simulador. El simulador, desde luego, sólo simula los sistemas hidráulicos de la aeronave, modelados según cómo se comporta la aeronave cuando tiene cero horas de vuelo, cuando está reluciente, recién salida de fábrica. En una aeronave más vieja, el ciclo del tren hasta quedar asegurado puede tomar hasta medio minuto. Ello hace que los procedimientos se vean más bien así:
En t = 0 Gear down and locked (cuando el glide slope esté vivo).
En t + 30 Spoilers armed (cuando el tren esté abajo y asegurado).
¡Pero! en t + 15 Flaps FULL (cuando el glide slope esté capturado).
En efecto, entonces, el ítem ―flaps‖ de los procedimientos se hace exigible antes que el ítem ―spoilers‖. Una vez que el ítem ―flaps‖ está completo y la aeronave desciende hacia la pista, es fácil continuar con los procedimientos desde allí, con los ítems siguientes. Los spoilers nunca se arman. Su armado ha caído por las grietas de una distorsión de tiempos. Una declaración exclusiva de error humano (o quiebre de CRM) se vuelve más difícil de sostener frente a este trasfondo. ¿Cuánto error humano hubo, en verdad? Permanezcamos dualistas por ahora y volvamos a visitar la Ecuación 1. Apliquemos ahora una definición más liberal de falla mecánica. El tren de nariz de la aeronave real, equipado con un switch de compresión, está diseñado de forma tal que se despliega contra el viento durante el vuelo. Esto introduce una vulnerabilidad mecánica sistemática que solamente es tolerada mediante pausas procedimentales (un mecanismo notoriamente agujereado contra la falla): primero el tren, luego los spoilers. En otras palabras, ―gear down and locked‖ es un prerrequisito mecánico para el armado de los spoilers, pero el ciclo completo del tren puede tomar más tiempo del contemplado en los procedimientos y en las pausas de eventos que dirigen su aplicación. El sistema hidráulico de los jets viejos no presuriza tan bien: Puede tomar hasta 30 segundos que un tren de aterrizaje complete su ciclo hacia fuera. El simulador de vuelo, en contraste, realiza el mismo trabajo en unos 10 segundos, dejando una sutil pero sustantiva incongruencia. Una secuencia de trabajo es introducida y practicada durante el entrenamiento, mientras que una sutilmente diferente es necesaria para las operaciones reales. Más aún, esta aeronave tiene un sistema que advierte si los spoilers no están armados en el despegue, pero no tiene un sistema que advierta que los spoilers no están armados en la aproximación. Y luego está el arreglo mecánico en el cockpit. La palanca de spoiler armado luce diferente de la de spoiler no armado sólo por una pulgada de recorrido y un pequeño cuadrado rojo en su base. Desde la posición del piloto en el asiento derecho (quien necesita confirmar su armado), este parche rojo queda oculto detrás de las palancas de potencia mientras éstas se encuentran en la posición típica de aproximación. 
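Para hacer visible la incongruencia descrita arriba, el siguiente bosquejo mínimo en Python (hipotético; los nombres y la simplificación de la lista son supuestos, no provienen del original) trata la lista before-landing como eventos disparados en el tiempo: con un ciclo de tren de 10 segundos (simulador) los tres ítems se completan en orden; con un ciclo de 30 segundos (aeronave envejecida), el ítem ―Flaps FULL‖ vence primero y el armado de los spoilers cae por la grieta.

# Bosquejo hipotético de la "distorsión de tiempos" descrita en el texto.
# Cada ítem de la lista before-landing se vuelve posible cuando ocurre su evento disparador.
def items_completados(segundos_ciclo_tren):
    eventos = [
        (0, "Gear down seleccionado (glide slope vivo)"),
        (segundos_ciclo_tren, "Spoilers armed (tren abajo y asegurado)"),
        (15, "Flaps FULL (glide slope capturado)"),
    ]
    completados = []
    # La tripulación avanza por la lista en el orden en que los eventos ocurren;
    # una vez ejecutado "Flaps FULL", el flujo sigue con los ítems posteriores de la
    # aproximación y el ítem de spoilers, que quedó "atrás" en el tiempo, ya no se retoma.
    for t, item in sorted(eventos, key=lambda e: e[0]):
        if item.startswith("Spoilers") and any("Flaps" in hecho for hecho in completados):
            continue
        completados.append(item)
    return completados

print(items_completados(10))  # simulador: los tres ítems, en orden
print(items_completados(30))  # aeronave envejecida: "Spoilers armed" nunca se ejecuta

El bosquejo no modela la cabina real; sólo muestra que, con los mismos procedimientos y la misma tripulación, el resultado depende del tiempo de ciclo del tren – una contribución difícil de clasificar como puramente humana o puramente mecánica.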
Con tanta contribución mecánica alrededor (diseño del tren de aterrizaje, sistema hidráulico erosionado, diferencias entre el simulador y la aeronave real, distribución de las palancas del cockpit, falta de un sistema de advertencia de los spoilers durante la aproximación, pausas en los procedimientos) y una contribución de planificación estocástica (dos capitanes en este vuelo), una falla mecánica de mucho mayor magnitud podría ser adherida a la ecuación para rebalancear la contribución humana. Pero eso todavía es dualista. AL reensamblar las partes que encontramos entre procedimientos, pausas, erosión mecánica, intercambios de diseño, podemos comenzar a preguntar donde realmente terminan las contribuciones mecánicas, y donde comienzan las contribuciones humanas. La frontera ya no está tan clara. La carga impuesta por un viento de 180 nudos en la rueda de nariz se transfiere a un débil procedimiento: primero el tren, luego los spoilers. La rueda de nariz, desplegándose al viento y equipada con un switch de compresión, es incapaz de acarrear esa carga y garantizar que los spoilers no se extenderán, por lo que en su lugar, un procedimiento tiene que llevar la carga. La palanca del spoiler está ubicada en una forma que hace difícil su verificación, y un sistema de advertencia para spoilers no armados no se encuentra instalado. Nuevamente, el error está suspendido, inestable, entre la intención humana y el hardware de ingeniería – pertenece a ambos y a ninguno únicamente. Y entonces está esto: El desgaste gradual de un sistema hidráulico no es algo que haya sido tomado en cuenta durante la certificación del jet. Un MD-80 con un sistema hidráulico anémico que toma más de medio minuto para llevar todo el tren fuera, abajo y asegurado, violando el requerimiento de diseño original por un factor de tres, aún se considera aeronavegable.
  • 16. El sistema hidráulico desgastado no puede ser considerado una falla mecánica. No deja al jet en tierra. Ni tampoco lo hace la palanca del spoiler de difícil verificación, ni la falta de un sistema de advertencia durante la aproximación. El jet fue certificado como aeronavegable con o sin todo ello. Que no haya falla mecánica, en otras palabras, no es porque no existan asuntos mecánicos. No existe falla mecánica por los sistemas sociales, hechos por los fabricantes, reguladores, y operadores prospectivos – formados indudablemente por preocupaciones prácticas y expresados a través de juicio de ingeniería situado con incertidumbre sobre el desgaste futuro – decidieron que ahí no podría haber ninguno (al menos no relacionado con los asuntos ahora identificados en la corrida de un MD-80). ¿Dónde termina la falla mecánica y comienza el error humano? Al excavar sólo lo suficientemente profundo, la pregunta se vuelve imposible de responder. RES EXTENSA Y RES COGITANS, ANTIGUO Y NUEVO Separar res extensa de res cogitans, como hizo Descartes, es artificial. No es el resultado de procesos o condiciones naturales, sino más bien una imposición de una visión global. Esta visión global, debido inicialmente al progreso científico acelerante, está comenzando a estorbar seriamente en nuestro entendimiento. En los accidentes modernos, las causas mecánicas y humanas están desenfocadas. La disyunción entre mundos materiales y mentales, y el requerimiento para describirlas diferente y separadamente, están debilitando nuestros esfuerzos para comprender el éxito y la falla sociotécnicos. La distinción entre las visiones nueva y antigua del error humano, la que también fue hecha antes en ―Field Guide to Human Error Investigations” (Dekker, 2002) realmente se conduce ásperamente sobre esas sutilezas. Revisemos cómo la investigación en el incidente de la corrida sobre pista encontró ―quiebres en CRM‖ como un factor causal. Este es el pensamiento de la visión antigua. Alguien, en este caso un piloto, o más bien una tripulación de dos pilotos, olvidó armar los spoilers. Este fue un error humano, una omisión. Si ellos no hubiesen olvidado armar los spoilers, el accidente no habría ocurrido, fin de la historia. Pero tal análisis de la falla no se prueba bajo las variables superficiales inmediatamente visibles de una secuencia de eventos. Como Perrow (1984) señaló, sólo juzga dónde la gente debió hacer ―zig‖, en vez de hacer ―zag‖. La vieja visión del error humano es sorprendentemente común. En la visión antigua, el error – por cualquiera de sus nombres (ej.: complacencia, omisión, quiebre de CRM) – es aceptado como una explicación satisfactoria. Esto es lo que la nueva visión del error humano trata de evitar. Ella ve al error humano como una consecuencia, como el resultado de fallas y problemas más profundos dentro de los sistemas en que las personas trabajan. Se resiste a ver al error humano como la causa. Por sobre juzgar a las personas por no hacer lo que debieron hacer, la nueva visión presenta herramientas para explicar por qué las personas hicieron lo que hicieron. El error humano se vuelve un punto de partida, no una conclusión. En el caso ―spoiler‖, el error es un resultado de intercambios de diseño, erosión mecánica, vulnerabilidades de los procedimientos y estocástica operacional. Por cierto, el compromiso de la nueva visión es resistir las ordenadas y condensadas versiones en las que la elección humana o una parte mecánica fallada, guió a la estructura completa en el camino a la perdición. 
La distinción entre las visiones antigua y nueva es importante y necesaria. Sin embargo, incluso en la nueva visión el error es todavía un efecto, y los efectos son el lenguaje de Newton. La nueva visión implícitamente concuerda con la existencia, la realidad del error. Se ve al error como algo que está allá afuera, en el mundo, y causado por algo más, también allá fuera, en el mundo. Como muestran los capítulos siguientes, tal (ingenua) posición realista es tal vez insostenible. Volvamos a cómo el universo Newtoniano-Cartesiano consta de ―todos‖ que pueden ser explicados y controlados al quebrarlos en partes constituyentes y sus interconexiones (por ejemplo, humanos y máquinas, bordes suaves y bordes agudos, culturas de seguridad y culturas de culpa). Los sistemas están hechos de componentes, y de lazos de tipo mecánico entre esos componentes. Esto descansa en la fuente de la elección entre causa humana y causa material (¿es error humano o falla mecánica?). Es Newtoniano en que se busque una causa para cualquier efecto observado, y Cartesiano en su dualismo. De hecho, ello expresa tanto el dualismo de Descartes (ya sea mental o material: Usted no puede mezclar los dos) y la noción de descomposición, donde las propiedades e interacciones de orden bajo determinan completamente a todos los fenómenos. Ellos son suficientes; usted no necesita más. El analizar qué bloques constituyentes van hacia el problema, y como ellos se suman, es necesario y suficiente para comprender por qué ocurren los problemas. La ecuación 1 es un reflejo de la suficiencia explicatoria asumida por las propiedades de orden bajo. Agregue las contribuciones individuales, y se desenrollará la respuesta a por qué ocurrió el problema. Una corrida de una aeronave sobre la pista hoy puede ser entendida al partir las contribuciones en causas humanas y mecánicas, analizando las propiedades e interacciones de cada una de ellas y entonces reensamblándolas de vuelta en un ―todo‖.
  • 17. ―Error humano‖ aparece como la respuesta. Si no existen contribuciones materiales, se espera que la contribución humana acarree la carga explicatoria completa. Mientras se siga logrando progreso utilizando esta visión mundial, no hay razón para cuestionarla. En varias esquinas de la ciencia, incluyendo factores humanos, muchas personas aún no ven razón para hacerlo. De hecho, no hay razón para que los modelos estructuralistas no puedan ser impuestos sobre el desordenado interior de los sistemas sociotécnicos. Que estos sistemas, sin embargo, revelen propiedades tipo máquina (componentes e interconexiones, capas y agujeros) cuando los abrimos ―post mortem‖ no significa que sean máquinas, o que, en vida, hayan crecido y se hayan comportado como máquinas. Como Leveson (2002) señaló, la reducción analítica asume que la separación de un todo en partes constituyentes es practicable, que los subsistemas operan independientemente, y que los resultados del análisis no se distorsionan al separar el todo en partes. Esto, a su vez, implica que los componentes no están sujetos a loops de retroalimentación ni a otras interacciones no lineales, y que son esencialmente los mismos al ser examinados en forma aislada o al formar parte del todo. Más aún, asume que los principios que gobiernan el ensamble de los componentes en el todo son directos, y que las interacciones entre componentes son lo suficientemente simples como para ser consideradas por separado del comportamiento del todo. ¿Son válidas estas presunciones cuando tratamos de comprender los accidentes sistémicos? Los próximos capítulos nos dan que pensar. Tomemos por ejemplo el reto de la deriva hacia la falla y la naturaleza evasiva de los accidentes que ocurren sobre un nivel de seguridad de 10⁻⁷. Esos accidentes no ocurren sólo a causa de fallas de componentes; sin embargo, nuestros modelos mecánicos del funcionamiento organizacional o humano nunca pueden capturar los procesos orgánicos, relacionales, que gradualmente empujan a un sistema sociotécnico hacia el borde del quiebre. Observar las fallas de componentes, tales como los ―errores humanos‖ que buscan muchos métodos populares de categorización, puede ser engañoso respecto de lo que nos dice sobre la seguridad y el riesgo en sistemas complejos. Hay un consenso creciente de que nuestros esfuerzos y modelos vigentes serán incapaces de romper la asíntota, el aplanamiento de nuestro progreso en seguridad, para ir más allá de 10⁻⁷. ¿Es la visión estructuralista y mecánica de los sistemas sociotécnicos, donde vemos componentes y lazos y sus fallas, apropiada siquiera para hacer un progreso real? Capítulo 2. ¿Por qué fallan los sistemas seguros? Los accidentes en realidad no ocurren muy a menudo. La mayoría de los sistemas de transporte en el mundo desarrollado son seguros, o incluso ultra seguros. La tasa de accidentes fatales es menor a 10⁻⁷, lo que significa una probabilidad de muerte, pérdida seria de propiedad o devastación económica o medioambiental de menos de uno en 10.000.000 (Amalberti, 2001). Al mismo tiempo, ésta aparece como una frontera mágica. Ningún sistema de transporte ha encontrado una forma de ser aún más seguro. El progreso en la seguridad más allá de 10⁻⁷ es evasivo. 
Como ha señalado Rene Amalberti, las extensiones lineales de los esfuerzos de seguridad vigentes (reporte de incidentes, gestión de seguridad y calidad, verificación de pericia, estandarización y procedimentalización, más reglas y regulaciones) parecen ser de poca utilidad para quebrar la asíntota, incluso si son necesarias para mantener el nivel de seguridad de 10⁻⁷. Aún más intrigante, los accidentes que ocurren en esta frontera parecen ser de un tipo difícil de predecir utilizando la lógica que gobierna el pensamiento de seguridad hasta 10⁻⁷. Es aquí donde las limitaciones de un vocabulario estructuralista se vuelven más evidentes. Los modelos de accidente que se apoyan ampliamente en fallas, agujeros, violaciones y deficiencias pueden tener dificultades para acomodar accidentes que parecen surgir de lo que a todos les parece gente normal, realizando trabajo normal en organizaciones normales. El misterio, sin embargo, es que en las horas, días o incluso años previos a un accidente más allá de 10⁻⁷ puede no haber habido reporte alguno de fallas ni deficiencias organizacionales notorias. Los reguladores, así como también el personal interno, típicamente no ven personas violando reglas, ni tampoco descubren otras fallas que pudieran dar motivo para detener o reconsiderar seriamente las operaciones. Si tan sólo fuera así de simple. Y hasta 10⁻⁷ probablemente lo es. Pero cuando las fallas serias ya no están precedidas por otras fallas serias, la predicción de accidentes se vuelve mucho más difícil. Y modelarlos con la ayuda de nociones mecánicas y estructuralistas puede servir de poco. El mayor riesgo residual en los sistemas sociotécnicos seguros de hoy en día es la deriva hacia la falla. La deriva hacia la falla es un movimiento lento e incremental de las operaciones de los sistemas hacia el borde de sus envolturas de seguridad. Las presiones de escasez y competencia típicamente alimentan esta deriva.
  • 18. Tecnología no acertada y conocimiento incompleto sobre dónde están realmente los límites, resulta en que las personas no detengan la deriva o incluso no la perciban. El accidente del Alaska Airlines 261 del año 2000 es altamente instructivo en este sentido. El MD-80 se estrelló en el océano fuera de California luego que se rompiera el sistema de trim en su cola. En la superficie, el accidente parece ajustarse a la simple categoría que ha venido a dominar las estadísticas recientes de accidentes: fallas mecánicas como resultado de mantenimiento pobre: Un componente particular falló a causa de que las personas no lo mantuvieron adecuadamente. De hecho, hubo una falla catastrófica de un componente particular. Una falla mecánica, en otras palabras. El quiebre volvió incontrolable a la aeronave inmediatamente, y la envió en picada hacia el Pacífico. Pero tales accidentes no ocurren sólo porque alguien súbitamente erra o algo súbitamente se quiebra: Se supone que hay demasiada protección construida contra los efectos de fallas particulares. ¿Qué tal si estas estructuras protectoras contribuyan en sí mismas a la deriva, de algunas formas inadvertida, desapercibida y difícil de detectar? ¿Que tal si la complejidad social organizada que rodea a la operación tecnológica, todos los comités de mantenimiento, grupos de trabajo, intervenciones regulatorias, aprobaciones e inputs del fabricante, que se supone deben proteger al sistema del quiebre, en realidad hayan contribuido a fijar su curso hacia el borde del envoltorio? Desde ―Man-Made Disasters‖, de Barry Turner, 1978, sabemos explícitamente que los accidentes están incubados en sistemas complejos y bien protegidos. El potencial para un accidente se acumula en el tiempo, pero esta acumulación, esta firme deslizada hacia el desastre, generalmente pasa irreconocida por aquellos en el interior, e incluso por aquellos en el exterior. Así que, el Alaska 261 no sólo es una falla mecánica, incluso si eso es lo que mucha gente pudiera querer ver como eventual resultado (y causa aproximada del accidente). El Alaska 261 es un poco de desacierto tecnológico, algo de adaptaciones graduales y algo de deriva hacia la falla. Es acerca de las influencias mutuas e inseparables entre el mundo mecánico y social, y deja completamente expuesto lo inadecuado de nuestros modelos vigentes en factores humanos y seguridad de sistemas. PERNOS Y JUICIOS DE MANTENIMIENTO. En el Alaska 261, la deriva hacia el accidente que ocurrió el 2000, había comenzado décadas antes. Se remonta hasta los primeros vuelos del Douglas DC-9, que precedió al tipo MD-80. Como (casi) todas las aeronaves, este tipo tiene un estabilizador horizontal (o plano de cola, una pequeña ala) en la parte posterior, que contribuye a dirigir la sustentación creada por las alas. Este pequeño plano de cola es el que mantiene arriba la nariz de una aeronave: Sin el, no es posible el vuelo controlado (ver Fig. 2.1). El plano de cola en sí mismo, puede angular hacia arriba o abajo para inclinar la nariz hacia arriba o hacia abajo (y consecuentemente, hacer que al avión vaya hacia arriba o hacia abajo). En la mayoría de las aeronaves, el sistema de compensación puede ser dirigido por el piloto automático y por inputs de la tripulación. 
El plano de cola está engoznado en la parte posterior, mientras que el extremo delantero se arquea hacia arriba o hacia abajo (también tiene superficies de control en la parte posterior que están conectadas a la columna de control en el cockpit, pero ellas no están tratadas aquí). Fig. 2.1. Ubicación del estabilizador horizontal en un avión tipo MD-80.
  • 19. La presión sobre el extremo frontal del estabilizador horizontal, hacia arriba o hacia abajo, es ejercida mediante un perno giratorio y una tuerca. La estructura completa trabaja un poco como una gata de automóvil utilizada para levantar un vehículo, por ejemplo, al cambiar un neumático: Usted gira la manivela y el perno rota, acercando la tuerca de la parte superior, por así decirlo, y levantando el auto (ver Fig. 2.2). Fig. 2.2. Trabajo simplificado del mecanismo de trim de la cola de un avión tipo MD-80. El estabilizador horizontal está engoznado en la parte posterior y conectado al perno mediante la tuerca; el estabilizador se desplaza arriba y abajo por la rotación del perno. En el sistema de trim del MD-80, el extremo frontal del estabilizador horizontal está conectado a una tuerca que recorre un perno vertical hacia arriba y hacia abajo. Un motor de trim eléctrico rota el perno, el que a su vez sube o baja la tuerca. La tuerca, entonces, empuja al estabilizador horizontal completo hacia arriba o hacia abajo. La lubricación adecuada es crítica para el funcionamiento de un montaje de perno y tuerca. Sin grasa suficiente, el roce constante destruirá el hilo de la tuerca o bien del perno (en este caso, el perno está hecho deliberadamente de un material más duro, por lo que se daña primero el hilo de la tuerca). El hilo, en realidad, lleva la carga completa que se impone sobre el estabilizador horizontal durante el vuelo. Es una carga de alrededor de 5.000 libras, similar al peso de una van familiar completa colgando del hilo de un montaje de perno y tuerca. Cuando el hilo se desgasta por completo en un MD-80, la tuerca ya no puede sujetar los hilos del perno. Las fuerzas aerodinámicas forzarán entonces al plano horizontal de cola (y a la tuerca) hasta su tope, fuera del rango normal de recorrido, volviendo a la aeronave incontrolable en el eje de cabeceo (pitch), que es esencialmente lo que le ocurrió al Alaska 261. Incluso el tope falló a causa de la carga. Una suerte de tubo de torque corre a través del perno para generar redundancia (en vez de tener dos pernos, como en el modelo DC-8, que lo precedía). Pero incluso el tubo de torque falló en el Alaska 261. Desde luego, se suponía que nada de esto debía ocurrir. En el primer lanzamiento de la aeronave, a mediados de los años sesenta, Douglas recomendó que los operadores lubricaran la estructura del perno cada 300 a 350 horas de vuelo. Para el uso comercial típico, eso podía significar dejar el avión en tierra para tal mantenimiento cada pocas semanas. Inmediatamente, los sistemas organizacionales y sociotécnicos alrededor de la operación de la tecnología comenzaron a adaptarse, y fijaron al sistema en su curso hacia la deriva. Por medio de una variedad de cambios y desarrollos en la guía de mantenimiento para las aeronaves de la serie DC-9/MD-80, se extendió el intervalo de lubricación. Como veremos más adelante, estas extensiones difícilmente fueron producto de las recomendaciones del fabricante por sí solas, si es que lo fueron. Una red mucho más compleja y en constante evolución, de comités con representantes de reguladores, fabricantes, subcontratistas y operadores, estuvo en el corazón de un desarrollo fragmentado y discontinuo de niveles de mantenimiento, documentación y especificaciones.
La racionalidad de las decisiones sobre los intervalos de mantenimiento fue producida en forma relativamente local, a partir de información emergente e incompleta sobre lo que era, con todos sus engaños básicos, una tecnología no acertada. Si bien cada decisión fue localmente racional, teniendo sentido para quienes tomaron las decisiones en su tiempo y lugar, el cuadro global se volvió uno de deriva hacia el desastre, y de deriva significativa. Comenzando con un intervalo de lubricación de 300 horas, el intervalo al momento del accidente del Alaska 261 se había movido hasta 2.550 horas, casi un orden de magnitud mayor. Como es típico en la deriva hacia la falla, esta distancia no fue sorteada de una sola vez. El desliz fue incremental: paso a paso, decisión por decisión. En 1985, la lubricación del perno debía cumplirse cada 700 horas, es decir, en cada B check por medio (los llamados B checks ocurrían cada 350 horas de vuelo). En 1987, el intervalo del B check fue incrementado a 500 horas de vuelo, llevando los intervalos de lubricación a 1.000 horas. En 1988 fueron eliminados todos los B checks, y las tareas a cumplir se distribuyeron entre los checks A y C. La lubricación de la estructura del perno debía cumplirse cada octavo A check de 125 horas: aún cada 1.000 horas de vuelo. Pero, en 1991, los intervalos de A check fueron extendidos a 150 horas, dejando la lubricación para cada 1.200 horas. Tres años más tarde, el intervalo del A check fue extendido nuevamente, esta vez a 200 horas. La lubricación ocurriría ahora cada 1.600 horas de vuelo. En 1996, la tarea de lubricación de la estructura del perno fue removida del A check y trasladada a una especie de tarjeta de tarea que especificaba la lubricación cada 8 meses. Ya no estaba acompañada por un límite de horas de vuelo. Para Alaska Airlines, 8 meses se tradujeron en alrededor de 2.550 horas de vuelo.
Fig. 2.3. Deriva hacia la falla durante décadas: el intervalo de lubricación del perno (en horas de vuelo, entre 1966 y 2000) se fue extendiendo gradualmente, casi en un factor de 10, hasta el accidente del Alaska Airlines 261.
Sin embargo, el perno recuperado del fondo oceánico no revelaba ninguna evidencia de haber tenido una lubricación adecuada en el intervalo previo. Pudieron haber pasado más de 5.000 horas desde que recibiera una capa de grasa fresca (ver Fig. 2.3). Con tanta lubricación como había sido recomendada originalmente, Douglas pensó que no había razón para preocuparse por un daño en el hilo. Así que, en un comienzo, el fabricante no proveyó ni recomendó chequeo alguno del hilo de la estructura del perno. Se suponía que el sistema de trim debía acumular 30.000 horas de vuelo antes de que necesitara reemplazo. Pero la experiencia operacional reveló un cuadro diferente.
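A modo ilustrativo, la secuencia de extensiones descrita arriba puede resumirse numéricamente. El siguiente bosquejo mínimo en Python (los intervalos y años provienen del texto; el formato de salida es solo un supuesto de presentación) muestra cómo cada paso fue un incremento modesto respecto de la norma anterior, mientras que el efecto acumulado se acercó a un orden de magnitud.

```python
# Deriva del intervalo de lubricación del perno (horas de vuelo), según el texto.
intervalos = [
    ("recomendación original, mediados de los 60", 300),
    ("1985: cada B check por medio", 700),
    ("1987: B check extendido a 500 h", 1000),
    ("1991: A check extendido a 150 h", 1200),
    ("1994: A check extendido a 200 h", 1600),
    ("1996: tarjeta de tarea cada 8 meses", 2550),
]

anterior = None
for etapa, horas in intervalos:
    if anterior is None:
        print(f"{etapa}: {horas} h")
    else:
        # Cada paso, visto localmente, es solo "ese poquito más"...
        print(f"{etapa}: {horas} h (x{horas / anterior:.2f} respecto del paso previo)")
    anterior = horas

# ...pero el efecto acumulado es casi un orden de magnitud.
print(f"Factor acumulado: x{2550 / 300:.1f}")
```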
Luego de que el DC-9 volara solo un año, Douglas recibió reportes de desgaste en el hilo de la estructura del perno significativamente superior a lo que había sido previsto. En respuesta, el fabricante recomendó que los operadores realizaran un "end-play check" en la estructura del perno en cada C check, o cada 3.600 horas de vuelo. El end-play check utiliza una fijación de restricción que pone presión sobre la estructura del perno, simulando la carga aerodinámica durante el vuelo normal. La cantidad de juego entre perno y tuerca, medida en milésimas de pulgada, puede ser leída entonces en un instrumento. El juego es una medida directa de la cantidad de desgaste del hilo. Desde 1985 en adelante, los end-play checks en Alaska estuvieron sujetos al mismo tipo de deriva que los intervalos de lubricación. En 1985, los end-play checks fueron programados para cada C check por medio, siendo que los C checks venían consistentemente alrededor de las 2.500 horas, bastante antes de las 3.600 horas de vuelo recomendadas, dejando el avión en tierra innecesariamente. Al programar un end-play check cada C check por medio, sin embargo, el intervalo fue extendido a 5.000 horas. En 1988, los intervalos entre C checks fueron extendidos a 13 meses, sin estar acompañados por un límite de horas de vuelo. Los end-play checks pasaron a efectuarse cada 26 meses, o alrededor de cada 6.400 horas. En 1996, los intervalos entre C checks fueron extendidos nuevamente, esta vez a 15 meses. Esto estiró las horas de vuelo entre end-play checks hasta alrededor de 9.550 horas. El último end-play check del avión del accidente fue realizado en la instalación de mantenimiento de la aerolínea en Oakland, California, en 1997. Para entonces, el juego entre el perno y la tuerca fue encontrado exactamente en el límite de 0,040 pulgadas. Esto introdujo una considerable incertidumbre. Con el juego en el límite permisible, ¿qué hacer? ¿Liberar la aeronave y realizar los cambios en la próxima oportunidad, o reemplazar las partes ahora? Las reglas no estaban claras. La llamada AOL 9-48A decía que "las estructuras de pernos podrían permanecer en servicio en tanto la medida de end-play se mantuviera dentro de las tolerancias (entre 0,003 y 0,040 pulgadas)" (National Transportation Safety Board, o NTSB, 2002, p. 29). Todavía estaba en 0,040 pulgadas, así que técnicamente la aeronave podía permanecer en servicio. ¿O no podía? ¿Qué tan rápido se produciría el desgaste del hilo desde allí en adelante? Luego de seis días, numerosos cambios de turno y otro end-play check más favorable, la aeronave fue liberada. No se reemplazaron partes: ni siquiera las había en stock en Oakland. La aeronave "partió a las 03:00 hora local. Hasta aquí, todo bien", señalaba el registro del fatídico cambio de turno (NTSB, 2002, p. 53). Tres años más tarde, el sistema de trim se fracturó y la aeronave desapareció en el océano, no muy lejos de allí. Entre las 2.500 y las 9.550 horas hubo más deriva hacia la falla (ver Fig. 2.4). Nuevamente, cada extensión tenía sentido local, y fue únicamente un incremento menor respecto de la norma establecida previamente. No se violaron normas, no se quebrantaron leyes. Incluso el regulador estuvo de acuerdo con los cambios en los intervalos de end-play check. Era gente normal realizando trabajo normal alrededor de una tecnología notablemente estable y normal. Las figuras de deriva hacia la falla son fáciles de dibujar en retrospectiva. También son fascinantes de observar.
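La decisión descrita más arriba (liberar o no la aeronave con un juego exactamente en el límite) se reduce a una verificación de tolerancia. El bosquejo siguiente usa los valores citados en el texto (0,003 a 0,040 pulgadas); la regla de decisión es solo un supuesto ilustrativo, no el procedimiento de la aerolínea, y muestra por qué una lectura exactamente en el límite deja la regla técnicamente cumplida pero la incertidumbre intacta.

```python
# Verificación ilustrativa del end-play contra la tolerancia citada en la AOL 9-48A.
# Límites tomados del texto; la regla de decisión es un supuesto, no el procedimiento real.

JUEGO_MIN = 0.003   # pulgadas
JUEGO_MAX = 0.040   # pulgadas

def dentro_de_tolerancia(juego: float) -> bool:
    """True si la medición queda dentro del rango permitido (límites inclusive)."""
    return JUEGO_MIN <= juego <= JUEGO_MAX

for medicion in (0.020, 0.040, 0.041):
    estado = "puede permanecer en servicio" if dentro_de_tolerancia(medicion) else "fuera de tolerancia"
    print(f"end-play = {medicion:.3f} pulgadas -> {estado}")
# Una lectura de 0.040 todavía "pasa", aunque nada dice sobre cuánto desgaste vendrá después.
```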
Sin embargo, las realidades que estas figuras de deriva representan no generaron una persuasión similar en aquellos que estaban en el interior del sistema en ese momento. ¿Por qué habría de resultar notoria esta degeneración numérica, estas cifras de chequeo y de servicio? Como una indicación, nunca se requirió a los técnicos de mantenimiento del MD-80 registrar o hacer un seguimiento del end-play en los sistemas de trim que ellos medían. Ni siquiera el fabricante había expresado interés en ver estos números, o la lenta y paulatina degeneración que ellos pudieran haber revelado. Si hubo deriva, en otras palabras, ninguna memoria organizacional o institucional podía saberlo. Los cuadros de deriva lo revelan. Pero no arrojan luz acerca del porqué. De hecho, el mayor enigma desde Turner (1978) ha sido dilucidar por qué el desliz hacia el desastre, tan fácil de ver y diagramar en retrospectiva, no lo notan aquellos que se lo autoinfligen. Juzgar que hubo una falta de previsión, después del hecho, es fácil: todo lo que se necesita es diagramar los números y percatarse del desliz hacia el desastre. Parado en medio de las ruinas, es fácil maravillarse de qué tan desorientada o desinformada debió haber estado la gente. Pero ¿por qué las condiciones conducentes a un accidente nunca fueron advertidas ni atendidas por aquellos en el interior del sistema, aquellos cuyo trabajo era que no ocurrieran tales accidentes? Mirar hacia adelante no es mirar hacia atrás. Existe una profunda revisión de la comprensión que tiene lugar en el presente. Convierte el otrora vago y poco probable futuro en un inmediato y cierto pasado.
FIG. 2.4. Más deriva hacia la falla: el intervalo de end-play check (que mide el daño al hilo de la estructura perno-tuerca) fue estirado desde 3.600 hasta 9.550 horas de vuelo.
El futuro, dijo David Woods (1993), se ve implausible antes de un accidente ("No, eso no nos va a ocurrir"). Pero luego de un accidente, el pasado parece increíble ("¡Cómo no pudimos percatarnos de que esto nos iba a ocurrir!"). Lo que ahora parece extraordinario, fue ordinario una vez. Las decisiones, intercambios, preferencias y prioridades que se ven tan fuera de lo ordinario e inmorales luego de un accidente, fueron una vez normales y de sentido común para aquellos que contribuyeron a su incubación.
BANALIDAD, CONFLICTO E INCREMENTALISMO.
La investigación sociológica (ej., Perrow, 1984; Vaughan, 1996; Weick, 1995), así como también el trabajo actual en factores humanos (Rasmussen & Svedung, 2000) y la investigación en seguridad de sistemas (Leveson, 2002), ha comenzado a dibujar los contornos de respuestas sobre el porqué de la deriva. A pesar de ser diferentes en trasfondo, pedigrí y muchos detalles sustantivos, estos trabajos convergen en puntos en común importantes acerca de la deriva hacia la falla. El primero es que los accidentes, y la deriva que los precede, están asociados a gente normal realizando trabajo normal en organizaciones normales, no con los banales atractivos de la desviación inmoral. Podemos llamar a esto la tesis de la banalidad de los accidentes. Segundo, la mayoría de los trabajos tiene en su corazón un modelo de conflicto: las organizaciones que realizan trabajo crítico para la seguridad esencialmente están tratando de reconciliar metas irreconciliables (permanecer seguras y permanecer en el negocio). Tercero, la deriva hacia la falla es incremental. Los accidentes no ocurren súbitamente, ni se encuentran precedidos por decisiones monumentalmente malas o por inmensos pasos lejos de las normas imperantes. La tesis de la banalidad de los accidentes dice que el potencial para tener un accidente surge como un producto normal de realizar negocios normales bajo presiones normales de escasez de recursos y competencia.
Ningún sistema es inmune a las presiones de escasez y competencia; bueno, casi ninguno. El único sistema que alguna vez se ha aproximado a trabajar en un universo de recursos ilimitados fue la NASA durante los primeros años del Apollo (un hombre tenía que ser puesto en la Luna, cualquiera fuera el costo). Ahí había abundancia de dinero y de talento altamente motivado. Pero incluso ahí hubo tecnología no acertada, faltas y fallas nada fuera de lo común, y pronto las restricciones se impusieron rápida y estrechamente. Los recursos humanos y el talento comenzaron a drenarse hacia fuera. De hecho, incluso las empresas no comerciales conocen la escasez de recursos: agencias de gobierno como la NASA o los reguladores de seguridad pueden carecer del financiamiento, el personal o la capacidad adecuados para hacer lo que necesitan hacer. Con respecto al accidente del Alaska 261, por ejemplo, un nuevo programa regulatorio de inspección, llamado "Air Transportation Oversight System (ATOS)", fue puesto en uso en 1998 (2 años antes). Redujo drásticamente la cantidad de tiempo que los inspectores tenían para las actividades reales de supervigilancia. Un memo de 1999 de un supervisor administrativo del regulador en Seattle ofrecía una mirada al interior:
No somos capaces de satisfacer apropiadamente las demandas de carga de trabajo. Alaska Airlines ha expresado continua preocupación sobre nuestra incapacidad de servirle de manera oportuna. Algunas aprobaciones en el programa han sido demoradas o cumplidas en forma apresurada, a la "undécima hora", y anticipamos que este problema se intensificará con el tiempo. Además, muchas investigaciones administrativas… han sido demoradas como resultado de recortes de recursos. (Si el regulador) continúa operando con el limitado número de inspectores de aeronavegabilidad existente… la supervigilancia disminuida es inminente y el riesgo de incidentes o accidentes en Alaska Airlines se intensifica. (NTSB, 2002, p. 175)
En adaptación a la presión de recursos, las aprobaciones eran demoradas o apresuradas, y la supervigilancia, reducida. Sin embargo, hacer negocios bajo presiones de escasez de recursos es normal: la escasez y la competencia son parte integral incluso del trabajo de inspección. Pocos reguladores en cualquier parte podrán alguna vez afirmar que tienen el tiempo y los recursos de personal adecuados para cumplir con sus mandatos. Sin embargo, el hecho de que la presión de recursos sea normal no significa que no tenga consecuencias. Desde luego, la presión encuentra vías de escape. Los supervisores escriben memos, por ejemplo. Se pelean batallas sobre los recursos. Se hacen intercambios. La presión se expresa a sí misma en las discusiones políticas sobre recursos y primacía, en las preferencias gerenciales por ciertas actividades e inversiones sobre otras, y en casi todos los compromisos de ingeniería y de operaciones entre resistencia y costo, entre eficiencia y diligencia. De hecho, el trabajo exitoso bajo presiones y restricciones de recursos es una fuente de orgullo profesional: construir algo que es fuerte y ligero, por ejemplo, marca al experto en ingeniería aeronáutica. Concebir y dar a luz un sistema que tenga bajos costos de desarrollo y bajos costos operacionales (típicamente, uno es inverso al otro) es el sueño de la mayoría de los inversionistas y de muchos administradores.
Ser capaz de crear un programa que supuestamente permita mejores inspecciones con menos inspectores puede ganar elogios del servicio civil y oportunidades de ascenso, mientras que los efectos colaterales negativos del programa se sienten primariamente en una lejana oficina de campo. Sin embargo, el motor más grande de la deriva se esconde en alguna parte de este conflicto, en esta tensión entre operar con seguridad y operar del todo, entre construir con seguridad y construir del todo. Esta tensión entrega la energía detrás del lento y constante desacople de la práctica respecto de las normas establecidas previamente o de las restricciones de diseño. Este desacople puede eventualmente convertirse en deriva hacia la falla. A medida que un sistema se pone en uso, este aprende, y a medida que aprende, se adapta:
La experiencia genera información que permite a las personas afinar su trabajo: el afinamiento compensa los problemas y peligros descubiertos, remueve la redundancia, elimina los gastos innecesarios y expande las capacidades. La experiencia a menudo capacita a las personas para operar un sistema sociotécnico a un costo mucho menor, o para obtener resultados mucho mayores que los asumidos por el diseño inicial. (Starbuck & Milliken, 1988, p. 333)
Esta deriva de ajuste fino hacia los márgenes de la seguridad operacional es un testimonio de los límites del vocabulario estructuralista de seguridad de sistemas en boga hoy en día. Pensamos en las culturas de seguridad como culturas de aprendizaje: culturas que están orientadas hacia el aprendizaje a partir de eventos e incidentes. Pero las culturas de aprendizaje no son ni únicas (ya que cada sistema abierto en un ambiente dinámico necesariamente aprende y se adapta), ni tampoco necesariamente positivas: Starbuck y Milliken vislumbraron cómo una organización puede ir consumiendo la seguridad mientras alcanza ganancias en otras áreas. La deriva hacia la falla no podría ocurrir sin aprendizaje. Siguiendo esta lógica, los sistemas que son malos en el aprendizaje y malos en la adaptación bien pueden tener una menor tendencia a derivar hacia la falla.
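La dinámica de "ajuste fino" que describen Starbuck y Milliken puede bosquejarse como un lazo simple: cada ciclo de operación sin consecuencias visibles se interpreta como confirmación de que la restricción puede relajarse un poco más, aunque el riesgo real (invisible para quien decide) crece con cada relajación. Es un modelo de juguete con parámetros completamente supuestos, no una reconstrucción del caso.

```python
# Juguete: erosión gradual de una restricción de seguridad bajo aparente éxito operacional.
# Parámetros supuestos; solo ilustra la dinámica de ajuste fino descrita en el texto.
import random

random.seed(7)
intervalo = 300.0           # norma inicial en horas (cifra tomada del texto)
factor_estiramiento = 1.15  # cada decisión local estira el intervalo solo "un poquito más"
ciclos = 20

for ciclo in range(1, ciclos + 1):
    # Riesgo real (supuesto) que crece con el intervalo, pero que nadie observa directamente.
    p_falla = min(1.0, intervalo / 50_000)
    if random.random() < p_falla:
        print(f"Ciclo {ciclo}: falla con un intervalo de {intervalo:.0f} h")
        break
    # Sin evento adverso: el éxito aparente "confirma" que se puede relajar la restricción.
    intervalo *= factor_estiramiento
else:
    print(f"Sin falla en {ciclos} ciclos; el intervalo derivó hasta {intervalo:.0f} h")
```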
Un ingrediente crítico de este aprendizaje es la aparente insensibilidad a la evidencia acumulada que, desde la posición retrospectiva externa, podría haber mostrado cuán deficientes eran en realidad el juicio y las decisiones. Así es como se ve desde la posición retrospectiva externa: la retrospectiva externa ve una falla de previsión. Desde el interior, sin embargo, lo anormal es bastante normal, y hacer intercambios en dirección a una mayor eficiencia no es para nada inusual. Sin embargo, al hacer estas concesiones, hay un desbalance de retroalimentación. La información sobre si una decisión es eficiente o efectiva en costos puede ser relativamente fácil de obtener. Una hora de arribo adelantada es calculable y tiene beneficios tangibles, inmediatos. Cuánto se tomó prestado de la seguridad para alcanzar esa meta, sin embargo, es mucho más difícil de cuantificar y comparar. Si estuvo seguido por un aterrizaje seguro, aparentemente debe haber sido una decisión segura. De manera similar, extender el intervalo de lubricación ahorra inmediatamente una cantidad mesurable de tiempo y dinero, mientras se toma prestado del futuro de una estructura del perno aparentemente libre de problemas. Cada éxito empírico consecutivo (la llegada temprana todavía es un aterrizaje seguro; la estructura del perno todavía es operacional) parece confirmar que el ajuste fino está funcionando bien: el sistema puede operar igualmente seguro y, sin embargo, más eficientemente. Como señaló Weick (1993), sin embargo, la seguridad en esos casos puede no haber sido en absoluto el resultado de las decisiones que fueron o no fueron tomadas, sino más bien una variación estocástica subyacente, que pende de una serie de otros factores, muchos de ellos difícilmente dentro del control de aquellos involucrados en el proceso de ajuste fino. El éxito empírico, en otras palabras, no es prueba de seguridad. El éxito pasado no garantiza la seguridad futura. Tomar prestado más y más de la seguridad puede estar bien por un tiempo, pero nunca se sabe cuándo va a golpear. Esto llevó a Langewiesche (1998) a decir que la ley de Murphy estaba equivocada: todo lo que puede ir mal normalmente va bien, y entonces sacamos la conclusión equivocada. La naturaleza de esta dinámica, de este ajuste fino, de esta adaptación, es incremental. Las decisiones organizacionales que son vistas como "malas decisiones" luego del accidente (incluso aquellas que se veían como ideas perfectamente buenas en su momento) son rara vez pasos grandes, riesgosos o de gran magnitud. Más bien, existe una sucesión de decisiones crecientemente malas, una larga y continua progresión de pasos incrementales pequeños que, insospechadamente, llevan a una organización hacia el desastre. Cada paso que se aleja de la norma original y que se encuentra con el éxito empírico (y sin un sacrificio obvio de la seguridad) es utilizado como la próxima base desde la cual apartarse solo ese poquito más, de nuevo. Es este incrementalismo el que hace tan difícil distinguir entre lo anormal y lo normal. Si la diferencia entre lo que "debió haberse hecho" (o lo que se hizo exitosamente ayer) y lo que se hace exitosamente hoy es diminuta, entonces no parece valer la pena comentar ni reportar esta suave partida desde una norma establecida previamente. El incrementalismo es acerca de normalización continuada: permite la normalización y la racionaliza.
Deriva hacia la falla y reporte de incidentes. ¿No pueden los reportes de incidentes revelar una deriva hacia la falla?
Esto parece ser un rol natural del reporte de incidentes, pero no es así de simple. La normalización que acompaña a la deriva hacia la falla (un end-play check cada 9.550 horas es "normal", incluso aprobado por el regulador, sin importar que el intervalo original era de 2.500 horas) interfiere severamente con la habilidad de los integrantes de la organización para definir incidentes. ¿Qué es un incidente? Antes de 1985, fallar en realizar un end-play check cada 2.500 horas podría haber sido considerado un incidente y, suponiendo que la organización tuviera un medio para reportarlo, podría incluso haber sido reportado como tal. Pero para 1996, la misma desviación era normal, incluso regulada. Para 1996, la misma falla ya no era un incidente. Y hubo mucho más. ¿Por qué reportar que la lubricación de la estructura del perno a menudo tiene que ser hecha en la noche, en la oscuridad, fuera del hangar, de pie en el pequeño canasto de un camión elevador, a gran altura del suelo, incluso bajo la lluvia? ¿Por qué reportar que usted, como mecánico de mantenimiento, tenía que abrirse camino torpemente a través de dos pequeños paneles de acceso que difícilmente dejaban lugar para una mano humana (dejando espacio apenas para que los ojos vieran lo que estaba ocurriendo dentro y qué tenía que ser lubricado), si es algo que se tiene que hacer todo el tiempo? En mantenimiento, esto es trabajo normal; es el tipo de actividad requerida para lograr que el trabajo se haga. El mecánico responsable de la última lubricación del avión accidentado dijo a los investigadores que él había tenido que usar una linterna de cabeza operada por baterías durante las tareas de lubricación nocturnas, de forma tal de tener sus manos libres y poder ver algo, a lo menos. Estas cosas son normales; no vale la pena reportarlas. No califican como incidentes. ¿Por qué reportar que los end-play checks son efectuados con una fijación de restricción (la única en toda la aerolínea, fabricada en casa, en ningún caso cercana a las especificaciones del fabricante), si eso es lo que se usa cada vez que se hace un end-play check?
¿Por qué reportar que los end-play checks, ya sea en la aeronave o en la mesa de trabajo, generan medidas que varían ampliamente, si eso es lo que hacen todo el tiempo, y si de eso se trata a menudo el trabajo de mantenimiento? Es normal; no es un incidente. Incluso si la aerolínea hubiera tenido una cultura de reporte, si hubiera tenido una cultura de aprendizaje, si hubiera tenido una cultura justa de forma tal que las personas se sintieran seguras al enviar sus reportes sin temor a represalias, estos no serían incidentes que ingresaran al sistema. Esta es la tesis de la banalidad de los accidentes. Estos no son incidentes. En sistemas 10⁻⁷, los incidentes no preceden a los accidentes. Lo hace el trabajo normal. En estos sistemas:
los accidentes son de una naturaleza diferente de aquellos que ocurren en los sistemas seguros: en este caso los accidentes usualmente ocurren en ausencia de cualquier quiebre serio o incluso de cualquier error serio. Resultan de una combinación de factores, ninguno de los cuales puede por sí solo causar un accidente, o incluso un incidente serio; por lo tanto, estas combinaciones permanecen difíciles de detectar y de recuperar utilizando la lógica de análisis de seguridad tradicional. Por la misma razón, el reporte se vuelve menos relevante en la previsión de desastres mayores. (Amalberti, 2001, p. 112)
Incluso si dirigiéramos una fuerza analítica mayor sobre nuestras bases de datos de reportes de incidentes, esto aún podría no rendir ningún valor predictivo para los accidentes más allá de 10⁻⁷, simplemente debido a que los datos no están allí. Las bases de datos no contienen, en un formato visible, los ingredientes de los accidentes que ocurren más allá de 10⁻⁷. Aprender de los incidentes para prevenir los accidentes más allá de 10⁻⁷ bien podría ser imposible. Los incidentes son acerca de fallas y errores independientes, percibidos y perceptibles por personas en el interior. Pero estos errores y fallas independientes ya no hacen aparición en los accidentes que ocurren más allá de 10⁻⁷. La falla en ver adecuadamente la parte a lubricar (la parte crítica para la seguridad, de punto único, no redundante), la falla en la adecuada y buena realización de un end-play check: nada de esto aparece en los reportes de incidentes. Pero se estima "causal" o "contribuyente" en el reporte del accidente. La etiología de los accidentes en sistemas 10⁻⁷, entonces, bien puede ser fundamentalmente diferente de la de los incidentes, escondida esta vez en los riesgos residuales de hacer negocios normales bajo presiones normales de escasez y competencia. Esto significa que la llamada hipótesis de causa común (que sostiene que los accidentes y los incidentes tienen causas comunes, y que los incidentes son cualitativamente idénticos a los accidentes excepto por un pequeño paso) es probablemente errónea en 10⁻⁷ y más allá:
… Los reportes de accidentes como los de Bhopal, Flixborough, Zeebrugge y Chernobyl demostraron que ellos no habían sido causados por una coincidencia de fallas independientes y errores humanos. Fueron el efecto de una migración sistemática del comportamiento organizacional hacia el accidente, bajo la influencia de la presión hacia la efectividad de costos en un ambiente agresivo y competitivo. (Rasmussen y Svedung, 2000, p. 14)
A pesar de esta comprensión, los errores y fallas independientes siguen siendo el principal producto de cualquier investigación de accidentes hoy en día.
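El punto de Amalberti puede ilustrarse con una pequeña simulación. El siguiente modelo de juguete en Python (con probabilidades completamente supuestas) genera condiciones individualmente banales y muestra que el conteo de condiciones aisladas, que es lo que típicamente alimenta una base de reportes, no contiene la combinación rara que produce el accidente.

```python
# Juguete: los accidentes más allá de 10^-7 surgen de la coincidencia de condiciones
# individualmente banales; ninguna de ellas, por sí sola, constituye un "incidente".
# Probabilidades supuestas, solo para ilustrar la idea.
import random

random.seed(1)
N_VUELOS = 1_000_000
P_CONDICION = 0.01   # probabilidad de cada condición banal por vuelo (supuesta)
N_CONDICIONES = 4    # el accidente requiere que coincidan las cuatro

vuelos_con_alguna_condicion = 0  # lo que podría verse en una base de reportes
vuelos_con_la_combinacion = 0    # lo que realmente produce el accidente

for _ in range(N_VUELOS):
    presentes = sum(random.random() < P_CONDICION for _ in range(N_CONDICIONES))
    if presentes >= 1:
        vuelos_con_alguna_condicion += 1
    if presentes == N_CONDICIONES:
        vuelos_con_la_combinacion += 1

print(f"Vuelos con alguna condición banal: {vuelos_con_alguna_condicion}")
print(f"Vuelos con la combinación completa: {vuelos_con_la_combinacion}")
# Con estos supuestos, la combinación completa ocurre ~1e-8 por vuelo:
# prácticamente invisible entre los datos de condiciones aisladas, que rondan el 4%.
```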
El reporte de la NTSB de 2002, siguiendo la lógica newtoniano-cartesiana, habla de deficiencias en el programa de mantenimiento de Alaska Airlines, de defectos y descuidos regulatorios, de responsabilidades no cumplidas, de falencias, fallas y quiebres. Por supuesto, en retrospectiva ellos bien pueden ser solo eso. Y encontrar falencias y fallas está bien, ya que le da al sistema algo que reparar. ¿Pero por qué nadie en ese momento vio estas supuestamente tan aparentes fallas y falencias por lo que (en retrospectiva) eran? Aquí es donde el vocabulario estructuralista de los factores humanos y la seguridad de sistemas tradicionales es más limitado, y limitante. Los agujeros encontrados en las capas de defensa (el regulador, el fabricante, el operador, la instalación de mantenimiento y, finalmente, el técnico) son fáciles de descubrir una vez que los restos están dispersos delante de los pies de uno. En efecto, una crítica común a los modelos estructuralistas es que son buenos para identificar deficiencias o fallas latentes post mortem. Sin embargo, estas deficiencias y fallas no son vistas como tales, no son fáciles de observar como tales por aquellos en el interior (¡o incluso por aquellos relativamente en el exterior, como el regulador!) antes de que ocurra el accidente. De hecho, los modelos estructuralistas pueden capturar muy bien las deficiencias que resultan de la deriva: identifican acertadamente las fallas latentes o los patógenos residentes en las organizaciones, y pueden ubicar agujeros en las capas de defensa. Pero la construcción de las fallas latentes, si así se quiere llamar, no está modelada. El proceso de erosión, de desgaste de las normas de seguridad, de deriva hacia los márgenes, no puede ser capturado adecuadamente por los enfoques estructuralistas, porque estos son, inherentemente, metáforas de las formas resultantes, no modelos orientados a los procesos de formación. Los modelos estructuralistas son estáticos. Aunque los modelos estructuralistas de la década de 1990 son llamados a menudo "modelos de sistemas" o "modelos sistémicos", están muy lejos de aquello que en realidad se considera pensamiento de sistemas (por ejemplo, Capra, 1982). La parte de "sistemas" de los modelos estructuralistas se ha limitado hasta ahora a identificar y entregar un vocabulario para las estructuras superiores (los bordes suaves) detrás de la producción de errores en el borde agudo.
La parte de "sistemas" de estos modelos es un recordatorio de que existe un contexto, de que no podemos comprender los errores sin ir al trasfondo organizacional desde el que surgen. Todo esto es necesario, desde luego, mientras los errores sigan siendo tomados muy a menudo como la conclusión legítima de una investigación (basta mirar el caso del spoiler con el "quiebre en CRM" como causa). Pero recordarle a la gente el contexto no es sustituto de comenzar a explicar las dinámicas, los procesos incrementales sutiles que conducen al comportamiento eventualmente observado y lo normalizan. Esto requiere una perspectiva diferente para mirar el desordenado interior de las organizaciones, y un lenguaje diferente en el cual expresar las observaciones. Requiere que los factores humanos y la seguridad de sistemas busquen vías para moverse hacia el verdadero pensamiento de sistemas, donde los accidentes son vistos como una propiedad emergente de procesos transaccionales, ecológicos, orgánicos, más que como el mero punto final de una trayectoria a través de agujeros en barreras de defensa. Los enfoques estructuralistas, y las reparaciones a las que nos apuntan, no pueden contribuir mucho a realizar mayores progresos en seguridad:
Debemos ser extremadamente sensibles a las limitaciones de los remedios conocidos. Si bien la buena gestión y el buen diseño organizacional pueden reducir los accidentes en determinados sistemas, nunca podrán prevenirlos… Los mecanismos causales en este caso sugieren que las fallas en los sistemas técnicos podrían ser incluso más difíciles de evitar de lo que los más pesimistas entre nosotros pudieron haber creído. El efecto de fuerzas sociales no reconocidas e invisibles sobre la información, el conocimiento y, finalmente, la acción, es muy difícil de identificar y controlar. (Vaughan, 1996, p. 416)
Sin embargo, el poder explicativo retrospectivo de los modelos estructuralistas los convierte en los instrumentos predilectos de aquellos que están a cargo de gestionar la seguridad. De hecho, la idea de una banalidad de los accidentes no siempre ha encontrado aceptación fuera de los círculos académicos. Para empezar, asusta. Hace que el potencial para la falla sea rutinario, o inexorablemente inevitable (Vaughan, 1996). Esto puede hacer a los modelos de accidente prácticamente inutilizables y administrativamente desmoralizantes. Si el potencial para la falla está en cualquier parte, en cualquier cosa que hacemos, entonces ¿para qué tratar de evitarlo? Si un accidente no tiene causas en el sentido tradicional, entonces ¿para qué tratar de arreglar cualquier cosa? Tales preguntas son, de hecho, nihilistas, fatalistas. No es sorprendente, entonces, que la resistencia contra la visión de mundo que acecha tras sus respuestas tome muchas formas. Las preocupaciones pragmáticas están orientadas hacia el control, hacia atacar las partes rotas, los chicos malos, los violadores de reglas, los mecánicos incompetentes. ¿Por qué este técnico no realizó la última lubricación del perno del avión del accidente como debió haberlo hecho? Las preocupaciones pragmáticas son sobre encontrar las falencias, identificar las áreas débiles y los puntos problemáticos, y repararlos antes de que causen problemas reales. Pero esos asuntos pragmáticos no encuentran un oído comprensivo ni un léxico constructivo en las reflexiones sobre la deriva hacia la falla, pues la deriva hacia la falla es difícil de señalar, ciertamente desde el interior.
HACIA EL PENSAMIENTO DE LOS SISTEMAS.
Si queremos comprender las fallas más allá de 10⁻⁷, tenemos que dejar de buscar fallas. Ya no hay fallas que entren en la creación de estas fallas: es trabajo normal. De tal forma, la banalidad de los accidentes hace que su estudio sea filosóficamente filisteo. Cambia el objeto de examen lejos de los lados oscuros del gobierno corporativo no ético y de la humanidad, y hacia las decisiones ordinarias de personas normales, ordinarias, bajo la influencia de presiones normales, ordinarias. El estudio de los accidentes se encuentra dramático o fascinante solo a causa del resultado potencial, no a causa de los procesos que lo incuban (los que en sí mismos pueden ser fascinantes, desde luego). Habiendo estudiado en extenso el desastre del Transbordador Espacial Challenger, Diane Vaughan (1996) se vio forzada a concluir que este tipo de accidente no es causado por una serie de fallas de componentes, incluso si el resultado se manifiesta en fallas de componentes. En efecto, junto con otros sociólogos, ella apuntó a un origen difuso de la equivocación, a errores y quiebres como subproductos normales de los procesos de trabajo de una organización:
La equivocación, el accidente y el desastre están organizados socialmente y son producidos sistemáticamente por estructuras sociales. No hay acciones extraordinarias de individuos que expliquen lo ocurrido: no hay equivocaciones gerenciales intencionales, no hay violaciones a las reglas, no hay conspiración. Estas son equivocaciones imbuidas en la banalidad de la vida organizacional y facilitadas por los ambientes de escasez y competencia, la tecnología incierta, el incrementalismo, los patrones de información, la rutinización y las estructuras organizacionales. (p. xiv)
Si queremos comprender, y llegar a ser capaces de prevenir, la falla más allá de 10⁻⁷, esto es lo que debemos mirar. Olvidemos las malas acciones. Olvidemos las violaciones a las reglas. Olvidemos los errores. La seguridad, y su carencia, es una propiedad emergente.
Lo que debemos estudiar, en cambio, son los patrones de información, las incertidumbres en la operación de tecnología compleja, los sistemas sociotécnicos circundantes (siempre en evolución e imperfectos) necesarios para hacer que la operación ocurra, la influencia de la escasez y la competencia en esos sistemas, y cómo ellos ponen en movimiento un incrementalismo (en sí mismo una expresión del aprendizaje o la adaptación organizacionales bajo esas presiones). Para comprender la seguridad, una organización necesita capturar las dinámicas en la banalidad de su vida organizacional y comenzar a ver cómo el colectivo emergente se mueve hacia los límites del desempeño seguro.
Sistemas como Relaciones Dinámicas.
Capturar y describir los procesos mediante los que las organizaciones derivan hacia la falla requiere pensamiento de sistemas. El pensamiento de sistemas es acerca de relaciones e integración. Ve a un sistema sociotécnico no como una estructura consistente en departamentos constituyentes, bordes suaves y bordes agudos, deficiencias y fallas, sino como una red compleja de relaciones y transacciones dinámicas, en evolución. En vez de bloques constructivos, el enfoque de sistemas enfatiza los principios de organización. El entendimiento del todo es bastante distinto del entendimiento de un ensamble de componentes separados. En vez de lazos mecánicos entre componentes (con causa y efecto), ve transacciones: interacciones simultáneas y mutuamente interdependientes. Tales propiedades emergentes son destruidas cuando un sistema es disecado y estudiado como un lote de componentes aislados (un administrador, un departamento, un regulador, un fabricante, un operador). Las propiedades emergentes no existen en los niveles más bajos; ni siquiera pueden ser descritas significativamente con los lenguajes apropiados para esos niveles inferiores. Tomemos los procesos extensos y múltiples mediante los que se produjo la guía de mantenimiento para el DC-9, y luego para la aeronave de la serie MD-80. Los componentes separados (tales como regulador, fabricante, operador) son difíciles de distinguir, y el comportamiento interesante, la clase de comportamiento que contribuye a la deriva hacia la falla, surge solo como resultado de relaciones y transacciones complejas. A primera vista, la creación de la guía de mantenimiento parece ser un problema resuelto. Usted construye un producto, consigue que el regulador lo certifique como seguro de utilizar, y entonces le dice al usuario cómo mantenerlo para que permanezca seguro. Pero incluso el segundo paso (obtener que se certifique como seguro) no está ni cerca de ser un problema resuelto, y está profundamente entrelazado con el tercero. Más sobre ello luego: primero, la guía de mantenimiento. El Alaska 261 revela una gran brecha entre la producción de un sistema y su operación. Indicios de la brecha aparecieron en las observaciones del desgaste del hilo del perno, que fue superior a lo esperado por el fabricante. No mucho después de la certificación del DC-9, la gente comenzó a trabajar para intentar puentear la brecha. Reuniendo gente procedente de toda la industria, se instaló un "Maintenance Guidance Steering Group (MSG)", con el fin de desarrollar documentación guía para el mantenimiento de aviones de transporte grandes (NTSB, 2002), particularmente el Boeing 747.
Utilizando esta experiencia, otro MSG desarrolló un nuevo documento guía en 1970, llamado MSG-2 (NTSB, 2002), que tenía la intención de presentar un medio para desarrollar un programa de mantenimiento aceptable para el regulador, el operador y el fabricante. Las muchas discusiones, negociaciones y colaboraciones interorganizacionales en el camino hacia un "programa de mantenimiento aceptable" mostraron que el cómo mantener una pieza de tecnología compleja, una vez certificada, no era un problema resuelto por completo. De hecho, era más bien una propiedad emergente: la tecnología probó ser menos certera de lo que se había apreciado en el tablero de dibujo (por ejemplo, las tasas de desgaste del hilo del perno del DC-9 eran más altas que lo previsto), y no fue sino hasta que golpeó el piso de la práctica que las deficiencias se volvieron aparentes, si uno sabía dónde mirar. En 1980, mediante los esfuerzos combinados del regulador, de grupos de comercio e industria y de fabricantes tanto de aeronaves como de motores, tanto en Estados Unidos como en Europa, se produjo un tercer documento guía, llamado MSG-3 (NTSB, 2002). Este documento tenía que clarificar confusiones previas, por ejemplo, entre mantenimiento "hard-time", mantenimiento "on-condition", mantenimiento "condition-monitoring" y mantenimiento "overhaul". Las revisiones al MSG-3 fueron hechas en 1988 y 1993. Los documentos guía MSG y sus revisiones fueron aceptados por los reguladores y utilizados por las llamadas "Maintenance Review Boards (MRB)", que se reunían para desarrollar guías para modelos de aeronaves específicos. Una Maintenance Review Board, o MRB, no escribe la guía ella misma, sin embargo; esto es hecho por comités de dirección de la industria, a menudo encabezados por un regulador. Estos comités, a su vez, dirigen varios grupos de trabajo.
Mediante todo esto se obtuvo la producción de documentos llamados de planificación de mantenimiento en la aeronave (OAMP), así como también tarjetas de trabajo genéricas que delineaban tareas de mantenimiento específicas. Tanto el intervalo de lubricación como el end-play check para los pernos de trim del MD-80 fueron los productos, en constante cambio, de estas redes evolutivas de relaciones entre fabricantes, reguladores, grupos de comercio y operadores, quienes operaban a partir de una experiencia operacional continuamente renovada y de una base de conocimiento perpetuamente incompleta sobre una tecnología aún incierta (recordemos: los resultados de los end-play checks no eran registrados ni seguidos). Entonces, ¿cuáles son las reglas? ¿Cuáles deben ser los estándares? La introducción de una nueva pieza de tecnología es seguida por negociación, por descubrimiento, por la creación de nuevas relaciones y racionalidades. "Los sistemas técnicos se convierten en modelos de sí mismos", dijo Weingart (1991): "la observación de su funcionamiento, y especialmente de su malfuncionamiento, a escala real, es requerida como base para el desarrollo técnico posterior" (p. 8). Las reglas y los estándares no existen como señaladores originarios e inequívocos contra una marea de datos operacionales (y si existen, rápidamente prueban ser inútiles o anticuados). Más bien, las reglas y los estándares son los productos constantemente actualizados de procesos de conciliación, de dar y recibir, de detección y racionalización de nuevos datos. Como dijo Brian Wynne (1988):
Bajo una imagen pública de comportamiento de seguimiento de reglas, y la creencia asociada de que los accidentes se deben a la desviación de esas reglas claras, los expertos operan con niveles de ambigüedad mucho mayores, necesitando hacer juicios expertos en situaciones estructuradas de manera menos que clara. El punto clave es que sus juicios normalmente no son del tipo: ¿cómo diseñamos, operamos y mantenemos el sistema de acuerdo a "las" reglas? Las prácticas no siguen a las reglas; más bien, las reglas siguen a las prácticas en evolución. (p. 153)
El establecer los diversos equipos, grupos de trabajo y comités fue una forma de puentear la brecha entre la construcción y el mantenimiento de un sistema, entre producirlo y operarlo. Pero adaptación puede significar deriva. Y deriva puede significar quiebre.
MODELANDO SISTEMAS SOCIOTÉCNICOS VIVOS.
¿Qué clase de modelo de seguridad podría capturar tal adaptación y predecir un colapso eventual? Los modelos estructuralistas son limitados. Desde luego, podríamos afirmar que la extensión del intervalo de lubricación y el poco fiable end-play check fueron deficiencias estructurales. ¿Eran agujeros en las capas de defensa? Absolutamente. Pero tales metáforas no nos ayudan a buscar cómo llegó ahí el agujero, o por qué. Hay algo orgánico en los MSG, algo ecológico, que se pierde cuando los modelamos como una barrera de defensa con un agujero en ella; cuando los vemos como una mera deficiencia, o una falla latente. Cuando vemos los sistemas como internamente plásticos, flexibles, orgánicos, su funcionamiento es controlado por relaciones dinámicas y adaptación ecológica, más que por estructuras mecánicas rígidas. Además, exhiben autoorganización (de año en año, la composición de los MSG fue diferente) en respuesta a cambios medioambientales, y autotrascendencia: la habilidad para elevarse más allá de los límites conocidos y aprender, desarrollarse e incluso mejorar.
Lo que se necesita no es una descripción estructural más del resultado final de la deficiencia organizacional. En vez de ello, lo necesario es una descripción más funcional de procesos vivientes que coevolucionan con respecto a un conjunto de condiciones medioambientales, y que mantienen una relación dinámica y recíproca con esas condiciones (ver Heft, 2001). Tales descripciones necesitan capturar lo que sucede dentro de una organización, con la agrupación de conocimiento y la creación de racionalidad dentro de los grupos de trabajo, una vez que la tecnología se encuentra en servicio. Una descripción funcional podría cubrir la organización orgánica de los grupos y comités de dirección de mantenimiento, ya que su estructura, enfoque, definición de problemas y entendimiento coevolucionaron con las anomalías emergentes y el conocimiento creciente sobre una tecnología no acertada. Un modelo sensible a la creación de deficiencias, no solo a su presencia eventual, le da vida a un sistema sociotécnico. Debe ser un modelo de procesos, no solo uno de estructura. Extendiendo una genealogía de investigación en cibernética e ingeniería de sistemas, Nancy Leveson (2002) propuso que los modelos de control pueden cumplir parte de esta tarea. Los modelos de control usan las ideas de jerarquías y restricciones para representar las interacciones emergentes de un sistema complejo. En su conceptualización, un sistema sociotécnico consta de diferentes niveles, donde cada nivel superordinado impone restricciones sobre (o controla lo que está sucediendo en) los niveles subordinados. Los modelos de control son una vía para comenzar a esquematizar las relaciones dinámicas entre los diferentes niveles dentro de un sistema, un ingrediente crítico para moverse hacia el verdadero pensamiento de sistemas (donde las relaciones y transacciones dinámicas son dominantes, no la estructura y los componentes). El comportamiento emergente está asociado con los límites o restricciones sobre los grados de libertad de un nivel en particular.
La división en niveles jerárquicos es un artefacto analítico necesario para ver cómo el comportamiento del sistema puede emerger desde esas interacciones y relaciones. En un modelo de control, los niveles resultantes son, desde luego, un producto del analista que esquematizó el modelo sobre el sistema sociotécnico. Más que reflejos de alguna realidad exterior, los patrones son construcciones de una mente humana que busca respuestas a preguntas particulares. Por ejemplo, un MSG en particular probablemente no se vería a sí mismo como superordinado a algún nivel, imponiéndole restricciones, o subordinado a algún otro y, por ende, sujeto a sus restricciones. De hecho, una representación jerárquica unidimensional (con solo arriba y abajo a lo largo de una dirección) probablemente sobresimplifica la red dinámica de relaciones que rodea a (y determina el funcionamiento de) cualquier grupo combinado y en evolución como un MSG. Pero todos los modelos son simplificaciones, y la analogía de los niveles puede ser de ayuda para un analista que tiene preguntas particulares en mente (por ejemplo, ¿por qué estas personas, en este nivel, o en este grupo, tomaron esas decisiones, y por qué vieron eso como la única forma racional de proceder?). El control entre los niveles de un sistema sociotécnico difícilmente es alguna vez perfecto. Para controlar con efectividad, cualquier controlador necesita un buen modelo de aquello que se supone debe controlar, y requiere retroalimentación sobre la efectividad de su control. Pero tales modelos internos de los controladores fácilmente se vuelven inconsistentes con el sistema a ser controlado, y dejan de ser compatibles con él (Leveson, 2002). Esto es especialmente cierto con la tecnología emergente, no acertada (incluyendo los pernos de trim), y con los requerimientos de mantenimiento que la rodean. La retroalimentación sobre la efectividad del control está incompleta, y también puede ser poco confiable. Una falta de incidentes relacionados con pernos puede entregar la ilusión de que el control de mantenimiento es efectivo y de que los intervalos pueden ser extendidos, mientras que la rareza del riesgo en realidad depende de factores bastante fuera del alcance del controlador. En este sentido, la imposición de restricciones sobre los grados de libertad es mutua entre niveles, y no corre solo hacia abajo: si los niveles subordinados generan retroalimentación imperfecta sobre su funcionamiento, entonces los niveles de orden más alto no tienen los recursos adecuados (grados de movimiento) para actuar como podría ser necesario. Por ende, el nivel subordinado impone restricciones al nivel superior al no decirle (o no poder decirle) lo que realmente está sucediendo. Tal dinámica se ha notado en diversos casos de deriva hacia la falla, incluyendo el desastre del Transbordador Espacial Challenger (ver Feynman, 1988).
Deriva hacia la falla como erosión de restricciones y pérdida eventual de control.
Los lazos de control anidados pueden dar vida a un modelo de un sistema sociotécnico con más facilidad que una línea de capas de defensa. Para modelar la deriva, esta debe tener vida. La teoría del control ve la deriva hacia la falla como una erosión gradual de la calidad de las restricciones de seguridad, o de su cumplimiento, en el comportamiento de los niveles subordinados. La deriva resulta ya sea de restricciones perdidas o de restricciones inadecuadas sobre lo que sucede en otros niveles.
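Para fijar ideas, el siguiente bosquejo mínimo en Python representa el vocabulario recién descrito: niveles jerárquicos que imponen restricciones hacia abajo y dependen de la retroalimentación (a menudo incompleta) que reciben desde abajo. Es solo un esqueleto ilustrativo, con nombres y valores supuestos; no es el modelo de Leveson ni una reconstrucción del caso.

```python
# Esqueleto ilustrativo de un modelo de control jerárquico: cada nivel impone
# restricciones al nivel subordinado y depende de la retroalimentación que este devuelve.
from dataclasses import dataclass, field

@dataclass
class Nivel:
    nombre: str
    restricciones: dict = field(default_factory=dict)      # lo que impone hacia abajo
    retroalimentacion: dict = field(default_factory=dict)  # lo que reporta hacia arriba

# Jerarquía simplificada inspirada en el caso (los niveles son un artefacto analítico).
operador = Nivel("operador", restricciones={"intervalo_end_play_h": 9550})
mantenimiento = Nivel("mantenimiento", retroalimentacion={"end_play_registrado": False})

def control_efectivo(superior: Nivel, subordinado: Nivel) -> bool:
    """Hay control efectivo solo si el nivel superior impone restricciones
    y recibe retroalimentación útil del nivel subordinado."""
    return bool(superior.restricciones) and any(subordinado.retroalimentacion.values())

print(control_efectivo(operador, mantenimiento))  # False: sin registro del end-play,
# el nivel superior no tiene base para ajustar (ni para defender) sus restricciones.
```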
Modelar un accidente como una secuencia de eventos, en contraste, es en realidad modelar solo el producto final de tal erosión y pérdida de control. Si la seguridad es vista como un problema de control, entonces los eventos (tal como los agujeros en las capas de defensa) son los resultados de los problemas de control, no las causas que dirigen un sistema hacia el desastre. Una secuencia de eventos, en otras palabras, es cuando mucho el punto de partida para modelar un accidente, no la conclusión analítica. Los procesos que generan estas debilidades son los que necesitan un modelo. Un tipo de erosión del control ocurre a causa de que las restricciones de la ingeniería original (por ejemplo, los intervalos de 300 horas) se aflojan como respuesta a la acumulación de experiencia operacional. Una variedad del "ajuste fino" de Starbuck y Milliken (1988), en otras palabras. Esto no significa que esta clase de adaptación ecológica en el control del sistema sea completamente racional, o que tenga sentido siquiera desde una perspectiva global sobre la evolución general y la sobrevivencia eventual del sistema. No es así. Las adaptaciones suceden, los ajustes se hacen y las restricciones se aflojan como respuesta a preocupaciones locales con horizontes de tiempo limitados. Todo ello está basado en conocimiento incompleto e incierto. A menudo, ni siquiera está claro para los integrantes, en primer lugar, que las restricciones se han vuelto menos apretadas como resultado de sus decisiones, o que importe si es así. E incluso cuando está claro, las consecuencias pueden ser difíciles de anticipar, y juzgadas como una pequeña pérdida potencial en relación con las ganancias inmediatas. Como señaló Leveson (2002), los expertos hacen su mayor esfuerzo para satisfacer las condiciones locales y, en el ocupado flujo diario y la complejidad de las actividades, podrían no estar alertas a los efectos colaterales potencialmente peligrosos de esas decisiones. Es solo con el beneficio de la retrospección, o de una supervisión omnisciente (que es utópica), que esos efectos colaterales pueden ser ligados al riesgo real. Jensen (1996) lo describe así:
No deberíamos esperar que los expertos intervengan, ni tampoco deberíamos creer que ellos siempre saben lo que están haciendo. A menudo no tienen idea, habiendo quedado ciegos a la situación en que se encuentran envueltos. Actualmente, no es inusual que los ingenieros y científicos que trabajan dentro de los sistemas sean tan especializados que hace tiempo se dieron por vencidos en tratar de entender el sistema como un todo, con todos sus aspectos técnicos, políticos, financieros y sociales. (p. 368)
El ser integrante del sistema, entonces, puede hacer que el pensamiento de sistemas sea poco menos que imposible. Perrow (1984) planteó este argumento muy persuasivamente, y no solo para los integrantes del sistema. Un incremento en la complejidad del sistema disminuye su transparencia: elementos diversos interactúan en una variedad de formas más grande, que es difícil de prever, detectar o, incluso, comprender. Las influencias desde fuera de la base del conocimiento técnico (esos "aspectos políticos, financieros y sociales" de Jensen, 1996, p. 368) ejercen una presión sutil pero poderosa sobre las decisiones y los compromisos que la gente hace, y restringen lo que es visto como una decisión o un curso de acción racional en ese momento (Vaughan, 1996). Por ende, incluso si los expertos pudieran estar bien educados y motivados, una "advertencia de un evento incomprensible e inimaginable no puede ser vista, porque no puede ser creída" (Perrow, 1984, p. 23). ¿Cómo pueden los expertos y otros encargados de la toma de decisiones en el interior de los sistemas organizacionales tomar conciencia de los indicadores disponibles del desempeño de seguridad del sistema? El asegurarse de que los expertos y otros encargados de la toma de decisiones estén bien informados es, en sí misma, una persecución vacía. Lo que realmente significa estar bien informado, en un nivel organizacional complejo, es infinitamente negociable, y criterios claros para lo que constituye información suficiente son imposibles de obtener. Como resultado, el efecto de las creencias y premisas en la toma de decisiones y la creación de racionalidad puede ser considerable. Weick (1995, p. 87) señaló que "ver lo que uno cree y no ver aquello en lo que uno no cree es central para la creación de sentido. Las advertencias de lo increíble pasan sin ser oídas". Aquello que no puede ser creído no será visto. Esto confirma el pesimismo previo sobre el valor de los sistemas de reporte más allá de 10⁻⁷. Incluso si eventos y advertencias relevantes terminaran en el sistema de reporte (lo que es dudoso, porque no se ven como advertencias ni siquiera por aquellos que podrían reportarlas), es aún más generoso presumir que el análisis experto posterior de tales bases de datos de incidentes podría tener éxito en traer las advertencias a la vista. La diferencia, entonces, entre la perspicacia experta en el momento y la visión retrospectiva (después de un accidente) es tremenda. Con la visión retrospectiva, los trabajos internos del sistema pueden volverse lúcidos: las interacciones y los efectos colaterales quedan a la vista. Y con la visión retrospectiva, la gente sabe qué buscar, dónde escarbar por la descomposición, las conexiones perdidas. Detonado por el accidente del Alaska 261, el regulador lanzó una inspección especial del sistema de control de mantenimiento en Alaska Airlines.
Se encontró que los procedimientos establecidos de la compañía no eran seguidos, que la autoridad y la responsabilidad no estaban bien definidas, que el control de los sistemas de aplazamiento de mantenimiento se había perdido, y que los programas y departamentos de control de calidad y de aseguramiento de la calidad eran inefectivos. Además, se encontró papeleo incompleto del C check, discrepancias en las fechas de expiración de vida útil de las partes, una falta de aprobación de ingeniería de las modificaciones en las tarjetas de trabajo de mantenimiento y calibraciones inadecuadas de herramientas. Los manuales de mantenimiento no especificaban procedimientos u objetivos para el entrenamiento de los mecánicos en el trabajo, y las posiciones de administración claves (por ejemplo, la de seguridad operacional) no estaban llenadas o no existían. En realidad, las restricciones impuestas desde otros niveles organizacionales eran inexistentes, disfuncionales o estaban erosionadas. Pero ver agujeros y deficiencias en retrospectiva no es una explicación de la generación o de la continuidad de existencia de esas deficiencias. No contribuye a prevenir ni a predecir fallas. En vez de ello, los procesos mediante los cuales tales decisiones se determinan, y mediante los cuales los tomadores de decisiones crean su racionalidad local, son una clave para entender cómo puede producirse la erosión en el interior de un sistema sociotécnico complejo. ¿Por qué estas cosas tenían sentido para los tomadores de decisiones en ese momento? ¿Por qué era todo normal, por qué no era digno de reportar, ni siquiera para el regulador encargado de supervigilar estos procesos? Las preguntas quedan en el aire. Poca evidencia se encuentra disponible, en la (por lo demás inmensa) investigación de la NTSB, sobre tales procesos interorganizacionales, o sobre cómo ellos produjeron una conceptualización particular del riesgo. El reporte, como otros, es testimonio de la tradición mecanicista, estructuralista, de las investigaciones de accidentes hasta la fecha, aplicada incluso a las incursiones investigativas en el territorio social-organizacional.
La creación de racionalidad local.
La pregunta es: ¿cómo hacen los integrantes los numerosos intercambios, pequeños y más grandes, que juntos contribuyen a la erosión, a la deriva? ¿Cómo es que estas decisiones aparentemente inocuas pueden mover incrementalmente a un sistema hacia el borde del desastre? Como se indicó anteriormente, un aspecto crítico de esta dinámica es que las personas en roles de toma de decisiones, en el interior de un sistema sociotécnico, pierden de vista o subestiman los efectos colaterales globales de sus decisiones localmente racionales. Como ejemplo, el MRB del MSG-3 para el MD-80 (si se pierde ahí, no se preocupe: a otras personas debe haberles sucedido también) consideró el cambio en la tarea de lubricación del perno como parte del paquete mayor del C check (NTSB, 2002). La junta de revisión no consultó a los ingenieros de diseño del fabricante, ni tampoco los puso al tanto de la extensión. El documento OAMP inicial del fabricante para la lubricación del DC-9 y el MD-80, que especificaba un intervalo ya extendido de 600 a 900 horas (partiendo desde la recomendación de 1964 de 300 horas), tampoco fue considerado en el MSG-3. Desde una perspectiva local, con la presión de los plazos y las restricciones de tiempo sobre el conocimiento disponible, la decisión de extender el intervalo sin el input experto adecuado debe haber tenido sentido. Las personas consultadas en el momento deben haber sido estimadas suficiente y adecuadamente expertas como para sentirse lo bastante cómodas para continuar. La creación de racionalidad debe haberse visto como satisfactoria. De otra forma, es difícil creer que el MSG-3 pudiera haber procedido como lo hizo. Pero los efectos colaterales eventuales de estas decisiones menores no fueron previstos. Desde una perspectiva mayor, a la brecha entre producción y operación, entre hacer y mantener un producto, una vez más se le permitió ensancharse. Una relación que debió haber sido instrumental para ayudar a puentear la brecha (consultar a los ingenieros de diseño originales que hicieron la aeronave, para informar a aquellos que la mantenían), una relación desde la historia hacia el (entonces) presente, quedó desatendida. Una transacción no fue completada. Si la previsión de efectos colaterales no tuvo sentido para el MRB del MSG-3 para el MD-80 (y esto bien puede haber sido un resultado banal de la pura complejidad y burocracia de su mandato de trabajo), puede tampoco haber tenido sentido para los participantes en el sistema sociotécnico que siguió. Estas decisiones son sensatas cuando se evalúan contra los criterios de juicio locales, dadas las presiones presupuestarias y de tiempo y los incentivos de corto plazo que amoldan el comportamiento. Dado el conocimiento, las metas y el foco atencional de los encargados de la toma de decisiones, y la naturaleza de los datos disponibles para ellos en ese momento, tienen sentido. Es en estos procesos normales, del día a día, donde podemos encontrar las semillas de la falla y del éxito organizacionales. Y es a estos procesos a los que debemos volcarnos para encontrar la palanca que permita realizar mayores progresos en seguridad.
As Rasmussen and Svedung (2000) pointed out: To plan a proactive risk management strategy, we have to understand the mechanisms that generate the actual behavior of decision makers at all levels... a proactive approach to risk management includes the following analyses: a study of the normal activities of the actors who are setting the stage for accidents during their normal work, together with an analysis of the features of the work that shape their decision-making behavior; and a study of the present information environment of these actors and of the information flow structure, analyzed from a control-theoretic point of view (p. 14). Reconstructing or studying the "information environment" in which actual decisions are shaped, in which local rationality is constructed, can help us penetrate organizational sensemaking processes. These processes lie at the root of organizational learning and adaptation, and thereby at the source of drift toward failure. The two space shuttle accidents (Challenger in 1986 and Columbia in 2003) are highly instructive here, mainly because the Columbia Accident Investigation Board (CAIB), as well as later analyses of the Challenger disaster (e.g., Vaughan, 1996), represent significant (and, to date, rather unique) departures from the structuralist probes typical of such accidents. These analyses take seriously the normal organizational processes behind drift, applying and even extending a language that helps us capture something essential about the continuous creation of local rationality by organizational decision makers. One critical aspect of the information environment in which NASA engineers made decisions about safety and risk was "bullets." Richard Feynman, who took part in the original Rogers Presidential Commission investigating the Challenger disaster, had already railed against them and against the way they collapsed engineering judgments into fragmented assertions: "Then we learned about 'bullets': little black circles in front of phrases that were supposed to summarize things. There was one after another of these little goddamn bullets in our briefing books and on the slides" (Feynman, 1988, p. 127).
Disturbingly, "bullets" turned up again in the 2003 Columbia accident investigation. With the proliferation of commercial software for making "bulletized" presentations since the Challenger, the bullets had proliferated too. This too may have been the result of locally rational (if hugely unreflective) trade-offs to increase efficiency: bulletized presentations compress information and conclusions and are quicker to deal with than technical papers. But bullets filled the information environment of NASA engineers and managers at the cost of other data and representations. They came to dominate technical discourse to the point of determining decision making on the basis of what could be considered sufficient information for the matter at hand. Bulletized presentations were essential in the creation of local rationality, and in pushing that rationality ever further away from the real risk brewing just beneath. Edward Tufte (CAIB, 2003) analyzed one particular Columbia slide, from a presentation given to NASA by a contractor in February 2003. The slide was meant to help NASA consider the potential damage to the heat tiles caused by debris that had fallen off the main fuel tank (damaged heat tiles triggered the destruction of Columbia on its return into the earth's atmosphere; see Fig. 2.5). The slide was used by the Debris Assessment Team in its presentation to the Mission Evaluation Room. It was titled "Review of Test Data Indicates Conservatism for Tile Penetration," suggesting, in other words, that the damage done to the wing was not so bad (CAIB, 2003, p. 191). But in fact the title made no reference to tile damage at all. Rather, it referred to the choice of test models used to predict the damage. A more appropriate title, according to Tufte, would have been "Review of Test Data Indicates Irrelevance of Two Models." The reason was that the piece of debris that struck the Columbia was estimated to be 640 times larger than the data used to calibrate the model on which the engineers based their damage calculations (later analysis showed that the debris object was actually 400 times larger). So the calibration models were really of little use: They grossly underestimated the actual impact of the debris. The slide went on to say that "significant energy" would be required for debris from the main tank to penetrate the (supposedly harder) tile coating of the shuttle wing, that the test results showed this was possible with sufficient mass and velocity, and that, once the tiles had been penetrated, significant damage could be caused. As Tufte observed, the vaguely quantitative word "significant" or "significantly" was used five times on a single slide, but its meaning ranged all the way from what could be detected using those irrelevant calibration tests, through a 640-fold difference, to damage so great that everyone on board could die.
FIG. 2.5. Location of the solid rocket boosters (Challenger) and the external fuel tank (Columbia) on a space shuttle. [Figure labels: external fuel tank; solid rocket booster (also solid rocket motor, SRM); orbiter.]
The same word, the same indication on one slide, repeated five times, carrying five profoundly (yes, significantly) different meanings, yet none of them made truly explicit because of the slide's condensed format. Similarly, the damage to the protective heat tiles was obscured behind one small word, "it," in a sentence that read "Test results show that it is possible with sufficient mass and velocity" (CAIB, 2003, p. 191).
The slide watered down important material, and the life-threatening nature of the data behind it got lost behind bullets and abbreviated statements. A decade and a half earlier, Feynman (1988) had uncovered a similarly ambiguous slide about the Challenger. In his case, the bullets had declared that the eroding seal in the field joints was "most critical" for flight safety, yet that "analysis of existing data indicates that it is safe to continue flying the existing design" (p. 137). The accident proved otherwise. The solid rocket boosters (SRBs, or SRMs) that help the space shuttle out of the earth's atmosphere are segmented, which makes them easier to transport over land (among other advantages). One problem discovered early in shuttle operations, however, was that the solid rockets did not always seal properly at these segments, and that hot gases could escape past the rubber O-rings in the seal, so-called blow-by. This eventually led to the explosion of the Challenger in 1986. The pre-accident slide picked apart by Feynman had declared that while the lack of a secondary seal in the (solid rocket motor) joint was "most critical," it was still "safe to continue flying." At the same time, efforts needed to be "accelerated" to eliminate SRM seal erosion (1988, p. 137). With Columbia, as with Challenger, slides were not used only to support the technical and operational decisions that led up to the accidents. Even during both post-accident investigations, bulletized slide presentations were offered as substitutes for technical analyses and data, leading the CAIB (2003), much like Feynman years before, to complain that "the Board views the endemic use of PowerPoint briefing slides instead of technical papers as an illustration of the problematic methods of technical communication at NASA" (p. 191). The overreliance on bullets and slides illustrates the problem of information environments, and how studying them can help us understand something about the creation of local rationality in organizational decision making: the information environment takes shape as an "epistemic niche" (Hoven, 2001). What these decision makers can know is generated by other people, and it gets distorted in transmission through a reductive, abbreviating medium (this epistemic niche also has implications for how we can think about blame, or the culpability of decisions and those who make them; see chap. 10). How constrained and incomplete the niche is in which decision makers find themselves can come across as disturbing to retrospective observers, including people both inside and outside the organization. It was after the Columbia accident that the Mission Management Team "admitted that the analysis used to continue flying was, in a word, 'lousy.' This admission (that the rationale to fly was rubber-stamped) is, to say the least, unsettling" (CAIB, 2003, p. 190). "Unsettling" it may be, and probably is, in hindsight. But from the inside, people in organizations do not spend a professional lifetime making "unsettling" decisions. Rather, they mostly do normal work.
Again, how can a manager see a "lousy" process for evaluating flight safety as normal, and not as something worth reporting or repairing? How can such a process possibly be normal? The CAIB (2003) itself found clues to the answers in the pressures of scarcity and competition: The Flight Readiness process is supposed to be shielded from outside influence, and is viewed as both rigorous and systematic. Yet the Shuttle Program is inevitably influenced by external factors, including, in the case of STS-107, schedule demands. Collectively, such factors shape how the Program establishes mission schedules and sets budget priorities, which affect the overall view of safety, workforce levels, facility maintenance, and contractor workloads. Finally, external expectations and pressures impact even data collection, trend analysis, information development, and the reporting and disposition of anomalies. These realities contradict NASA's optimistic belief that pre-flight reviews provide true safeguards against unacceptable risks (2003, p. 191). Perhaps there is no such thing as "rigorous and systematic" decision making based on technical expertise alone. Expectations and pressures, budget priorities and mission schedules, contractor workloads and workforce levels all impact technical decision making. All these factors determine and constrain what will be seen as possible and rational courses of action at the time. They dress the epistemic niche in which decision makers find themselves in far more varied shades and patterns than technical data alone. But suppose some decision makers could see through all this dressing inside their epistemic niches, and alert others to do the same. There are stories of such whistle-blowers. Yet even if the incompleteness of an epistemic niche (the information environment) could have been seen and agreed on from the inside at the time, this still does not guarantee change or improvement. The niche, and the way people configure themselves within it, respond to other concerns and pressures active in the organization (the efficiency and speed of briefing and decision-making processes, for example).
The impact of this imperfect information, even if it is noticed, is underestimated, because seeing its side effects, or its connections to real risk, quickly slips beyond the computational capabilities of organizational decision makers and the mechanisms of the moment. Studying information environments, how they are created, sustained, and rationalized, and how, in turn, they help support and rationalize complex, risky decisions, is one route to understanding organizational sensemaking. More will be said about these sensemaking processes elsewhere in this book. It is one way of making what sociologists call the macro-micro connection. How do global pressures of production and scarcity find their way into local decision niches, and how do they then exert their often invisible but powerful influence on what people believe and prefer, on what people there and then see as rational or unremarkable? Although NASA's flight safety assessments were meant to be shielded from such external pressures, those pressures nonetheless filtered down even into data collection, trend analysis, and anomaly reporting. The information environments that decision makers consequently created were continually and insidiously tainted by the pressures of production and scarcity (and in what organization are they not?), prerationally influencing how people saw the world. Yet even this "lousy" process was considered normal; normal or inevitable enough, in any case, not to warrant the expenditure of energy and political capital on trying to change it. The result can be drift into failure.
ENGINEERING RESILIENCE INTO ORGANIZATIONS
In all open systems, drift is continually being corrected within their safety envelopes. The pressures of scarcity and competition, the opacity and size of complex systems, the patterns of information surrounding decision makers, and the incrementalist nature of their decisions over time can cause systems to drift into failure. Drift is generated by normal processes of reconciling differential pressures in an organization (efficiency, capacity utilization, safety) against a background of uncertain technology and imperfect knowledge. Drift is about incrementalism contributing to extraordinary events, about the transformation of pressures of scarcity and competition into organizational mandates, and about the normalization of signals of danger, so that organizational goals and supposedly normal assessments and decisions become aligned. In safe systems, the very processes that normally guarantee safety and generate organizational success can also be responsible for the organization's demise. The same complex, interconnected sociotechnical life that surrounds the successful operation of technology is largely responsible for its potential failure. Because these processes are normal, because they are part and parcel of normal, functional organizational life, they are difficult to identify and disentangle. The role of these invisible, unacknowledged forces can be frightening.
Harmful consequences can occur in organizations constructed to prevent them. Harmful consequences can occur even when everybody follows the rules (Vaughan, 1996). The direction in which drift pushes the operation of technology can be hard to detect, also, or perhaps especially, for those on the inside. It can be even harder to stop. Given the diversity of forces (political, financial, and economic pressures, technological uncertainty, incomplete knowledge, fragmented problem-solving processes) both inside and outside them, the large, complex sociotechnical systems that operate some of our most hazardous technologies today seem able to generate a dark energy of their own and drift at will, relatively immune to outside inspection or inside control. Remember that, in normal flight, the jackscrew assembly of an MD-80 is supposed to carry a load of around 5,000 pounds. But in reality this load was borne by a weak, porous, and continually shifting system of deficient instruction and impractical procedures, delegated to an operator level that routinely, but always unsuccessfully, tried to close the gap between production and operation, between making and maintaining. Five thousand pounds of load on a loose, variable collection of procedures and practices slowly and incrementally ate its way through the threads of the nut. It was the sociotechnical system designed to support and protect the uncertain technology, not the mechanical part, that had to carry that load. It gave way. The accident report acknowledged that eliminating the risk of catastrophic single failures may not always be possible through design (since design is a reconciliation of irreconcilable constraints). It concluded that "when practicable design alternatives do not exist, a comprehensive systemic maintenance and inspection process is necessary" (NTSB, 2002, p. 180).
The conclusion, in other words, was to have a nonredundant system (the single jackscrew and torque tube) made redundant through an organizational-regulatory conglomerate of maintenance checking and airworthiness verification. The report was forced to conclude that the last resort had to be a countermeasure on which it had already spent 250 pages proving that it does not work. Drift into failure poses a substantial risk to safe systems. Recognizing and redirecting drift is a competence that lies ahead of any organization at the 10⁻⁷ frontier. No transportation system in use today has broken through this barrier, and success in breaking the asymptote in safety progress is not likely to come from extensions of the mechanistic, structuralist approaches currently in vogue. Safety is an emergent property, and its erosion is not about the breakage or lack of quality of individual components. This makes the combination of quality management and safety management counterproductive. Many organizations have safety management and quality management wrapped into one function or department. Yet quality management is about individual components, about seeing how they meet particular specifications, about removing or repairing defective components. Safety management has little to do with individual components. A completely different level of understanding, a completely different vocabulary, is needed to grasp safety as opposed to quality. Drift into failure is not so much about component breakage or malfunction as it is about an organization not adapting effectively to cope with the complexity of its own structure and environment. Organizational resilience is not a property, it is a capability: a capability to recognize the boundaries of safe operations, a capability to steer back from them in a controlled manner, a capability to recover from a loss of control if it does occur. This means that human factors and system safety must find new ways of engineering resilience into organizations, of equipping organizations with a capability to recognize, and recover from, a loss of control. How can an organization monitor its own adaptations (and how these encroach on the rationality of its decision makers) to pressures of scarcity and competition, when that may be exactly the time such investments are needed most? Preventing drift into failure requires a different kind of organizational learning and monitoring. It means attending to higher-order variables, adding a new level of intelligence and analysis to the incident reporting and error counting done today. More will be said about this in the chapters that follow.
Chapter 3: Why Are Doctors More Dangerous Than Gun Owners?
There are about 700,000 physicians in the United States. The U.S. Institute of Medicine estimates that between 44,000 and 98,000 people die each year as a result of medical errors (Kohn, Corrigan & Donaldson, 1999). This makes for an annual rate of accidental deaths per doctor of between 0.063 and 0.14. In other words, up to one in seven doctors will kill a patient each year by mistake. Take gun owners in contrast. There are 80,000,000 gun owners in the United States.
Yet their errors account for "only" 1,500 accidental gun deaths per year. This means that the rate of accidental death caused by gun-owner error is 0.000019 per gun owner per year. Only 1 in 53,000 gun owners will kill somebody by mistake. Doctors, then, are about 7,500 times more likely to kill somebody by mistake. And whereas not everybody has a gun, almost everybody has a doctor (or many doctors), and is therefore seriously exposed to the problem of human error.
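The comparison above is plain arithmetic. A minimal sketch (using only the figures cited in the text; the variable names are mine) reproduces the per-practitioner rates and the roughly 7,500-fold ratio:

```python
# Back-of-envelope reproduction of the chapter's comparison, using the figures
# cited in the text: IOM estimate of 44,000-98,000 deaths/year, ~700,000
# physicians, ~1,500 accidental gun deaths/year, ~80,000,000 gun owners.

physicians = 700_000
iom_deaths_low, iom_deaths_high = 44_000, 98_000

gun_owners = 80_000_000
accidental_gun_deaths = 1_500

# Annual accidental-death rate attributable to each doctor.
doctor_rate_low = iom_deaths_low / physicians    # ~0.063
doctor_rate_high = iom_deaths_high / physicians  # ~0.14 (roughly 1 in 7)

# Annual accidental-death rate attributable to each gun owner.
owner_rate = accidental_gun_deaths / gun_owners  # ~0.000019 (about 1 in 53,000)

print(f"per-doctor rate: {doctor_rate_low:.3f} to {doctor_rate_high:.3f}")
print(f"per-owner rate:  {owner_rate:.6f} (1 in {round(1 / owner_rate):,})")
print(f"ratio (high estimate): {doctor_rate_high / owner_rate:,.0f}x")  # ~7,500x, as cited
```

Running it prints the per-doctor range of 0.063 to 0.14 and the per-owner rate of about 0.000019, matching the figures quoted above.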
As organizations and other pressure groups (e.g., industry and trade groups, regulators) try to gauge the "safety health" of their operations, counting and tabulating errors seems a meaningful measure. Not only does it provide an immediate numerical estimate of the probability of accidental death, injury, or any other undesirable event, it also allows comparison across systems and their components (this hospital vs. that hospital, this airline vs. that one, this fleet of aircraft or pilots vs. that one, these routes vs. those). Keeping track of adverse events is thought to give relatively simple, easy, and accurate access to the inner safety workings of a system. Moreover, adverse events can be seen as the starting point (or the reason) for probing deeper, in search of environmental threats or unfavorable conditions that can be changed to prevent recurrence. And of course there is the sheer scientific curiosity of trying to understand different kinds of adverse events, different kinds of errors. Categorizing, after all, has been fundamental to science since early modern times. Over the past decades, human factors in transportation has been intent on quantifying safety problems and finding potential sources of vulnerability and failure. It has spawned a number of error classification systems. Some classify decision errors together with the conditions that contributed to producing them; some have a specific goal, for example, categorizing problems of information transfer (e.g., instructions, errors around shift-handover briefings, coordination breakdowns); others try to divide the causes of error into cognitive, social, and situational (physical, environmental, ergonomic) factors; still others try to classify the causes of error along the lines of a linear information-processing model or a decision-making model; and some apply the Swiss cheese metaphor (i.e., systems have many layers of defense, but all of them have holes) to the identification of errors and vulnerabilities up the causal chain. Error classification systems are used either after an event (e.g., during incident investigations) or for observations of ongoing human performance.
THE MORE WE MEASURE, THE LESS WE KNOW
In pursuing the categorization and tabulation of errors, human factors makes a number of assumptions and adopts certain philosophical positions. Little of this is spelled out in the descriptions of these methods, yet it has consequences for the usefulness and quality of error counting as a measure of safety health and as a tool for directing resources toward improvement. Here is an example. In one of the methods, the observer is asked to distinguish between "procedure errors" and "proficiency errors." Proficiency errors are related to a lack of skill, experience, or (recent) practice, whereas procedure errors are those that occur while carrying out prescribed or standardized action sequences (e.g., checklists). This looks straightforward. Yet, as Croft (2001) reported, the observer is confronted with the following problem: one kind of error (a pilot entering the wrong altitude into the flight computer) can legitimately end up in either of the error-counting method's two categories (a procedure error or a proficiency error). "For example, entering the wrong altitude into the flight management system (FMS) is considered a procedural error... Not knowing how to use certain automated equipment in an aircraft's flight computer is considered a proficiency error" (p. 77). If a pilot enters the wrong altitude into the FMS, is it a matter of procedure, of proficiency, or both? How should it be categorized?
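The coding ambiguity in Croft's example can be made concrete with a toy sketch. Everything in it is hypothetical (the category definitions and the Observation fields are mine, not those of any actual auditing tool); the only point is that when category definitions are predicates that are not mutually exclusive, a single observation legitimately earns more than one label:

```python
# Hypothetical toy taxonomy, for illustration only: each category is defined by
# a predicate over an observed event. Nothing forces the categories to be
# mutually exclusive, so one observation can satisfy several of them at once.

from dataclasses import dataclass

@dataclass
class Observation:
    description: str
    during_prescribed_sequence: bool     # occurred while executing a prescribed step?
    involves_skill_or_recency_gap: bool  # plausibly reflects lack of practice or knowledge?

TAXONOMY = {
    "procedure error":   lambda o: o.during_prescribed_sequence,
    "proficiency error": lambda o: o.involves_skill_or_recency_gap,
}

def classify(obs: Observation) -> list[str]:
    """Return every category whose definition the observation satisfies."""
    return [name for name, rule in TAXONOMY.items() if rule(obs)]

wrong_altitude = Observation(
    description="pilot enters wrong altitude into the FMS",
    during_prescribed_sequence=True,      # altitude entry is a prescribed step
    involves_skill_or_recency_gap=True,   # could equally reflect unfamiliarity with the FMS
)

print(classify(wrong_altitude))  # ['procedure error', 'proficiency error'] -- no unique label
```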
Thomas Kuhn (1962) urged science to turn to creative philosophy when confronted with problems like these of relating theory to observations (as in the problem of fitting an observation into theoretical categories). It can be an effective way to elucidate and, if necessary, weaken the grip of a tradition on the collective mind, and to suggest the basis for a new one. This is certainly appropriate when epistemological questions arise: questions about how we come to know what we (believe we) know. To understand error classification and some of its associated problems, we must attempt a brief analysis of the contemporary philosophical tradition that governs human factors research and the worldview within which it takes place.
Realism: Errors Exist; You Can Discover Them With a Good Method
The position human factors takes when it uses observational tools to measure "errors" is a realist one: It presumes that there is a real, objective world with verifiable patterns that can be observed, categorized, and predicted. In this sense, errors are a kind of Durkheimian fact. Emile Durkheim, a founding father of sociology, believed that social reality is objectively "out there," available for impartial, neutral empirical scrutiny. Reality exists, and truth is worth striving for. Of course, there are obstacles to getting to the truth, and reality can be hard to pin down. Yet pursuing a close map or correspondence to that reality is a valid, legitimate goal of theory development. It is that goal, of achieving a close mapping to reality, that governs error-counting methods. If there are difficulties in getting that correspondence, then these difficulties are merely methodological in nature. The difficulties call for refinement of the observational instruments or additional training of the observers. These presumptions are modernist, inherited from the enlightened ideas of the Scientific Revolution. In the finest of scientific spirits, method is called on to direct the searchlight across empirical reality, more method is called on to correct ambiguities in the observations, and even more method is called on to break open new portions of hitherto unexplored empirical reality, or to bring into focus those portions that so far were vague and elusive. Other labels that fit such an approach to empirical reality could include positivism, which holds that the only type of knowledge worth bothering with is that which is based directly on experience. Positivism is
associated with the doctrine of Auguste Comte: The highest, purest (and perhaps only true) form of knowledge is a simple description of sensory phenomena. In other words, if an observer sees an error, then there was an error. For example, the pilot failed to arm the spoilers. This error can then be written up and categorized as such. But positivism has acquired a negative connotation, really meaning "bad" when it comes to social science research. Instead, a neutral way of describing the position of error-counting methods is realist, if naively so. Operating from a realist stance, researchers are concerned with validity (a measure of that correspondence they seek) and reliability. If there is a reality that can be captured and described objectively by outside observers, then it is also possible to generate converging evidence with multiple observers, and consequently achieve agreement about the nature of that reality. This means reliability: Reliable contact has been made with empirical reality, generating equal access and returns across observations and observers. Error counting methods rely on this too: It is possible to tabulate errors from different observers and different observations (e.g., different flights or airlines) and build a common database that can be used as some kind of aggregate norm against which new and existing entrants can be measured. But absolute objectivity is impossible to obtain. The world is too messy for that, phenomena that occur in the empirical world too confounded, and methods forever imperfect. It comes as no surprise, then, that error-counting methods have different definitions, and different levels of definitions, for error, because error itself is a messy and confounded phenomenon:
• Error as the cause of failure, for example, the pilot's failure to arm the spoilers led to the runway overrun.
• Error as the failure itself: Classifications rely on this definition when categorizing the kinds of observable errors operators can make (e.g., decision errors, perceptual errors, skill-based errors; Shappell & Wiegmann, 2001) and probing for the causes of this failure in processing or performance. According to Helmreich (2000), "Errors result from physiological and psychological limitations of humans. Causes of error include fatigue, workload, and fear, as well as cognitive overload, poor interpersonal communications, imperfect information processing, and flawed decision making" (p. 781).
• Error as a process, or, more specifically, as a departure from some kind of standard: This standard may consist of operating procedures. Violations, whether exceptional or routine (Shappell & Wiegmann), or intentional or unintentional (Helmreich), are one example of error according to the process definition. Depending on what they use as standard, observers of course come to different conclusions about what is an error.
Not differentiating among these different possible definitions of error is a well-known problem. Is error a cause, or is it a consequence? To the error-counting methods, such causal confounds and messiness are neither really surprising nor really problematic. Truth, after all, can be elusive. What matters is getting the method right. More method may solve problems of method. That is, of course, if these really are problems of method. The modernist would say "yes." "Yes" would be the stock answer from the Scientific Revolution onward.
Methodological wrestling with empirical reality, where empirical reality plays hard to catch and proves pretty good at the game, is just that: methodological. Find a better method, and the problems go away. Empirical reality will swim into view, unadulterated.
Did You Really See the Error Happen?
The postmodernist would argue something different. A single, stable reality that can be approached by the best of methods, and described in terms of correspondence with that reality, does not exist. If we describe reality in a particular way (e.g., this was a "procedure error"), then that does not imply any type of mapping onto an objectively attainable external reality, close or remote, good or bad. The postmodernist does not deal in referentials, does not describe phenomena as though they reflect or represent something stable, objective, something "out there." Rather, capturing and describing a phenomenon is the result of a collective generation and agreement of meaning that, in this case, human factors researchers and their industrial counterparts have reached. The reality of a procedure error, in other words, is socially constructed. It is shaped by and dependent on models and paradigms of knowledge that have evolved through group consensus. This meaning is enforced and handed down through systems of observer training, labeling and communication of the results, and industry acceptance and promotion. As philosophers like Kuhn (1962) have pointed out, these paradigms of language and thought at some point adopt a kind of self-sustaining energy, or "consensus authority" (Angell & Straub, 1999). If human factors auditors count errors for managers, they, as (putatively scientific) measurers, have to presume that errors exist. But in order to prove that errors exist, auditors have to measure them. In other words, measuring errors becomes the proof of their existence, an existence that was preordained by their measurement. In the end, everyone agrees that counting errors is a good step forward on safety because almost everyone seems to agree that it is a good step forward.
The practice is not questioned because few seem to question it. As the postmodernist would argue, the procedural error becomes true (or appears to people as a close correspondence to some objective reality) only because a community of specialists have contributed to the development of the tools that make it appear so, and have agreed on the language that makes it visible. There is nothing inherently true about the error at all. In accepting the utility of error counting, it is likely that industry accepts its theory (and thereby the reality and validity of the observations it generates) on the authority of authors, teachers, and their texts, not because of evidence. In his headline, Croft (2001) announced that researchers have now perfected ways to monitor pilot performance in the cockpit. "Researchers" have "perfected." There is little that an industry can do other than to accept such authority. What alternatives have they, asks Kuhn, or what competence? Postmodernism sees the "reality" of an observed procedure error as a negotiated settlement among informed participants. Postmodernism has gone beyond common denominators. Realism, that product and accompaniment of the Scientific Revolution, assumes that a common denominator can be found for all systems of belief and value, and that we should strive to converge on those common denominators through our (scientific) methods. There is a truth, and it is worth looking for through method. Postmodernism, in contrast, is the condition of coping without such common denominators. According to postmodernism, all beliefs (e.g., the belief that you just saw a procedural error) are constructions, they are not uncontaminated encounters with, or representations of, some objective empirical reality. Postmodernism challenges the entire modernist culture of realism and empiricism, of which error counting methods are but an instance. Postmodernist defiance not only appears in critiques against error counting but also reverberates throughout universities and especially the sciences (e.g., Capra, 1982). It never comes away unscathed, however. In the words of Varela, Thompson, and Rosch (1991), we suffer from "Cartesian anxiety." We seem to need the idea of a fixed, stable reality that surrounds us, independent of who looks at it. To give up that idea would be to descend into uncertainty, into idealism, into subjectivism. There would be no more groundedness, no longer a set of predetermined norms or standards, only a constantly shifting chaos of individual impressions, leading to relativism and, ultimately, nihilism. Closing the debate on this anxiety is impossible. Even asking which position is more "real" (the modernist or the postmodernist one) is capitulating to (naive) realism. It assumes that there is a reality that can be approximated better either by the modernists or postmodernists.
Was This an Error? It Depends on Who You Ask
Here is one way to make sense of the arguments. Although people live in the same empirical world (actually, the hard-core constructionist would argue that there is no such thing), they may arrive at rather different, yet equally valid, conclusions about what is going on inside of it, and propose different vocabularies and models to capture those phenomena and activities. Philosophers sometimes use the example of a tree.
Though at first sight an objective, stable entity in some external reality, separate from us as observers, the tree can mean entirely different things to someone in the logging industry as compared to, say, a wanderer in the Sahara. Both interpretations can be valid because validity is measured in terms of local relevance, situational applicability, and social acceptability, not in terms of correspondence with a real, external world. Among different characterizations of the world there is no more real or more true. Validity is a function of how the interpretation conforms to the worldview of those to whom the observer makes his appeal. A procedure error is a legitimate, acceptable form of capturing an empirical encounter only because there is a consensual system of like-minded coders and consumers who together have agreed on the linguistic label. The appeal falls onto fertile ground. But the validity of an observation is negotiable. It depends on where the appeal goes, on who does the looking and who does the listening. This is known as ontological relativism: There is flexibility and uncertainty in what it means to be in the world or in a particular situation. The ontological relativist submits that the meaning of observing a particular situation depends entirely on what the observer brings to it. The tree is not just a tree. It is a source of shade, sustenance, survival. Following Kant's ideas, social scientists embrace the common experience that the act of observing and perceiving objects (including humans) is not a passive, receiving process, but an active one that engages the observer as much as it changes or affects the observed. This relativism creates the epistemological uncertainty we see in error-counting methods, which, after all, attempt to shoehorn observations into numerical objectivity. Most social observers or error coders will have felt this uncertainty at one time or another. Was this a procedure error, or a proficiency error, or both? Or was it perhaps no error at all? Was this the cause, or was it the consequence? If it were up to Kant, not having felt this uncertainty would serve as an indication of being a particularly obtuse observer. It would certainly not be proof of the epistemological astuteness of either method or error counter. The uncertainty suffered by them is epistemological because it is realized that certainty about what we know, or even about how to know whether we know it or not, seems out of reach.
Yet those within the ruling paradigm have their stock answer to this challenge, just as they have it whenever confronted with problems of bringing observations and theories in closer correspondence. More methodological agreement and refinement, including observer training and standardization, may close the uncertainty. Better trained observers will be able to distinguish between a procedure error and proficiency error, and an improvement to the coding categories may also do the job. Similar modernist approaches have had remarkable success for five centuries, so there is no reason to doubt that they may offer routes to some progress even here. Or is there? Perhaps more method may not solve problems seemingly linked to method. Consider a study reported by Hollnagel and Amalberti (2001), whose purpose was to test an error-measurement instrument. This instrument was designed to help collect data on, and get a better understanding of, air-traffic controller errors, and to identify areas of weakness and find possibilities for improvement. The method asked observers to count errors (primarily error rates per hour) and categorize the types of errors using a taxonomy proposed by the developers. The tool had already been used to pick apart and categorize errors from past incidents, but would now be put to test in a real-time field setting, applied by pairs of psychologists and air-traffic controllers who would study air-traffic control work going on in real time. The observing air traffic controllers and psychologists, both trained in the error taxonomy, were instructed to take note of all the errors they could see. Despite common indoctrination, there were substantial differences between the numbers of errors each of the two groups of observers noted, and only a very small number of errors were actually observed by both. People watching the same performance, using the same tool to classify behavior, came up with totally different error counts. Closer inspection of the score sheets revealed that the air-traffic controllers and psychologists tended to use different subsets of the error types available in the tool, indicating just how negotiable the notion of error is: The same fragment of performance means entirely different things to two different (but similarly trained and standardized) groups of observers. Air-traffic controllers relied on external working conditions (e.g., interfaces, personnel and time resources) to refer to and categorize errors, whereas psychologists preferred to locate the error somewhere in presumed quarters of the mind (e.g., working memory) or in some mental state (e.g., attentional lapses). Moreover, air-traffic controllers who actually did the work could tell both groups of error coders that they both had it wrong. Debriefing sessions exposed how many observed errors were not errors at all to those said to have committed them, but rather normal work, expressions of deliberate strategies intended to manage problems or foreseen situations that the error counters had either not seen, or not understood if they had.
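The pattern the study describes (comparable totals per group, almost no jointly noted events, different preferred categories) can be expressed in a few lines. The logs below are invented for illustration only; they are not data from Hollnagel and Amalberti (2001):

```python
# Illustrative only: invented "error" logs for two observer groups watching the
# same session, to show how little raw counts reveal about actual agreement.

controllers = {"late handoff", "strip not updated", "readback not challenged",
               "early transfer of aircraft"}
psychologists = {"attentional lapse", "working-memory slip",
                 "early transfer of aircraft"}

both = controllers & psychologists      # events noted by both groups
either = controllers | psychologists    # events noted by at least one group

print(f"controllers noted {len(controllers)}, psychologists noted {len(psychologists)}")
print(f"noted by both: {len(both)} -> {both}")
print(f"overlap (Jaccard): {len(both) / len(either):.2f}")  # low agreement despite shared training
```

Similar totals from each group can coexist with almost no overlap in what was actually noted, which is why raw error counts say little about the reliability of the observations behind them.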
Croft (2001) reported the same result in observations of cockpit errors: More than half the errors revealed by error counters were never discovered by the flight crews themselves. Some realists may argue that the ability to discover errors that people themselves do not see is a good thing: It confirms the strength or superiority of method. But in Hollnagel and Amalberti's (2001) case, error coders were forced to disavow such claims to epistemological privilege (and embrace ontological relativism instead). They reclassified the errors as normal actions, rendering the score sheets virtually devoid of any error counts. Early transfers of aircraft were not an error, for example, but turned out to correspond to a deliberate strategy connected to a controller's foresight, planning ahead, and workload management. Rather than an expression of weakness, such strategies uncovered sources of robustness that would never have come out, or would even have been misrepresented and mischaracterized, with just the data in the classification tool. Such normalization of actions, which at first appear deviant from the outside, is a critical aspect to really understanding human work and its strengths and weaknesses (see Vaughan, 1996). Without understanding such processes of normalization, it is impossible to penetrate the situated meaning of errors or violations. Classification of errors crumbles on the inherent weakness of the naïve realism that underlies it. The realist idea is that errors are "out there," that they exist and can be observed, captured, and documented independently of the observer. This would mean that it makes no difference who does the observing (which it patently does). Such presumed realism is naive because all observations are ideational: influenced (or made possible in the first place) to a greater or lesser extent by who is doing the observing and by the worldview governing those observations. Realism does not work because it is impossible to separate the observer from the observed. Acknowledging some of these problems, the International Civil Aviation Organization (ICAO, 1998) has called for the development of human performance data-collection methods that do not rely on subjective assessments. But is this possible? Is there such a thing as an objective observation of another human's behavior?
The Presumed Reality of Error
The test of the air-traffic control error counting method reveals how "an action should not be classified as an 'error' only based on how it appears to an observer" (Hollnagel & Amalberti, 2001, p. 13). The test confirms ontological relativism. Yet sometimes the observed "error" should be entirely non-controversial, should it not? Take the spoiler example from chapter 1. The flight crew forgot to arm the spoilers. They made a mistake. It was an error. You can apply the new view to human error, and explain all about context and situation and mitigating factors. Explain why they did not arm the spoilers, but that they did not arm the spoilers is a fact. The error occurred. Even multiple different observers would agree on that. The flight crew failed to arm the spoilers. How can one not acknowledge the existence of that error? It is there, it is a fact, staring us in the face. But what is a fact? Facts always privilege the ruling paradigm. Facts always favor current interpretations, as they fold into existing constructed renderings of what is going on. Facts actually exist by virtue of the current paradigm. They can neither be discovered nor given meaning without it. There is no such thing as observations without a paradigm; research in the absence of a particular worldview is impossible. In the words of Paul Feyerabend (1993, p. 11): "On closer analysis, we even find that science knows no 'bare facts' at all, but that the 'facts' that enter our knowledge are already viewed in a certain way and are, therefore, essentially ideational." Feyerabend called the idea that facts are available independently and can thereby objectively favor one theory over another, the autonomy principle (p. 26). The autonomy principle asserts that the facts that are available as empirical content of one theory (e.g., procedural errors as facts that fit the threat and error model) are objectively available to alternative theories too. But this does not work. As the spoiler example from chapter 1 showed, errors occur against and because of a background, in this case a background so systemic, so structural, that the original human error pales against it. The error almost becomes transparent, it is normalized, it becomes invisible. Against this backdrop, this context of procedures, timing, engineering trade-offs, and weakened hydraulic systems, the omission to arm the spoilers dissolves. Figure and ground trade places: No longer is it the error that is really observable or even at all interesting. With deeper investigation, ground becomes figure. The backdrop begins to take precedence as the actual story, subsuming, swallowing the original error. No longer can the error be distinguished as a singular, failed decision moment. Somebody who applies a theory of naturalistic decision making will not see a procedure error. What will be seen instead is a continuous flow of actions and assessments, coupled and mutually cued, a flow with nonlinear feedback loops and interactions, inextricably embedded in a multilayered evolving context. Human interaction with a system, in other words, is seen as a continuous control task. Such a characterization is hostile to the digitization necessary to fish out individual human errors. Whether individual errors can be seen depends on the theory used. There are no objective observations of facts.
Observers in error counting are themselves participants, participating in the very creation of the observed fact, and not just because they are there, looking at how other people are working. Of course, through their sheer presence, error counters probably distort people's normal practice, perhaps turning situated performance into a mere window-dressed posture. More fundamentally, however, observers in error counting are participants, because the facts they see would not exist without them. They are created through the method. Observers are participants because it is impossible to separate observer and object. None of this, by the way, makes the procedure error less real to those who observe it. This is the whole point of ontological relativism. But it does mean that the autonomy principle is false. Facts are not stable aspects of an independent reality, revealed to scientists who wield the right instruments and methods. The discovery and description of every fact is dependent on a particular theory. In the words of Einstein, it is the theory that determines what can be seen. Facts are not available "out there," independent of theory. To suppose that a better theory should come along to account for procedure errors in a way that more closely matches reality is to stick with a model of scientific progress that was disproved long ago. It follows the idea that theories should not be dismissed until there are compelling reasons to do so, and compelling reasons arise only because there is an overwhelming number of facts that disagree with the theory. Scientific work, in this idea, is the clean confrontation of observed fact with theory. But this is not how it works, for those facts do not exist without the theory.
Resisting Change: The Theory Is Right. Or Is It?
The idea of scientific (read: theoretical) progress through the accumulation of observed disagreeing facts that ultimately manage to topple a theory also does not work, because counterinstances (i.e., facts that disagree with the theory) are not seen as such. Instead, if observations reveal counterinstances (such as errors that resist unique classification in any of the categories of the error-counting method), then researchers tend to see these as further puzzles in the match between observation and theory (Kuhn, 1962): puzzles that can be addressed by further refinement of their method. Counterinstances, in other words, are not seen as speaking against the theory. According to Kuhn (1962), one of the defining responses to paradigmatic crisis is that scientists do not treat anomalies as counterinstances, even though that is what they are. It is extremely difficult for people to renounce the paradigm that has led them into a crisis. Instead, the epistemological difficulties suffered by error-counting methods (Was this a cause or a consequence? Was this a procedural or a proficiency error?) are dismissed as minor irritants and reasons to engage in yet more methodological refinement consonant with the current paradigm. Neither scientists nor their supporting communities in industry are willing to forego a paradigm until and unless there is a viable alternative ready to take its place. This is among the most sustained arguments surrounding the continuation of error counting: Researchers engaging in error classification are willing to acknowledge that what they do is not perfect, but vow to keep going until shown something better. And industry concurs. As Kuhn pointed out, the decision to reject one paradigm necessarily coincides with the acceptance of another. Proposing a viable alternative theory that can deal with its own facts, however, is exceedingly difficult, and has proven to be so even historically (Feyerabend, 1993). Facts, after all, privilege the status quo. Galileo's telescopic observations of the sky motivated an alternative explanation about the place of the earth in the universe. His observations favored the Copernican heliocentric interpretation (where the earth goes around the sun) over the Ptolemaic geocentric one (where the sun goes around the earth). The Copernican interpretation, however, was a worldview away from what was currently accepted, and many doubted Galileo's data as a valid empirical window on that heliocentric reality. People were highly suspicious of the new instrument: Some asked Galileo to open up his telescope to prove that there was no little moon hiding inside of it. How, otherwise, could the moon or any other celestial body be seen so closely if it was not itself hiding in the telescope? One problem was that Galileo did not offer a theoretical explanation for why this could be so, and why the telescope was supposed to offer a better picture of the sky than the naked eye. He could not, because relevant theories (optics) were not yet well developed. Generating better data (like Galileo did), and developing entirely new methods for better access to these data (such as a telescope), does in itself little to dislodge an established theory that allows people to see the phenomenon with their naked eye and explain it with their common sense.
Similarly, people see the error happen with their naked eye, even without the help of an error-classification method: The pilot fails to arm the spoilers. Even their common sense confirms that this is an error. The sun goes around the earth. The earth is fixed. The Church was right, and Galileo was wrong. None of his observed facts could prove him right, because there was no coherent set of theories ready to accommodate his facts and give them meaning. The Church was right, as it had all the facts. And it had the theory to deal with them. Interestingly, the Church kept closer to reason as it was defined at the time. It considered the social, political, and ethical implications of Galileo's alternatives and deemed them too risky to accept, certainly on the grounds of tentative, rickety evidence. Disavowing the geocentric idea would be disavowing Creation itself, removing the common ontological denominator of the past millennium and severely undermining the authority and political power the Church derived from it. Error-classification methods, too, guard a piece of rationality that most people in industry and elsewhere would be loath to see disintegrate. Errors occur, they can be distinguished objectively. Errors can be an indication of unsafe performance. There is good performance and bad performance; there are identifiable causes for why people perform well or less well and for why failures happen. Without such a supposedly factual basis, without such hopes of an objective rationality, traditional and well-established ways for dealing with threats to safety and trying to create progress could collapse. Cartesian anxiety would grip the industry and research community. How can we hold people accountable for mistakes if there are no "errors"? How can we report safety occurrences and maintain expensive incident-reporting schemes if there are no errors? What can we fix if there are no causes for adverse events? Such questions fit a broader class of appeals against relativism. Postmodernism and relativism, according to their detractors, can lead only to moral ambiguity, nihilism and lack of structural progress. We should instead hold onto the realist status quo, and we can, for most observed facts still seem to privilege it.
Errors Exist. They Have To

To the naive realist, the argument that errors exist is not only natural and necessary, it is also quite impeccable, quite forceful. The idea that errors do not exist, in contrast, is unnatural, even absurd. Those within the established paradigm will challenge the sheer legitimacy of questions raised about the existence of errors, and by implication even the legitimacy of those who raise the questions: "Indeed, there are some psychologists who would deny the existence of errors altogether. We will not pursue that doubtful line of argument here" (Reason & Hobbs, 2003, p. 39). Because the current paradigm judges it absurd and unnatural, the question of whether errors exist is not worth pursuing: It is doubtful and unscientific, and in the strictest sense (when scientific pursuits are measured and defined within the ruling paradigm), that is precisely what it is. If some scientists do not succeed in bringing statement and fact into closer agreement (they do not see a procedure error where others would), then this discredits the scientist rather than the theory.

Galileo suffered from this too. It was the scientist who was discredited (for a while at least), not the prevailing paradigm. So what does he do? How does Galileo proceed once he introduces an interpretation so unnatural, so absurd, so countercultural, so revolutionary? What does he do when he notices that even the facts are not (interpreted to be) on his side? As Feyerabend (1993) masterfully described it, Galileo engaged in propaganda and psychological trickery. Through imaginary conversations between Sagredo, Salviati, and Simplicio, written in his native tongue rather than in Latin, he put the ontological uncertainty and epistemological difficulty of the geocentric interpretation on full display. The sheer logic of the geocentric interpretation fell apart, whereas that of the heliocentric interpretation triumphed. Where the appeal to empirical facts failed (because those facts will still be forced to fit the prevailing paradigm rather than its alternative), an appeal to logic may still succeed. The same is true for error counting and classification. Just imagine this conversation:

Simplicio: Errors result from physiological and psychological limitations of humans. Causes of error include fatigue, workload, and fear, as well as cognitive overload, poor interpersonal communications, imperfect information processing, and flawed decision making.

Sagredo: But are errors in this case not simply the result of other errors? Flawed decision making would be an error. But in your logic, it causes an error. What is the error then? And how can we categorize it?

Simplicio: Well, but errors are caused by poor decisions, failures to adhere to brief, failures to prioritize attention, improper procedure, and so forth.

Sagredo: This appears to be not causal explanation, but simply relabeling. Whether you say error, or poor decision, or failure to prioritize attention, it all still sounds like error, at least when interpreted in your worldview. And how can one be the cause of the other to the exclusion of the other way around? Can errors cause poor decisions just like poor decisions cause errors? There is nothing in your logic that rules this out, but then we end up with a tautology, not an explanation.

And yet, such arguments may not help either.
The appeal to logic may fail in the face of overwhelming support for a ruling paradigm—support that derives from consensus authority, from political, social, and organizational imperatives rather than from a logical or empirical basis (which is, after all, pretty porous). Even Einstein expressed amazement at the common reflex to rely on measurements (e.g., error counts) rather than logic and argument: "'Is it not really strange,' he asked in a letter to Max Born, 'that human beings are normally deaf to the strongest of argument while they are always inclined to overestimate measuring accuracies?'" (Feyerabend, 1993, p. 239). Numbers are strong. Arguments are weak. Error counting is good because it generates numbers; it relies on accurate measurements (recall Croft, 2001, who announced that "researchers" have "perfected" ways to monitor pilot performance) rather than on argument.

In the end, no argument, none of this propaganda or psychological trickery, can serve as a substitute for the development of alternative theory, nor did it in Galileo's case. The postmodernists are right and the realists are wrong: Without a paradigm, without a worldview, there are no facts. People will reject no theory on the basis of argument or logic alone. They need another to take its place. A paradigmatic interregnum would produce paralysis. Suspended in a theoretical vacuum, researchers would no longer be able to see facts or do anything meaningful with them. So, considering the evidence, what should the alternative theory look like? It needs to come with a superior explanation of performance variations, with an interpretation that is sensitive to the situatedness of the performance it attempts to capture. Such a theory sees no errors, but rather performance variations—inherently neutral changes and adjustments in how people deal with complex, dynamic situations. This theory will resist coming in from the outside; it will avoid judging other people from a position external to how the situation looked to the subject inside of it. The outlines of such a theory are developed further in various places in this book.
SAFETY AS MORE THAN ABSENCE OF NEGATIVES

First, though, another question: Why do people bother with error counts in the first place? What goals do they hope these empirical measures help them accomplish, and are there better ways to achieve those goals? A final aim of error counting is to help make progress on safety, but this puts the link between errors and safety on trial. Can the counting of negatives (e.g., these errors) say anything useful about safety? What does the quantity measured (errors) have to do with the quality managed (safety)? Error-classification methods assume a close mapping between these two, and assume that an absence or reduction of errors is synonymous with progress on safety. By treating safety as positivistically measurable, error counting may be breathing the scientific spirit of a bygone era. Human performance in the laboratory was once gauged by counting errors, and this is still done when researchers test limited, contrived task behavior in spartan settings. But how well does this export to natural settings where people carry out actual complex, dynamic, and interactive work, where determinants of good and bad outcomes are deeply confounded?

It may not matter. The idea of a realist count is compelling to industry for the same reasons that any numerical performance measurement is. Managers easily get infatuated with "balanced scorecards" or other faddish figures of performance. Entire business models depend on quantifying performance results, so why not quantify safety? Error counting becomes yet another quantitative basis for managerial interventions. Pieces of data from the operation that have been excised and formalized away from their origin can be converted into graphs and bar charts that subsequently form the inspiration for interventions. This allows managers, and their airlines, to elaborate their idea of control over operational practice and its outcomes. Managerial control, however, exists only in the sense of purposefully formulating and trying to influence the intentions and actions of operational people (Angell & Straub, 1999). It is not the same as being in control of the consequences (by which safety ultimately gets measured industry-wide), because for that the real world is too complex and operational environments too stochastic (e.g., Snook, 2000).

There is another tricky aspect of trying to create progress on safety through error counting and classification. This has to do with not taking context into account when counting errors. Errors, according to realist interpretations, represent a kind of equivalent category of bad performance (e.g., a failure to meet one's objective or intention), no matter who commits the error or in what situation. Such an assumption has to exist, otherwise tabulation becomes untenable. One cannot (or should not) add apples and oranges, after all. If both apples and oranges are entered into the method (and, given that the autonomy principle is false, error-counting methods do add apples and oranges), silly statistical tabulations that claim doctors are 7,500 times more dangerous than gun owners can roll out the other end. As Hollnagel and Amalberti (2001) showed, attempts to map situated human capabilities such as decision making, proficiency, or deliberation onto discrete categories are doomed to be misleading. They cannot cope with the complexity of actual practice without serious degeneration (Angell & Straub, 1999). Error classification disembodies data.
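To make concrete how a tabulation of the kind mentioned above can "roll out the other end," here is a minimal sketch of the naive per-capita comparison that underlies such claims. It is not the calculation from any actual study, and the population and death counts are hypothetical placeholders chosen only to show the arithmetic.

# Minimal sketch (Python) of the naive tabulation behind claims such as
# "doctors are X times more dangerous than gun owners".
# All figures are hypothetical placeholders, not actual statistics.

groups = {
    # group: (people in the group, fatal "errors" attributed to them per year)
    "doctors":    (700_000, 100_000),
    "gun_owners": (80_000_000, 1_500),
}

def deaths_per_person(group):
    # Divide a context-free count of deaths by a context-free count of people.
    people, deaths = groups[group]
    return deaths / people

ratio = deaths_per_person("doctors") / deaths_per_person("gun_owners")
print(f"'Danger ratio': roughly {ratio:,.0f} to 1")

The division treats every counted death as an equivalent, context-free unit, which is exactly the apples-and-oranges assumption criticized here: it ignores exposure (patients treated versus rounds fired), task, and situation, yet it yields an impressively precise-looking number.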
Error classification removes the context that helped produce the behavior in its particular manifestation. Such disembodiment may actually retard understanding. The local rationality principle (people's behavior is rational when viewed from the inside of their situations) is impossible to maintain when context is removed from the controversial action. And error categorization does just that: It removes context. Once the observation of some kind of error is tidily locked away into some category, it has been objectified, formalized away from the situation that brought it forth. Without context, there is no way to reestablish local rationality. And without local rationality, there is no way to understand human error. And without understanding human error, there may be no way to learn how to create progress on safety.

Safety as Reflexive Project

Safety is likely to be more than the measurement and management of negatives (errors), if it is that at all. Just as errors are epistemologically elusive (How do you know what you know? Did you really see a procedure error? Or was it a proficiency error?) and ontologically relativist (what it means "to be" and to perform well or badly inside a particular situation is different from person to person), the notion of safety may similarly lack an objective, common denominator. The idea behind measuring safety through error counts is that safety is some kind of objective, stable (and perhaps ideal) reality, a reality that can be measured and reflected, or represented, through method. But does this idea hold? Rochlin (1999, p. 1550), for example, proposed that safety is a "constructed human concept," and others in human factors have begun to probe how individual practitioners construct safety, by assessing what they understand risk to be, and how they perceive their ability to manage challenging situations (e.g., Orasanu, 2001).
A substantial part of practitioners' construction of safety turns out to be reflexive, assessing the person's own competence or skill in maintaining safety across different situations. Interestingly, there may be a mismatch between risk salience (how critical a particular threat to safety was perceived to be by the practitioner) and frequency of encounters (how often these threats to safety are in fact met in practice). The safety threats deemed most salient were the ones least frequently dealt with (Orasanu, 2001). Safety is more akin to a reflexive project, sustained through a revisable narrative of self-identity that develops in the face of frequently and less frequently encountered risks. It is not something referential, not something that is objectively "out there" as a common denominator, open to any type of approximation by those with the best methods. Rather, safety may be reflexive: something that people relate to themselves.

The numbers produced by error counts are a logical endpoint of a structural analysis that focuses on (supposed) causes and consequences, an analysis that defines risk and safety instrumentally, in terms of minimizing errors and presumably measurable consequences. A second, more recent approach is more socially and politically oriented, and places emphasis on representation, perception, and interpretation rather than on structural features (Rochlin, 1999). The managerially appealing numbers generated by error counts do not carry any of this reflexivity, none of the nuances of what it is to "be there," doing the work, creating safety on the line. What it is to be there ultimately determines safety (as outcome): People's local actions and assessments are shaped by their own perspectives. These in turn are embedded in histories, rituals, interactions, beliefs, and myths, both of people's organization and organizational subculture and of them as individuals. This would explain why good, objective, empirical indicators of social and organizational definitions of safety are difficult to obtain. Operators of reliable systems "were expressing their evaluation of a positive state mediated by human action, and that evaluation reflexively became part of the state of safety they were describing" (Rochlin, 1999, p. 1550). In other words, the description itself of what safety means to an individual operator is a part of that very safety, dynamic and subjective. "Safety is in some sense a story a group or organization tells about itself and its relation to its task environment" (Rochlin, 1999, p. 1555).

Can We Measure Safety?

But how does an organization capture what groups tell about themselves; how does it pin down these stories? How can management measure a mediated, reflexive idea? If not through error counts, what can an organization look for in order to get some measure of how safe it is? Large recent accidents provide some clues about where to start looking (e.g., Woods, 2003). A main source of residual risk in otherwise safe transportation systems is the drift into failure described in chapter 2. Pressures of scarcity and competition narrow an organization's focus on goals associated with production.
With an accumulating base of empirical success (i.e., no accidents, even if safety is increasingly traded off against other goals such as maximizing profit or capacity utilization), the organization, through its members' multiple little and larger daily decisions, will begin to believe that past success is a guarantee of future safety, that historical success is a reason for confidence that the same behavior will lead to the same (successful) outcome the next time around. The absence of failure, in other words, is taken as evidence that hazards are not present and that countermeasures already in place are effective. Such a model of risk is embedded deeply in the reflexive stories of safety that Rochlin (1999) talked about, and it can be made explicit only through qualitative investigations that probe the interpretative aspect of situated human assessments and actions. Error counts do little to elucidate any of this. More qualitative studies could reveal how currently traded models of risk may increasingly be at odds with the actual nature and proximity of hazard, though it may of course be difficult to establish the objective, or ontologically absolutist, presence of hazard.

Particular aspects of how organization members tell or evaluate safety stories, however, can serve as markers. Woods (2003, p. 5), for example, has called one of these markers "distancing through differencing." In this process, organizational members look at other failures and other organizations as not relevant to them and their situation. They discard other events because they appear on the surface to be dissimilar or distant. Discovering this through qualitative inquiry can help specify how people and organizations reflexively create their idea, their story, of safety. Just because the organization or section has different technical problems, different managers, different histories, or can claim to already have addressed a particular safety concern revealed by the event, does not mean that it is immune to the problem. Seemingly divergent events can represent similar underlying patterns in the drift towards hazard.
High-reliability organizations characterize themselves through their preoccupation with failure: continually asking themselves how things can go wrong and could have gone wrong, rather than congratulating themselves on the fact that things went right. Distancing through differencing means underplaying this preoccupation. It is one way to prevent learning from events elsewhere, one way to throw up obstacles in the flow of safety-related information. Additional processes that can be discovered include to what extent an organization resists oversimplifying interpretations of operational data, and whether it defers to expertise and expert judgment rather than managerial imperatives. Also, it could be interesting to probe to what extent problem-solving processes are fragmented across organizational departments, sections, or subcontractors. The 1996 Valujet accident, where flammable oxygen generators were placed in an aircraft cargo hold without shipping caps, subsequently burning down the aircraft, was related to a web of subcontractors that together made up the virtual airline of Valujet. Hundreds of people within even one subcontractor logged work against the particular Valujet aircraft, and this subcontractor was only one of many players in a network of organizations and companies tasked with different aspects of running (even constituting) the airline. Relevant maintenance parts (among them the shipping caps) were not available at the subcontractor, ideas of what to do with expired oxygen canisters were generated ad hoc in the absence of central guidance, and local understandings of why shipping caps may have been necessary were foggy at best. With work and responsibility for it distributed among so many participants, nobody may have been able to see the big picture anymore, including the regulator. Nobody may have been able to recognize the gradual erosion of safety constraints on the design and operation of the original system.

If safety is a reflexive project rather than an objective datum, human factors researchers must develop entirely new probes for measuring the safety health of an organization. Error counts do not suffice. They uphold an illusion of rationality and control, but may offer neither real insight nor productive routes for progress on safety. It is, of course, a matter of debate whether the vaguely defined organizational processes that could be part of new safety probes (e.g., distancing through differencing, deference to expertise, fragmentation of problem solving, incremental judgments into disaster) are any more real than the errors from the counting methods they seek to replace or augment. But then, the reality of these phenomena is in the eye of the beholder: Observer and observed cannot be separated; object and subject are largely indistinguishable. The processes and phenomena are real enough to those who look for them and who wield the theories to accommodate the results. Criteria for success may lie elsewhere, for example in how well the measure maps onto past evidence of precursors to failure. Yet even such mappings are subject to paradigmatic interpretations of the evidence base. Indeed, consonant with the ontological relativity of the age human factors has now entered, the debate can probably never be closed. Are doctors more dangerous than gun owners? Do errors exist? It depends on whom you ask. The real issue, therefore, lies a step away from the fray, a level up, if you will.
Whether we count errors as Durkheimian fact on the one hand or see safety as a reflexive project on the other, competing premises and practices reflect particular models of risk. These models of risk are interesting not because of their differential abilities to access empirical truth (because that may all be relative), but because of what they say about us, about human factors and system safety. It is not simply the monitoring of safety that we should pursue, but the monitoring of that monitoring. If we want to make progress on safety, one important step is to engage in such metamonitoring, to become better aware of the models of risk embodied in our assumptions and approaches to safety.

Chapter 4
Don't Errors Exist?

Human factors as a discipline takes a very realist view. It lives in a world of real things, of facts and concrete observations. It presumes the existence of an external world in which phenomena occur that can be captured and described objectively. In this world there are errors and violations, and these errors and violations are quite real. The flight-deck observer from chapter 3, for example, would see that pilots do not arm the spoilers before landing and would mark this up as an error or a procedural violation. The observer considers his observation quite true, and the error quite real. Upon discovering that the spoilers had not been armed, the pilots themselves too may see their omission as an error, as something that they missed but should not have missed. But just as it did for the flight-deck observer, the error becomes real only because it is visible from outside the stream of experience. From the inside of this stream, while things are going on and work is being accomplished, there is no error. In this case there are only procedures that get inadvertently mangled through the timing and sequence of various tasks. And not even this gets noticed by those applying the procedures.
Recall how Feyerabend (1993) pointed out that all observations are ideational, that facts do not exist without an observer wielding a particular theory that tells him or her what to look for. Observers are not passive recipients, but active creators of the empirical reality they encounter. There is no clear separation between observer and observed. As said in chapter 3, none of this makes the error any less real to those who observe it. But it does not mean that the error exists out there, in some independent empirical universe. This was the whole point of ontological relativism: What it means to be in a particular situation and make certain observations is quite flexible and connected systematically to the observer. None of the possible worldviews can be judged superior or privileged uniquely by empirical data about the world, because objective, impartial access to that world is impossible. Yet in the pragmatic and optimistically realist spirit of human factors, error-counting methods have gained popularity by selling the belief that such impartial access is possible. The claim to privileged access lies (as modernism and Newtonian science would dictate) in method. The method is strong enough to discover errors that the pilots themselves had not seen.

Errors appear so real when we step or set ourselves outside the stream of experience in which they occur. They appear so real to an observer sitting behind the pilots. They appear so real to even the pilot himself after the fact. But why? It cannot be because the errors are real, since the autonomy principle has been proven false. As an observed fact, the error only exists by virtue of the observer and his or her position on the outside of the stream of experience. The error does not exist because of some objective empirical reality in which it putatively takes place, since there is no such thing and if there was, we could not know it. Recall the air-traffic control test of chapter 3: Actions, omissions, and postponements related to air-traffic clearances carry entirely different meanings for those on the inside and on the outside of the work experience. Even different observers on the outside cannot agree on a common denominator because they have diverging backgrounds and conceptual looking glasses. The autonomy principle is false: facts do not exist without an observer. So why do errors appear so real?

ERRORS ARE ACTIVE, CORRECTIVE INTERVENTIONS IN HISTORY

Errors are an active, corrective intervention in (immediate) history. It is impossible for us to give a mere chronicle of our experiences: Our assumptions, past experiences, and future aspirations cause us to impress a certain organization on that which we just went through or saw. Errors are a powerful way to impose structure onto past events. Errors are a particular way in which we as observers (or even participants) reconstruct the reality we just experienced. Such reconstruction, however, inserts a severe discontinuity between past and present. The present was once an uncertain, perhaps vanishingly improbable, future. Now we see it as the only plausible outcome of a pretty deterministic past. Being able to stand outside an unfolding sequence of events (either as participants in hindsight or as observers from outside the setting) makes it exceedingly difficult to see how unsure we once were (or could have been if we had been in that situation) of what was going to happen.
History as seen through the eyes of a retrospective outsider (even if the same observer was a participant in that history not long ago) is substantially different from the world as it appeared to the decision makers of the day. This endows history, even immediate history, with a determinism it lacked when it was still unfolding. Errors, then, are ex post facto constructs. The research base on the hindsight bias contains some of the strongest evidence on this. Errors are not empirical facts. They are the result of outside observers squeezing now-known events into the most plausible or convenient deterministic scheme. In the research base on hindsight, it is not difficult to see how such retrospective restructuring embraces a liberal take on the history it aims to recount. The distance between reality as portrayed by a retrospective observer and as experienced by those who were there (even if these were once the same people) grows substantially with the rhetoric and discourse employed and the investigative practices used. We see a lot of this later in the discussion.

We also look at developments in psychology that have (since not so long ago) tried to get away from the normativist bias in our understanding of human performance and decision making. This intermezzo is necessary because errors and violations do not exist without some norm, even if implied. Hindsight, of course, has a powerful way of importing criteria or norms from outside people's situated contexts, and highlighting where actual performance at the time fell short. To see errors as ex post constructs rather than as objective, observed facts, we have to understand the influence of implicit norms on our judgments of past performance. Doing without errors means doing without normativism. It means that we cannot question the accuracy of insider accounts (something human factors consistently does, e.g., when it asserts a "loss of situation awareness"), as there is no objective, normative reality to hold such accounts up to, and relative to which we can deem them accurate or inaccurate.
Reality as experienced by people at the time was reality as it was experienced by them at the time, full stop. It was that experienced world that drove their assessments and decisions, not our (or even their) retrospective, outsider rendering of that experience. We have to use local norms of competent performance to understand why what people did made sense to them at the time. Finally, an important question we must look ahead to: Why is it that errors fulfill such an important function in our reconstructions of history, of even our own histories? Seeing errors in history may actually have little to do with historical explanation. Rather, it may be about controlling the future. What we see toward the end of this chapter is that the hindsight bias may not at all be about history, and may not even be a bias. Retrospective reconstruction, and the hindsight bias, should not be seen as the primary phenomenon. Rather, it represents and serves a larger purpose, answering a highly pragmatic concern. The almost inevitable urge to highlight past choice moments (where people went the wrong way), the drive to identify errors, is forward looking, not backward looking. The hindsight bias may not be a bias because it is an adaptive response, an oversimplification of history that primes us for complex futures and allows us to project simple models of past lessons onto those futures, lest history repeat itself. This means that retrospective recounting tells us much more about the observer than it does about reality—if there is such an objective thing.

Making Tangled Histories Linear

The hindsight bias (Fischhoff, 1975) is one of the most consistent biases in psychology. One effect is that "people who know the outcome of a complex prior history of tangled, indeterminate events, remember that history as being much more determinant, leading 'inevitably' to the outcome they already knew" (Weick, 1995, p. 28). Hindsight allows us to change past indeterminacy and complexity into order, structure, and oversimplified causality (Reason, 1990). As an example, take the turn towards the mountains that a Boeing 757 made just before an accident near Cali, Colombia, in 1995. According to the investigation, the crew did not notice the turn, at least not in time (Aeronautica Civil, 1996). What should the crew have seen in order to know about the turn? They had plenty of indications, according to the manufacturer of their aircraft:

Indications that the airplane was in a left turn would have included the following: the EHSI (Electronic Horizontal Situation Indicator) Map Display (if selected) with a curved path leading away from the intended direction of flight; the EHSI VOR display, with the CDI (Course Deviation Indicator) displaced to the right, indicating the airplane was left of the direct Cali VOR course, the EADI indicating approximately 16 degrees of bank, and all heading indicators moving to the right. Additionally the crew may have tuned Rozo in the ADF and may have had bearing pointer information to Rozo NDB on the RMDI. (Boeing Commercial Airplane Group, 1996, p. 13)

This is a standard response after mishaps: Point to the data that would have revealed the true nature of the situation. In hindsight, there is an overwhelming array of evidence that did point to the real nature of the situation, and if only people had paid attention to even some of it, the outcome would have been different.
Confronted with a litany of indications that could have prevented the accident, we wonder how people at the time could not have known all of this. We wonder how this "epiphany" was missed, why this bloated shopping bag full of revelations was never opened by the people who most needed it. But knowledge of the critical data comes only with the omniscience of hindsight. We can only know what really was critical or highly relevant once we know the outcome. Yet if data can be shown to have been physically available, we often assume that it should have been picked up by the practitioners in the situation. The problem is that pointing out that something should have been noticed does not explain why it was not noticed, or why it was interpreted differently back then. This confusion has to do with us, not with the people we are investigating. What we, in our reaction to failure, fail to appreciate is that there is a dissociation between data availability and data observability—between what can be shown to have been physically available and what would have been observable given people's multiple interleaving tasks, goals, attentional focus, expectations, and interests.

Data, such as the litany of indications in the previous example, do not reveal themselves to practitioners in one big monolithic moment of truth. In situations where people do real work, data can get drip-fed into the operation: a little bit here, a little bit there. Data emerge over time. Data may be uncertain. Data may be ambiguous. People have other things to do too. Sometimes the successive or multiple data bits are contradictory; often they are unremarkable. It is one thing to say how we find some of these data important in hindsight. It is quite another to understand what the data meant, if anything, to the people in question at the time.

The same kind of confusion occurs when we, in hindsight, get an impression that certain assessments and actions point to a common condition. This may be true at first sight. In trying to make sense of past performance, it is always tempting to group individual fragments of human performance that seem to share something, that seem to be connected in some way, and connected to the eventual outcome. For example, "hurry" to land was such a leitmotif extracted from the evidence in the Cali investigation.
Haste in turn is enlisted to explain the errors that were made:

Investigators were able to identify a series of errors that initiated with the flightcrew's acceptance of the controller's offer to land on runway 19 ... The CVR (Cockpit Voice Recorder) indicates that the decision to accept the offer to land on runway 19 was made jointly by the captain and the first officer in a 4-second exchange that began at 2136:38. The captain asked: "would you like to shoot the one nine straight in?" The first officer responded, "Yeah, we'll have to scramble to get down. We can do it." This interchange followed an earlier discussion in which the captain indicated to the first officer his desire to hurry the arrival into Cali, following the delay on departure from Miami, in an apparent [attempt] to minimize the effect of the delay on the flight attendants' rest requirements. For example, at 2126:01, he asked the first officer to "keep the speed up in the descent" ... (This is) evidence of the hurried nature of the tasks performed. (Aeronautica Civil, 1996, p. 29)

In this case the fragments used to build the argument of haste come from over half an hour of extended performance. Outside observers have treated the record as if it were a public quarry to pick stones from, and the accident explanation the building they need to erect. The problem is that each fragment is meaningless outside the context that produced it: Each fragment has its own story, background, and reasons for being, and when it was produced it may have had nothing to do with the other fragments it is now grouped with. Moreover, behavior takes place in between the fragments. These intermediary episodes contain changes and evolutions in perceptions and assessments that separate the excised fragments not only in time, but also in meaning. Thus, the condition, and the constructed linearity in the story that binds these performance fragments, does not arise from the circumstances that brought each of the fragments forth; it is not a feature of those circumstances. It is an artifact of the outside observer. In the case just described, hurry is a condition identified in hindsight, one that plausibly couples the start of the flight (almost 2 hours behind schedule) with its fatal ending (on a mountainside rather than an airport). Hurry is a retrospectively invoked leitmotif that guides the search for evidence about itself. It leaves the investigator with a story that is admittedly more linear and plausible and less messy and complex than the actual events. Yet it is not a set of findings, but of tautologies.

Counterfactual Reasoning

Tracing the sequence of events back from the outcome—that we as outside observers already know about—we invariably come across joints where people had opportunities to revise their assessment of the situation but failed to do so, where people were given the option to recover from their route to trouble, but did not take it. These are counterfactuals—quite common in accident analysis. For example, "The airplane could have overcome the windshear encounter if the pitch attitude of 15 degrees nose-up had been maintained, the thrust had been set to 1.93 EPR (Engine Pressure Ratio) and the landing gear had been retracted on schedule" (NTSB, 1995, p. 119). Counterfactuals prove what could have happened if certain minute and often utopian conditions had been met. Counterfactual reasoning may be a fruitful exercise when trying to uncover potential countermeasures against such failures in the future.
But saying what people could have done in order to prevent a particular outcome does not explain why they did what they did. This is the problem with counterfactuals. When they are enlisted as explanatory proxy, they help circumvent the hard problem of investigations: finding out why people did what they did. Stressing what was not done (but if it had been done, the accident would not have happened) explains nothing about what actually happened, or why. In addition, counterfactuals are a powerful tributary to the hindsight bias. They help us impose structure and linearity on tangled prior histories. Counterfactuals can convert a mass of indeterminate actions and events, themselves overlapping and interacting, into a linear series of straightforward bifurcations. For example, people could have perfectly executed the go-around maneuver but did not; they could have denied the runway change but did not. As the sequence of events rolls back into time, away from its outcome, the story builds. We notice that people chose the wrong prong at each fork, time and again—ferrying them along inevitably to the outcome that formed the starting point of our investigation (for without it, there would have been no investigation). But human work in complex, dynamic worlds is seldom about simple dichotomous choices (as in: to err or not to err). Bifurcations are extremely rare, especially those that yield clear previews of the respective outcomes at each end. In reality, choice moments (such as there are) typically reveal multiple possible pathways that stretch out, like cracks in a window, into the ever denser fog of futures not yet known. Their outcomes are indeterminate, hidden in what is still to come. In reality, actions need to be taken under uncertainty and under the pressure of limited time and other resources.
What from the retrospective outside may look like a discrete, leisurely two-choice opportunity not to fail is, from the inside, really just one fragment caught up in a stream of surrounding actions and assessments. In fact, from the inside it may not look like a choice at all. These are often choices only in hindsight. To the people caught up in the sequence of events, there was perhaps not any compelling reason to reassess their situation or decide against anything (or else they probably would have) at the point the investigator has now found significant or controversial. They were likely doing what they were doing because they thought they were right, given their understanding of the situation and their pressures. The challenge for an investigator becomes to understand how this may not have been a discrete event to the people whose actions are under investigation. The investigator needs to see how other people's decisions to continue were likely nothing more than continuous behavior—reinforced by their current understanding of the situation, confirmed by the cues they were focusing on, and reaffirmed by their expectations of how things would develop.

Judging Instead of Explaining

When outside observers use counterfactuals, even as explanatory proxy, the counterfactuals themselves often require explanations as well. After all, if an exit from the route to trouble stands out so clearly to outside observers, how was it possible for other people to miss it? If there was an opportunity to recover, to not crash, then failing to grab it demands an explanation. The place where observers often look for clarification is the set of rules, professional standards, and available data that surrounded people's operation at the time, and how people did not see or meet that which they should have seen or met. Recognizing that there is a mismatch between what was done or seen and what should have been done or seen as per those standards, we easily judge people for not doing what they should have done. Where fragments of behavior are contrasted with written guidance that can be found to have been applicable in hindsight, actual performance is often found wanting; it does not live up to procedures or regulations. For example, "One of the pilots ... executed [a computer entry] without having verified that it was the correct selection and without having first obtained approval of the other pilot, contrary to procedures" (Aeronautica Civil, 1996, p. 31). Investigations invest considerably in organizational archeology so that they can construct the regulatory or procedural framework within which the operations took place, or should have taken place. Inconsistencies between existing procedures or regulations and actual behavior are easy to expose when organizational records are excavated after the fact and rules uncovered that would have fit this or that particular situation. This is not, however, very informative. There is virtually always a mismatch between actual behavior and written guidance that can be located in hindsight. Pointing out a mismatch sheds little light on the why of the behavior in question, and, for that matter, mismatches between procedures and practice are not unique to mishaps. There are also less obvious or undocumented standards. These are often invoked when a controversial fragment (e.g., a decision to accept a runway change, Aeronautica Civil, 1996, or the decision to go around or not, NTSB, 1995) knows no clear preordained guidance but relies on local, situated judgment.
For these cases there are always supposed standards of good practice, based on convention and putatively practiced across an entire industry. One such standard in aviation is "good airmanship," which, if nothing else can, will explain the variance in behavior that had not yet been accounted for. When micromatching, observers frame people's past assessments and actions inside a world that they have invoked retrospectively. Looking at the frame as an overlay on the sequence of events, they see that pieces of behavior stick out in various places and at various angles: a rule not followed here, available data not observed there, professional standards not met over there. But rather than explaining controversial fragments in relation to the circumstances that brought them forth, and in relation to the stream of preceding as well as succeeding behaviors that surrounded them, the frame merely boxes performance fragments inside a world observers now know to be true. The problem is that this after-the-fact world may have very little relevance to the actual world that produced the behavior under study. The behavior is contrasted against the observer's reality, not the reality surrounding the behavior at the time. Judging people for what they did not do relative to some rule or standard does not explain why they did what they did. Saying that people failed to take this or that pathway (only in hindsight the right one) judges other people from a position of broader insight and outcome knowledge that they themselves did not have. It does not explain a thing yet; it does not shed any light on why people did what they did given their surrounding circumstances. Outside observers have become caught in what William James called the "psychologist's fallacy" a century ago: They have substituted their own reality for that of their object of study.

The More We Know, the Less We Understand

We actually have interesting expectations of new technology in this regard. Technology has made it increasingly easy to capture and record the reality that surrounded other people carrying out work. In commercial aviation, the electronic footprint that any flight produces is potentially huge.
We can use these data to reconstruct the world as it must have been experienced by other people back then, potentially avoiding the psychologist's fallacy. But capturing such data addresses only one side of the problem. Our ability to make sense of these data, to employ them in a reconstruction of the sensemaking processes of other people at another time and place, has not kept pace with our growing technical ability to register traces of their behavior. In other words, the presumed dominance of human factors in incidents and accidents is not matched by our ability to analyze or understand the human contribution for what it is worth.

Data used in accident analysis often come from a recording of human voices and perhaps other sounds (ruffling charts, turning knobs), which can be coupled to a greater or lesser extent with contemporaneous system or process behavior. A voice trace, however, represents only a partial data record. Human behavior in rich, unfolding settings is much more than the voice trace it leaves behind. The voice trace always points beyond itself, to a world that was unfolding around the practitioners at the time, to tasks, goals, perceptions, intentions, thoughts, and actions that have since evaporated. But most investigations are formally restricted in how they can couple the cockpit voice recording to the world that was unfolding around the practitioners (e.g., instrument indications, automation-mode settings). In aviation, for example, the International Civil Aviation Organization (ICAO, Annex 13) prescribes that only those data that can be factually established may be analyzed in the search for cause. This provision often leaves the cockpit voice recording as only a factual, decontextualized, and impoverished footprint of human performance. Making connections between the voice trace and the circumstances and people in which it was grounded quickly falls outside the pale of official analysis and into the realm of what many would call inference or speculation. This inability to make clear connections between behavior and world straitjackets any study of the human contribution to a cognitively noisy, evolving sequence of events. ICAO Annex 13 thus regulates the disembodiment of data: Data must be studied away from their context, for the context and the connections to it are judged as too tentative, too abstract, too unreliable. Such a provision, contradicted by virtually all cognitive psychological research, is devastating to our ability to make sense of puzzling performance.

Apart from the provisions of ICAO Annex 13, this problem is complicated by the fact that current flight-data recorders (FDRs) often do not capture many automation-related traces: precisely those data that are of immediate importance to understanding the problem-solving environment in which most pilots today carry out their work. For example, FDRs in many highly automated aircraft do not record which ground-based navigation beacons were selected by the pilots, which automation-mode control-panel selections on airspeed, heading, altitude, and vertical speed were made, or what was shown on both pilots' moving map displays. As operator work has shifted to the management and supervision of a suite of automated resources, and problems leading to accidents increasingly start in human-machine interactions, this represents a large gap in our ability to access the reasons for human assessments and actions in modern operational workplaces.
INVERTING PERSPECTIVES

Knowing about and guarding against the psychologist's fallacy, against the mixing of realities, is critical to understanding error. When looked at from the position of retrospective outsider, the error can look so very real, so compelling. They failed to notice, they did not know, they should have done this or that. But from the point of view of people inside the situation, as well as potential other observers, this same error is often nothing more than normal work. If we want to begin to understand why it made sense for people to do what they did, we have to reconstruct their local rationality. What did they know? What was their understanding of the situation? What were their multiple goals, resource constraints, pressures? Behavior is rational within situational contexts: People do not come to work to do a bad job. As historian Barbara Tuchman put it: "Every scripture is entitled to be read in the light of the circumstances that brought it forth. To understand the choices open to people of another time, one must limit oneself to what they knew; see the past in its own clothes, as it were, not in ours" (1981, p. 75).

This position turns the exigent social and operational context into the only legitimate interpretive device. This context becomes the constraint on what meaning we, who were not there when it happened, can now give to past controversial assessments and actions. Historians are not the only ones to encourage this switch, this inversion of perspectives, this persuasion to put ourselves in the shoes of other people. In hermeneutics it is known as the difference between exegesis (reading out of the text) and eisegesis (reading into the text). The point is to read out of the text what it has to offer about its time and place, not to read into the text what we want it to say or reveal now. Jens Rasmussen points out that if we cannot find a satisfactory answer to questions such as "how could they not have known?", then this is not because these people were behaving bizarrely. It is because we have chosen the wrong frame of reference for understanding their behavior (Vicente, 1999).
The frame of reference for understanding people's behavior is their own normal, individual work context, the context they are embedded in and from whose point of view the decisions and assessments made are mostly normal, daily, unremarkable, perhaps even unnoticeable. A challenge is to understand how assessments and actions that from the outside look like errors become neutralized or normalized, so that from the inside they appear unremarkable, routine, normal. If we want to understand why people did what they did, then the adequacy of the insider's representation of the situation cannot be called into question. The reason is that there are no objective features in the domain on which we can base such a judgment. In fact, as soon as we make such a judgment, we have imported criteria from the outside—from another time and place, from another rationality.

Ethnographers have always championed the point of view of the person on the inside. Like Rasmussen, Emerson advised that, instead of using criteria from outside the setting to examine mistake and error, we should investigate and apply local notions of competent performance that are honored and used in particular social settings (Vaughan, 1999). This excludes generic rules and motherhoods (e.g., "pilots should be immune to commercial pressures"). Such putative standards ignore the subtle dynamics of localized skills and priority setting, and run roughshod over what would be considered "good" or "competent" or "normal" from inside actual situations. Indeed, such criteria impose a rationality from the outside, impressing a frame of context-insensitive, idealized concepts of practice upon a setting where locally tailored and subtly adjusted criteria rule instead.

The ethnographic distinction between etic and emic perspectives was coined in the 1950s to capture the difference between how insiders view a setting and how outsiders view it. Emic originally referred to the language and categories used by people in the culture studied, whereas etic language and categories were those of outsiders (e.g., the ethnographer) based on their analysis of important distinctions. Today, emic is often understood to be the view of the world from the inside out, that is, how the world looks through the eyes of the person studied. The point of ethnography is to develop an insider's view of what is happening, an inside-out view. Etic is contrasted as the perspective from the outside in, where researchers or observers attempt to gain access to some portions of an insider's knowledge through psychological methods such as surveys or laboratory studies. Emic research considers the meaning-making activities of individual minds. It studies the multiple realities that people construct from their experiences with their empirical reality. It assumes that there is no direct access to a single, stable, and fully knowable external reality. Nobody has this access. Instead, all understanding of reality is contextually embedded and limited by the local rationality of the observer. Emic research points at the unique experience of each human, suggesting that any observer's way of making sense of the world is as valid as any other, and that there are no objective criteria by which this sensemaking can be judged correct or incorrect. Emic researchers resist distinguishing between objective features of a situation and subjective ones.
Such a distinction distracts the observer from the situation as it looked to the person on the inside, and in fact distorts this insider perspective. A fundamental concern is to capture and describe the point of view of people inside a system or situation, to make explicit that which insiders take for granted, see as common sense, find unremarkable or normal. When we want to understand error, we have to embrace ontological relativity not out of philosophical intransigence or philanthropy, but in order to get the inside-out view. We have to do this for the sake of learning what makes a system safe or brittle. As we saw in chapter 2, for example, the notion of what constitutes an incident (i.e., what is worthy of reporting as a safety threat) is socially constructed, shaped by history, institutional constraints, and cultural and linguistic notions. It is negotiated among insiders in the system. None of the structural measures an organization takes to put an incident-reporting system in place will have any effect if insiders do not see safety threats as incidents that are worth sending into the reporting system. Nor will the organization ever really improve reporting rates if it does not understand the notion of incident (and, conversely, the notion of normal practice) from the point of view of the people who do the work every day. To succeed at this, outsiders need to take the inside-out look; they need to embrace ontological relativity, as only this can crack the code to system safety and brittleness. All the processes that set complex systems onto their drifting paths toward failure—the conversion of signals of danger into normal, expected problems, the incrementalist borrowing from safety, the assumption that past operational success is a guarantee of future safety—are sustained through implicit social-organizational consensus, driven by insider language and rationalizations. The internal workings of these processes are simply impervious to outside inspection, and thereby numb to external pressure for change. Outside observers cannot attain an emic perspective, nor can they study the multiple rationalities created by people on the inside, if they keep seeing errors and violations.
Outsiders can perhaps get some short-term leverage by (re)imposing context-insensitive rules, regulations, or exhortations and making moral appeals for people to follow them, but the effects are generally short-lived. Such measures cannot be supported by operational ecologies. There, actual practice is always under pressure to adapt in an open system, exposed to pressures of scarcity and competition. It will once again inevitably drift into niches that generate greater operational returns at no apparent cost to safety.

ERROR AND (IR)RATIONALITY

Understanding error against the background of local rationality, or rationality for that matter, has not been an automatic by-product of studying the psychology of error. In fact, research into human error had a very rationalist bias up to the 1970s (Reason, 1990), and in some quarters in psychology and human factors such rationalist partiality has never quite disappeared. Rationalist means that mental processes can be understood with reference to normative theories that describe optimal strategies. Strategies may be optimal when the decision maker has perfect, exhaustive access to all relevant information, takes time enough to consider it all, and applies clearly defined goals and preferences to making the final choice. In such cases, errors are explained by reference to deviations from this rational norm, this ideal. If the decision turns out wrong, it may be because the decision maker did not take enough time to consider all information, or because he or she did not generate an exhaustive set of choice alternatives to pick from. Errors, in other words, are deviant. They are departures from a standard. Errors are irrational in the sense that they require a motivational (as opposed to cognitive) component in their explanation. If people did not take enough time to consider all information, it is because they could not be bothered to. They did not try hard enough, and they should try harder next time, perhaps with the help of some training or procedural guidance. Investigative practice in human factors is still rife with such rationalist reflexes.

It did not take long for cognitive psychologists to find out how humans could not, or should not, even behave like perfectly rational decision makers. Whereas economists clung to the normative assumptions of decision making (decision makers have perfect and exhaustive access to information for their decisions, as well as clearly defined preferences and goals about what they want to achieve), psychology, with the help of artificial intelligence, posited that there is no such thing as perfect rationality (i.e., full knowledge of all relevant information, possible outcomes, and relevant goals), because there is not a single cognitive system in the world (neither human nor machine) that has sufficient computational capacity to deal with it all. Rationality is bounded. Psychology subsequently started to chart people's imperfect, bounded, or local rationality. Reasoning, it discovered, is governed by people's local understanding, by their focus of attention, goals, and knowledge, rather than some global ideal. Human performance is embedded in, and systematically connected to, the situation in which it takes place: It can be understood (i.e., makes sense) with reference to that situational context, not by reference to some universal standard.
Human actions and assessments can be described meaningfully only in reference to the localized setting in which they are made; they can be understood by intimately linking them to details of the context that produced and accompanied them. Such research has given rationality an interpretive flexibility: What is locally rational does not need to be globally rational. If a decision is locally rational, it makes sense from the point of view of the decision maker, which is what matters if we want to learn about the underlying reasons for what from the outside looks like error. The notion of local rationality removes the need to rely on irrational explanations of error. Errors make sense: They are rational, if only locally so, when seen from the inside of the situation in which they were made. But psychologists themselves often have trouble with this. They keep on discovering biases and aberrations in decision making (e.g., groupthink, confirmation bias, routine violations) that seem hardly rational, even from within a situational context. These deviant phenomena require motivational explanations. They call for motivational solutions. People should be motivated to do the right thing, to pay attention, to double check. If they do not, then they should be reminded that it is their duty, their job. Notice how easily we slip back into prehistoric behaviorism: Through a modernist system of rewards and punishments (job incentives, bonuses, threats of retribution) we hope to mold human performance after supposedly fixed features of the world. That psychologists continue to insist on branding such action as irrational, referring it back to some motivational component, may be due to the limits of the conceptual language and power of the discipline. Putatively motivational issues (such as deliberately breaking rules) must themselves be put back into context, to see how human goals (getting the job done fast by not following all the rules to the letter) are made congruent with system goals through a collective of subtle pressures, subliminal messages about organizational preferences, and empirical success of operating outside existing rules.
The system wants fast turnaround times, maximization of capacity utilization, efficiency. Given those system goals (which are often kept implicit), rule breaking is not a motivational shortcoming, but rather an indication of a well-motivated human operator: Personal goals and system goals are harmonized, which in turn can lead to total system goal displacement: Efficiency is traded off against safety. But psychology often keeps seeing motivational shortcomings. And human factors keeps suggesting countermeasures such as injunctions to follow the rules, better training, or more top-down task analysis. Human factors has trouble incorporating the subtle but powerful influences of organizational environments, structures, processes, and tasks into accounts of individual cognitive practices. In this regard the discipline is conceptually underdeveloped. Indeed, how unstated cultural norms and values travel from the institutional, organizational level to express themselves in individual assessments and actions (and vice versa) is a concern central to sociology, not human factors. Bridging this macro-micro connection in the systematic production of rule violations means understanding the dynamic interrelationships between issues as wide-ranging as organizational characteristics and preferences, its environment and history; incrementalism in trading safety off against production; unintentional structural secrecy that fragments problem-solving activities across different groups and departments; patterns and representations of safety-related information that are used as imperfect input to organizational decision making; the influence of hierarchies and bureaucratic accountability on people's choices; and others (e.g., Vaughan, 1996, 1999). The structuralist lexicon of human factors and system safety today has no words for many of these concepts, let alone models for how they go together.
From Decision Making to Sensemaking
In another move away from rationalism and toward the inversion of perspectives (i.e., trying to understand the world the way it looked to the decision maker at the time), large swathes of human factors have embraced the ideas of naturalistic decision making (NDM) over the last decade. By importing cyclical ideas about cognition (situation assessment informs action, which changes the situation, which in turn updates assessment; Neisser, 1976) into a structuralist, normativist psychological lexicon, NDM virtually reinvented decision making (Orasanu & Connolly, 1993). The focus shifted from the actual decision moment, back into the preceding realm of situation assessment. This shift was accompanied by a methodological reorientation, where decision making and decision makers were increasingly studied in their complex, natural environments. Real decision problems, it quickly turned out, resist the rationalistic format dictated for so long by economics: Options are not enumerated exhaustively, access to information is incomplete at best, and people spend more time assessing and measuring up situations than making decisions—if that is indeed what they do at all (Klein, 1998). In contrast to the prescriptions of the normative model, decision makers tend not to generate and evaluate several courses of action concurrently, in order to then determine the best choice.
People do not typically have clear or stable sets of preferences along which they can even rank the enumerated courses of action, picking the best one, nor do most complex decision problems actually have a single correct answer. Rather, decision makers in action tend to generate a single option at a time, mentally simulate whether this option would work in practice, and then either act on it, or move on to a new line of thought. NDM also takes the role of expertise more seriously than previous decision-making paradigms: What distinguishes good decision makers from bad decision makers most is their ability to make sense of situations by using a highly organized experience base of relevant knowledge. Once again neatly folding into ideas developed by Neisser, such reasoning about situations is more schema-driven, heuristic, and recognitional than it is computational. The typical naturalistic decision setting does not allow the decision maker enough time or information to generate perfect solutions with perfectly rational calculations. Decision making in action calls for judgments under uncertainty, ambiguity, and time pressure. In those settings, options that appear to work are better than perfect options that never get computed. The same reconstructive, corrective intervention into history that produces our clear perceptions of errors also generates discrete decisions. What we see as decision making from the outside is action embedded in larger streams of practice, something that flows forth naturally and continually from situation assessment and reassessment. Contextual dynamics are a joint product of how problems in the world are developing and the actions taken to do something about them. Time becomes nonlinear: Decision and action are interleaved rather than temporally segregated. The decision maker is thus seen as in step with the continuously unfolding environment, simultaneously influenced by it and influencing it through his or her next steps. Understanding decision making, then, requires an understanding of the dynamics that lead up to those supposed decision moments, because by the time we get there, the interesting phenomena have evaporated, gotten lost in the noise of action.
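As a rough illustration of the single-option cycle just described (generate a candidate from experience, mentally simulate it, act on it or move on), here is a minimal, self-contained sketch. It is not an implementation of any published NDM or recognition-primed decision model; the toy experience base and the simple cue-matching stand-ins for recognition and mental simulation are assumptions made for illustration only.

```python
# A minimal sketch of the single-option, simulate-then-act cycle described above.
# The "experience base" and the cue-matching used for recognition and mental
# simulation are hypothetical stand-ins, not a published NDM model.

def naturalistic_decision_cycle(cues, experience_base):
    """Return the first recognized option whose quick mental simulation looks workable."""
    # Recognition: retrieve candidates one at a time, ordered by how typical the
    # current cues look, rather than enumerating and comparing all options at once.
    candidates = sorted(experience_base,
                        key=lambda p: -len(cues & p["typical_cues"]))
    for pattern in candidates:
        # Mental simulation, here reduced to a simple check against known show-stoppers.
        if not (cues & pattern["contraindications"]):
            return pattern["action"]      # act on the first option that seems to work
        # otherwise move on to a new line of thought
    return None                           # nothing workable yet; keep assessing


# Hypothetical toy example: a crew sizing up an unfamiliar taxi situation.
experience_base = [
    {"action": "follow the assigned taxiway",
     "typical_cues": {"clear signage"}, "contraindications": {"chart mismatch"}},
    {"action": "stop and ask air traffic control",
     "typical_cues": {"chart mismatch", "poor visibility"}, "contraindications": set()},
]
print(naturalistic_decision_cycle({"chart mismatch", "snow"}, experience_base))
# -> stop and ask air traffic control
```

The point of the sketch is only the shape of the cycle: one option at a time, judged against the situation at hand, with no exhaustive comparison of alternatives.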
NDM research is front-loaded: it studies the front end of decision making, rather than the back end. It is interested, indeed, in sensemaking more than in decision making. Removing decision making from the vocabulary of human factors investigations is the logical next step, suggested by Snook (2000). It would be an additional way to avoid counterfactual reasoning and judgmentalism, as decisions that eventually led up to a bad outcome all too quickly become bad decisions:
Framing such tragedies as decisions immediately focuses our attention on an individual making choices . . . such a framing puts us squarely on a path that leads straight back to the individual decision maker, away from the potentially powerful contextual features and right back into the jaws of the fundamental attribution error. "Why did they decide . . . ?" quickly becomes "Why did they make the wrong decision?" Hence, the attribution falls squarely onto the shoulders of the decision maker and away from potent situational factors that influence action. Framing the . . . puzzle as a question of meaning rather than deciding shifts the emphasis away from individual decision makers toward a point somewhere "out there" where context and individual action overlap. (Snook, p. 206)
Yet sensemaking is not immune to counterfactual pressure either. If what made sense to the person inside the situation still makes no sense given the outcome, then human factors hastens to point that out (see chap. 5). Even in sensemaking, the normativist bias is an ever-present risk.
THE HINDSIGHT BIAS IS NOT A BIAS AND IS NOT ABOUT THE PAST
Perhaps the pull in the direction of the position of retrospective outsider is irresistible, inescapable, whether we make lexical adjustments in our investigative repertoire or not. Even with the potentially judgmental notion of decision making removed from the forensic psychological toolbox, it remains incredibly difficult to see the past in its own clothes, not in ours. The fundamental attribution error is alive and well, as Scott Snook puts it (2000, p. 205). We blame the human in the loop and underestimate the influence of context on performance, despite repeated warnings of this frailty in our reasoning. Perhaps we are forever unable to shed our own projection of reality onto the circumstances of people at another time and place. Perhaps we are doomed to digitizing past performance, chunking it up into discrete decision moments that inevitably lure us into counterfactual thinking and judgments of performance instead of explanations. Just as any act of observation changes the observed, our very observations of the past inherently intervene in reality, converting complex histories into more linear, more certain, and disambiguated chronicles. The mechanisms described earlier in this chapter may explain how hindsight influences our treatment of human performance data, but they hardly explain why. They hardly shed light on the energy behind the continual pull toward the position of retrospective outsider; they merely sketch out some of the routes that lead to it. In order to explain failure, we seek failure. In order to explain missed opportunities and bad choices, we seek flawed analyses, inaccurate perceptions, violated rules—even if these were not thought to be influential or obvious or even flawed at the time (Starbuck & Milliken, 1988). This search for failures is something we cannot seem to escape.
It is enshrined in the accident models popular in transportation human factors of our age (see chaps. 1 and 2) and proliferated in the fashionable labels for "human error" that human factors keeps inventing (see chap. 6). Even where we turn away from the etic pitfalls of looking into people's decision making, and focus on a more benign, emic, situated sensemaking, the rationalist, normativist perspective is right around the corner. If we know the outcome was bad, we can no longer look objectively at the behavior leading up to it: that behavior, too, must have been bad (Fischhoff, 1975). To get an idea, think of the Greek mythological figure Oedipus, who shared Jocasta's bed. How large is the difference between Oedipus' memory of that experience before and after he found out that Jocasta was his mother? Once he knew, it was simply impossible for him to look at the experience the same way. What had he missed? Where did he not do his homework? How could he have become so distracted? Outcome knowledge afflicts all retrospective observers, no matter how hard we try not to let it influence us. It seems that bad decisions always have something in common, and that is that they all seemed like a good idea at the time. But try telling that to Oedipus.
The Hindsight Bias Is an Error That Makes Sense, Too
When a phenomenon is so impervious to external pressure to change, one would begin to suspect that it has some adaptive value, that it helps us preserve something, helps us survive. Perhaps the hindsight bias is not a bias, and perhaps it is not about history. Instead, it may be a highly adaptive, forward-looking, rational response to failure. This putative bias may be more about predicting the future than about explaining the past.
The linearization and simplification that we see happen in the hindsight bias may be a form of abstraction that allows us to export and project our and others' experiences onto future situations. Future situations can never be predicted at the same level of contextual detail as the new view encourages us to explain past situations. Predictions are possible only because we have created some kind of model for the situation we wish to gain control over, not because we can exhaustively foresee every contextual factor, influence, or data point. This model—any model—is an abstraction away from context, an inherent simplification. The model we create naturally, effortlessly, automatically after past events with a bad outcome inevitably becomes a model of binary choices, bifurcations, and unambiguous decision moments. That is the only useful kind of model we can take with us into the future if we want to guard against the same type of pitfalls and forks in the road. The hindsight bias, then, is about learning, not about explaining. It is forward-looking, not backward-looking. This applies to ourselves and our own failures as much as it applies to our observations of other people's failures. When confronted by failures that occurred to other people, we may imperatively be tripped into vicarious learning, spurred by our own urge for survival: What do I do to keep that from happening to me? When confronted by our own performance, we have no privileged insight into our own failures, even if we would like to think we do. The past is the past, whether it is our own or somebody else's. Our observations of the past inevitably intervene and change the observed, no matter whose past it is. This is something that the fundamental attribution error cannot account for. It explains how we overestimate the influence of stable, personal characteristics when we look at other people's failings. We underplay the influence of context or situational factors when others do bad things. But what about our own failings? Even here we are susceptible to reframing past complexity as simple binary decisions, wrong decisions due to personal shortcomings: things we missed, things we should have done or should not have done. Snook (2000) investigated how, in the fog of post-Gulf War Iraq, two helicopters carrying U.N. peacekeepers were shot down by American fighter jets. The situation in which the shoot-down occurred was full of risk, role ambiguity, operational complexity, resource pressure, and slippage between plans and practice. Yet immediately after the incident, all of this gets converted into binary simplicity (a choice to err or not to err) by DUKE—the very command onboard the airborne control center whose job it was not to have such things happen. Allowing the fighters to shoot down the helicopters was their error, yet they do not blame context at all, as the fundamental attribution error predicts they should. It was said of the DUKE that immediately after the incident: "he hoped we had not shot down our own helicopters and that he couldn't believe anybody could make that dumb a mistake" (Snook, p. 205). It is DUKE himself who blames his own dumb mistake. As with the errors in chapter 3, the dumb mistake is something that jumps into view only with knowledge of outcome, its mistakeness a function of the outcome, its dumbness a function of the severity of the consequences. While doing the work, helping guide the fighters, identifying the targets, all DUKE was doing was his job. It was normal work.
He was not sitting there making dumb mistakes. They are a product of hindsight, his own hindsight, directed at his own "mistakes." The fundamental attribution error does not apply. It is overridden. The fighter pilots, too, engage in self-blame, literally converting the ambiguity, risk, uncertainty, and pressure of their encounter with potentially hostile helicopters into a linear series of decision errors, where they repeatedly and consistently took wrong turns on their road to perdition (we misidentified, we engaged, and we destroyed): "Human error did occur. We misidentified the helicopters; we engaged them; and we destroyed them. It was a tragic and fatal mistake" (Tiger 02, quoted in Snook, 2000, p. 205). Again, the fundamental attribution error makes the wrong prediction. If it were true, then these fighter pilots would tend to blame context for their own errors. Indeed, it was a rich enough context—fuzzy, foggy, dangerous, multi-player, pressurized, risky—with plenty of blameworthy factors to go around, if that is where you would look. Yet these fighter pilots do not. "We" misidentified, "we" engaged, "we" destroyed. The pilots had the choice not to; in fact, they had a series of three choices not to instigate a tragedy. But they did. Human error did occur. Of course, elements of self-identity and control are wrapped up in such an attribution, a self-identity for which fighter pilots may well be poster children. It is interesting to note that the tendency to convert past complexity into binary simplicity—into twofold choices to identify correctly or incorrectly, to engage or not, to destroy or not—overrides the fundamental attribution error. This confirms the role of the hindsight bias as a catalyst for learning. Learning (or having learned) expresses itself most clearly by doing something differently in the future, by deciding or acting differently, by removing one's link in the accident chain, as fighter pilot Tiger 02 put it: "Remove any one link in the chain and the outcome would be entirely different. I wish to God I could go back and correct my link in this chain, my actions which contributed to this disaster" (Tiger 02, quoted in Snook, 2000, p. 205). We cannot undo the past. We can only undo the future.
But undoing the future becomes possible only when we have abstracted away past failures, when we have decontextualized them, stripped them, cleaned them from the fog and confusion of past contexts, highlighted them, blown them up into obvious choice moments that we, and others, had better get right next time around. Prima facie, the hindsight bias is about misassessing the contributions of past failings to bad outcomes. But if the phenomenon is really as robust as it is documented to be and if it actually manages to override the fundamental attribution error, it is probably the expression of more primary mechanisms running right beneath its surface. The hindsight bias is a meaningful adaptation. It is not about explaining past failures. It is about preventing future ones. In preparing for future confrontations with situations where we or others might err again, and do not want to, we are in some sense taking refuge from the banality of accidents thesis. The thought that accidents emerge from murky, ambiguous, everyday decision making renders us powerless to do anything meaningful about it. This is where the hindsight bias is so fundamentally adaptive. It highlights for us where we could fix things (or where we think we could fix things), so that the bad thing does not happen again. The hindsight bias is not a bias at all, in the sense of a departure from some rational norm. The hindsight bias is rational. It in itself represents and sustains rationality. We have to see the past as a binary choice, or a linear series of binary choices, because that is the only way we can have any hope of controlling the future. There is no other basis for learning, for adapting. Even if those adaptations may consist of rather coarse adjustments, of undamped and overcontrolling regulations, even if these adaptations occur at the cost of making oversimplified predictions. But making oversimplified predictions of how to control the future is apparently better than having no predictions at all. Quite in the spirit of Saint Augustine, we accept the reality of errors, and the guilt that comes with it, in the quest for control over our futures. Indeed, the human desire to attain control over the future surely predates the Scientific Revolution. The more refined and empirically testable tools for gaining such control, however, were profoundly influenced and extended by it. Control could best be attained through mechanization and technology, away from nature and spirit, away from primitive incantations to divine powers to spare us the next disaster. These Cartesian-Newtonian reflexes have tumbled down the centuries to proffer human factors legitimate routes for gaining control over complex, dynamic futures today and tomorrow. For example, when we look at the remnants of a crashed automated airliner, we, in hindsight, exclaim, "they should have known they were in open descent mode!" The legitimate solution for meeting such technology surprises is to throw more technology at the problem (additional warning systems, paperless cockpits, automatic cocoons). But more technology often creates more problems of a kind we have a hard time anticipating, rather than just solving existing ones. As another example, take the error-counting methods discussed in chapter 3. A more formalized way of turning the hindsight bias into an oversimplified, forward-looking future controller is hardly imaginable.
Errors, which are uniquely the product of retrospective observations conducted from the outside, are measured, categorized, and tabulated. This produces bar charts that putatively point toward a future, jutting their dire predictions of rule violations or proficiency errors out into a dark and fearful time to come, away from a presumed "safe" baseline. It is normativism in pretty forms and colors. These forecasting techniques, which are merely an assignment of categories and numbers to the future, are appearing everywhere. However, their categorical and numerical output can at best be as adequate or as inadequate as the input. Using such forecasts as a strategic tool is only a belief that numbers are meaningful in relation to the fearful future. Strategy becomes a matter of controlling the future by labelling it, rather than continually re-evaluating the uncertain situation. This approach, searching for the right and numerical label to represent the future, is more akin to numerology or astrology. It is the modern-day ritual equivalent of "reading the runes" or "divining the entrails." (Angell & Straub, 1999, p. 184) Human factors holds on to the belief that numbers are meaningful in relation to a fearful future. And why not? Measuring the present and mathematically modeling it (with bar charts, if you must), and thereby predicting and controlling the future has been a legitimate pursuit since at least the 16th century. But as chapters 1, 2, and 3 show, such pursuits are getting to be deeply problematic. In an increasingly complex, dynamic sociotechnical world, their predictive power is steadily eroding. It is not only a problem of garbage in, garbage out (the categorical and numerical output is as adequate or inadequate as the input). Rather, it is the problem of not seeing that we face an uncertain situation in the first place, where mistake, failure, and disaster are incubated in systems much larger, much less transparent, and much less deterministic than the counters of individual errors believe. This, of course, is where the hindsight bias remains a bias.
But it is a bias about the future, not about the past. We are biased to believe that thinking about action in terms of binary choices will help us undo bad futures, that it prepares us sufficiently for coming complexity. It does not. Recall how David Woods (2003) put it: Although the past is incredible (DUKE couldn't believe anybody could make that dumb a mistake), the future is implausible. Mapping digitized historic lessons of failure (which span the arc from error bar charts to Tiger 02's wish to undo his link in the chain) into the future will only be partly effective. Stochastic variation and complexity easily outrun our computational capacity to predict with any accuracy.
PRESERVATION OF SELF- AND SYSTEM IDENTITY
There is an additional sense in which our dealings with past failures go beyond merely understanding what went wrong and preventing recurrence. Mishaps are surprising relative to prevailing beliefs and assumptions about the system in which they happen, and investigations are inevitably affected by the concern to reconcile a disruptive event with existing views and beliefs about the system. Such reconciliation is adaptive too. Our reactions to failure, and our investigations into failure, must be understood against the backdrop of the "fundamental surprise error" (Lanir, 1986) and examined for the role they play in it. Accidents tend to create a profound asymmetry between our beliefs (or hopes) of a basically safe system, and new evidence that may suggest that it is not. This is the fundamental surprise: the astonishment that we feel when the most basic assumptions we held true about the world may turn out to be untrue. The asymmetry creates a tension, and this tension creates pressure for change: Something will have to give. Either the belief needs changing (i.e., we have to acknowledge that the system is not basically safe—that mistake, mishap, and disaster are systematically organized by that system; Vaughan, 1996), or we change the people involved in the mishap, even if this means us. We turn them into unrepresentative, uniquely bad individuals:
The pilots of a large military helicopter that crashed on a hillside in Scotland in 1994 were found guilty of gross negligence. The pilots did not survive—29 people died in total—so their side of the story could never be heard. The official inquiry had no problems with "destroying the reputation of two good men," as a fellow pilot put it. Potentially fundamental vulnerabilities (such as 160 reported cases of Uncommanded Flying Control Movement or UFCM in computerized helicopters alone since 1994) were not looked into seriously. (Dekker, 2002, p. 25)
When we elect to "destroy the reputation of two good men," we have committed the fundamental surprise error. We have replaced a fundamental challenge to our assumptions, our beliefs (the fundamental surprise) with a mere local one: The pilots were not as good as we thought they were, or as good as they should have been. From astonishment (and its concomitant: fear about the basic safety of the system, as would be raised by 160 cases of UFCM) we move to mere, local wonder: How could they not have seen the hill? They must not have been very good pilots after all. Thus we strive to preserve our self- and system identity. We pursue an adaptive strategy of safeguarding the essence of our world as we understand it.
By letting the reputation of individual decision makers slip, we have relieved the tension between broken beliefs (the system is not safe after all) and fervent hopes that it still is. That phenomena such as the hindsight bias and the fundamental attribution error may not be primary, but rather ancillary expressions of more adaptive, locally rational, and useful identity-preserving strategies for the ones committing them, is consonant with observations of a range of reasoning errors. People keep committing them not because they are logical (i.e., globally rational) or because they only produce desired effects, but because they serve an even weightier purpose: "This dynamic, this 'striving to preserve identity,' however strange the means or effects of such striving, was recognised in psychiatry long ago. [This phenomenon] is seen not as primary, but as attempts (however misguided) at restitution, at reconstructing a world reduced by complete chaos" (Sacks, 1998, p. 7). However "strange the means or effects of such striving," the fundamental surprise error allows us to rebuild a world reduced by chaos. And the hindsight bias allows us to predict and avoid future roads to perdition. Through the fundamental surprise error, we rehabilitate our faith in something larger than ourselves, something in which we too are vulnerable to breakdown, something that we too are at the mercy of in varying degrees. Breaking out of such locally rational reasoning, where the means and consequences of our striving for preservation and rehabilitation create strange and undesirable side effects (blaming individuals for system failures, not learning from accidents, etc.) requires extraordinary courage. It is not very common. Yet people and institutions may not always commit the fundamental surprise error, and may certainly not do so intentionally. In fact, in the immediate aftermath of failure, people may be willing to question their underlying assumptions about the system they use or operate.
Perhaps things are not as safe as previously thought; perhaps the system contains vulnerabilities and residual weaknesses that could have spawned this kind of failure earlier, or worse, could do it again. Yet such openness does not typically last long. As the shock of an accident subsides, parts of the system mobilize to contain systemic self-doubt and change the fundamental surprise into a merely local hiccup that temporarily ruffled an otherwise smooth operation. The reassurance is that the system is basically safe. It is only some people or other parts in it that are unreliable. In the end, it is not often that an existing view of a system gives in to the reality of failure. Instead, to redress the asymmetry, the event or the players in it are changed to fit existing assumptions and beliefs about the system, rather than the other way around. Expensive lessons about the system as a whole, and the subtle vulnerabilities it contains, can go completely unlearned. Our inability to deal with the fundamental surprise of failure shines through the investigations we commission. The inability to really learn is sometimes legitimized and institutionalized through resource-intensive official investigations. The cause we end up attributing to an accident may sometimes be no more than the "cause" we can still afford, given not just our financial resources, but also our complex of hopes and beliefs in a safe and fair world. As Perrow (1984) has noted:
Formal accident investigations usually start with an assumption that the operator must have failed, and if this attribution can be made, that is the end of serious inquiry. Finding that faulty designs were responsible would entail enormous shutdown and retrofitting costs; finding that management was responsible would threaten those in charge, but finding that operators were responsible preserves the system, with some soporific injunctions about better training. (p. 146)
Real change in the wake of failure is often slow to come. Few investigations have the courage to really challenge our beliefs. Many keep feeding the hope that the system is still safe—except for this or that little broken component, or this or that Bad Apple. The lack of courage shines through how we deal with human error, through how we react to failure. It affects the words we choose, the rhetoric we rely on, the pathways for "progress" we put our bets on. Which cause can we afford? Which cause renders us too uncomfortable? Accuracy is not the dominant criterion, but plausibility is: plausibility from the perspective of those who have to accommodate the surprise that the failure represents for them and their organization, their worldview. Asking "Is it plausible?" is the same as asking, "Can we live (on) with this explanation? Does this explanation help us come to terms with the puzzle of bad performance?" Answering this question, and generating such comfort and self-assurance, is one purpose that our analyses of past failures have to fulfill, even if they become a selective oversimplification because of it. Even if, in the words of Karl Weick (1995), they make lousy history.
Chapter 5. If You Lose Situation Awareness, What Replaces It?
The hindsight bias has ways of getting entrenched in human factors thinking. One such way is our vocabulary. "Losing situation awareness" or "deficient situation awareness" have become legitimate characterizations of cases where people did not exactly know where they were or what was going on around them.
In many applied as well as some scientific settings, it is acceptable to submit "loss of situation awareness" as an explanation for why people ended up where they should not have ended up, or why they did what they, in hindsight, should not have done. Navigational incidents and accidents in transportation represent one category of cases where the temptation to rely on situation awareness as an elucidatory construct appears irresistible. If people end up where they should not, or where they did not intend to end up, it is easy to see that as a deficient awareness of the cues and indications around them. It is easy to blame a loss of situation awareness. One such accident happened to the Royal Majesty, a cruise ship that was sailing from Bermuda to Boston in the summer of 1995. It had more than 1,000 people onboard. Instead of Boston, the Royal Majesty ended up on a sandbank close to the Massachusetts shore. Without the crew noticing, it had drifted 17 miles off course during a day and a half of sailing (see Fig. 5.1).
FIG. 5.1. The difference between where the Royal Majesty crew thought they were headed (Boston) and where they actually ended up: a sandbank near Nantucket.
Investigators discovered afterward that the ship's autopilot had defaulted to DR (Dead Reckoning) mode (from NAV, or Navigation mode) shortly after departure. DR mode does not compensate for the effects of wind and other drift (waves, currents), which NAV mode does. A northeasterly wind pushed the ship steadily off its course, to the side of its intended track. The U.S. National Transportation Safety Board investigation into the accident judged that "despite repeated indications, the crew failed to recognize numerous opportunities to detect that the vessel had drifted off track" (NTSB, 1995, p. 34). But "numerous opportunities" to detect the nature of the real situation become clear only in hindsight.
With hindsight, once we know the outcome, it becomes easy to pick out exactly those clues and indications that would have shown people where they were actually headed. If only they had focused on this piece of data, or put less confidence in that indication, or had invested just a little more energy in examining this anomaly, then they would have seen that they were going in the wrong direction. In this sense, situation awareness is a highly functional or adaptive term for us, for those struggling to come to terms with the rubble of a navigational accident. Situation awareness is a notation that assists us in organizing the evidence available to people at the time, and can provide a starting point for understanding why this evidence was looked at differently, or not at all, by those people. Unfortunately, we hardly ever push ourselves to such understanding. Loss of situation awareness is accepted as sufficient explanation too quickly, too often, and in those cases amounts to nothing more than saying human error under a fancy new label. The kinds of notations that are popular in various parts of the situation-awareness literature are one indication that we quickly stop investigating, researching any further, once we have found human error under that new guise. Venn-type diagrams, for example, can point out the mismatch between actual and ideal situation awareness. They illustrate the difference between what people were aware of in a particular situation, and what they could (or should) ideally have been aware of (see Fig. 5.2). Once we have found a mismatch between what we now know about the situation (the large circle) and what people back then apparently knew about it (the small one), that in itself is explanation enough. They did not know, but they could or should have known. This does not apply only in retrospect, by the way. Even design problems can be clarified through this notation, and performance predictions can be made on its basis. When the aim is designing for situation awareness, the Venn diagram can show what people should pick up in a given setting, versus what they are likely to actually pick up. In both cases, awareness is a relationship between that which is objectively available to people in the outside world on the one hand, and what they take in, or understand about it, on the other. Terms such as deficient situation awareness or loss of situation awareness confirm human factors' dependence on a kind of subtractive model of awareness. The Venn diagram notation can also be expressed in an equation that reflects this:
Loss of SA = f(large circle - small circle) (1)
In Equation 1, "loss of SA" equals "deficient SA" and SA stands for situation awareness. This also reveals the continuing normativist bias in our understanding of human performance. Normativist theories aim to explain mental processes by reference to ideals or normative standards that describe optimal strategies. The large Venn circle is the norm, the standard, the ideal. Situation awareness is explained by reference to that ideal: Actual situation awareness is a subtraction from that ideal, a shortfall, a deficit, indeed, a "loss." Equation 1 then becomes Equation 2:
Loss of SA = f(what I know now - what you knew then) (2)
Loss of situation awareness, in other words, is the difference between what I know about a situation now (especially the bits highlighted by hindsight) and what somebody else apparently knew about that situation then. Interestingly, situation awareness is nothing by itself, then.
It can only be expressed as a relative, normativist function, for example, the difference between what people apparently knew back then versus what they could or should have known (or what we know now). Other than highlighting the mismatch between the large and the little circle in a Venn diagram, other than revealing the elements that were not seen but could or should have been seen and understood, there is little there. Discourse and research around situation awareness may so far have shed little light on the actual processes of attentional dynamics. Rather, situation awareness has given us a new normativist lexicon that provides large, comprehensive notations for perplexing data (How could they not have noticed? Well, they lost SA). Situation awareness has legitimized the proliferation of the hindsight bias under the pretext of adding to the knowledge base. None of this may really help our understanding of awareness in complex dynamic situations, instead truncating deeper research, and shortchanging real insight.
THE MIND-MATTER PROBLEM
Discourse about situation awareness is a modern installment of an ancient debate in philosophy and psychology, about the relationship between matter and mind. Surely one of the most vexing problems, the coupling between matter and mind, has occupied centuries of thinkers. How is it that we get data from the outside world inside our minds? What are the processes by which this happens? And how can the products of these processes be so divergent (I see other things than you, or other things than I saw yesterday)? All psychological theories, including those of situation awareness, implicitly or explicitly choose a position relative to the mind-matter problem. Virtually all theories of situation awareness rely on the idea of correspondence: a match, or correlation, between an external world of stimuli (elements) and an internal world of mental representations (which gives
meaning to those stimuli). The relationship between matter and mind, in other words, is one of letting the mind create a mirror, a mental simile, of matter on the outside. This allows a further elaboration of the Venn diagram notation: Instead of "ideal" versus "actual" situation awareness, the captions of the circles in the diagram could read "matter" (the large circle) and "mind" (the little circle). Situation awareness is the difference between what is out there in the material world (matter) and what the observer sees or understands (mind). Equation 1 can be rewritten as Equation 3:
Loss of SA = f(matter - mind) (3)
Equation 3 describes how a loss of situation awareness, or deficient situation awareness, is a function of whatever was in the mind subtracted from the matter that was available. The portion of matter that did not make it into mind is lost; it is deficient awareness. Such thinking, of course, is profoundly Cartesian. It separates mind from matter as if they are distinct entities: a res cogitans and a res extensa. Both exist as separate essentials of our universe, and one serves as the echo, or imitation, of the other. One problem of such dualism lies, of course, in the assumptions it makes. An important assumption is what Feyerabend called the "autonomy principle" (see chap. 3): that facts exist in some objective world, equally accessible to all observers. The autonomy principle is what allows researchers to draw the large circle of the Venn diagram: It consists of matter available to, as well as independent from, any observer, whose awareness of that matter takes the form of an internal simile. This assumption is heavily contested by radical empiricists. Is there really a separation between res extensa and res cogitans? Can we look at matter as something "out there," as something independent of (the minds of) observers, as something that is open to enter the awareness of anyone? If the autonomy principle is right, then deficient situation awareness is a result of the actual SA Venn circle being too small, or misdirected relative to the large (ideal SA) Venn circle. But if you lose situation awareness, what replaces it? No theories of cognition today can easily account for a mental vacuum, for empty-headedness. Rather, people always form an understanding of the situation unfolding around them, even if this understanding can, in hindsight, be shown to have diverged from the actual state of affairs. This does not mean that this mismatch has any relevance in explaining human performance at the time. For people doing work in a situated context, there is seldom a mismatch, if ever. Performance is driven by the desire to construct plausible, coherent accounts, a good story of what is going on. Weick (1995) reminded us that such sensemaking is not about accuracy, about achieving an accurate mapping between some objective, outside world and an inner representation of that world. What matters to people is not to produce a precise internalized simile of an outside situation, but to account for their sensory experiences in a way that supports action and goal achievement. This converts the challenge of understanding situation awareness. From studying the mapping accuracy between an external and internal world, it requires the investigation of why people thought they were in the right place, or had the right assessment of the situation around them. What made that so?
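Read literally, Equations 1 through 3 treat awareness as a subtraction of one set of elements from another. The short sketch below (Python, with hypothetical element names) only illustrates what that subtractive notation amounts to; it is not a recommended way of measuring awareness, and the elements are placeholders, not items from any investigation.

```python
# A literal rendering of the subtractive model in Equations 1-3: "loss of SA" as the
# set difference between what was objectively available (matter, the large Venn circle)
# and what the person noticed or understood (mind, the small circle).
# The element names below are hypothetical placeholders.

ideal_sa = {"indication_A", "indication_B", "indication_C"}   # the large circle
actual_sa = {"indication_B"}                                  # the small circle

loss_of_sa = ideal_sa - actual_sa   # Equation 3, f(matter - mind), read as set subtraction
print(sorted(loss_of_sa))           # -> ['indication_A', 'indication_C']
```

Note that the subtraction says nothing about why the smaller set looked complete and coherent from the inside, which is precisely the question the antidualist position pursues.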
The adequacy or accuracy of an insider's representation of the situation cannot be called into question: It is what counts for him or her, and it is that which drives further action in that situation. The internal, subjective world is the only one that exists. If there is an objective, external reality, we could not know it.
Getting Lost at the Airport
Let us now turn to a simple case, to see how these things play out. Runway incursions (aircraft taxiing onto runways for which they did not have a clearance) are an acute category of such cases in transportation today. Runway incursions are seen as a serious and growing safety problem worldwide, especially at large, controlled airports (where air-traffic control organizes and directs traffic movements). Hundreds of incursions occur every year, some leading to fatal accidents. Apart from straying onto a runway without a clearance, the risk of colliding with something else at an airport is considerable. Airports are tight and dynamic concentrations of cars, buses, carts, people, trucks, trucks plus aircraft, and aircraft, all moving at speeds varying from a few knots to hundreds of knots. (And then fog can settle over all of that.) The number of things to hit is much larger on the ground than it is in the air, and the proximity to those things is much closer. And because of the layout of taxiways and ramps, navigating an aircraft across an airport can be considerably more difficult than navigating it in flight. When runway incursions occur, it can be tempting to blame a loss of situation awareness. Here is one such case, not a runway incursion, but a taxiway incursion. This case is illustrative not only because it is relatively simple, but also because all regulations had been followed in the design and layout of the airport. Safety cases had been conducted for the airport, and it had been certified as compliant with all relevant rules. Incidents in such an otherwise safe system can happen even when everybody follows the rules.
This incident happened at Stockholm Arlanda (the international airport) in October 2002. A Boeing 737 had landed on runway 26 (the opposite of runway 08, which can be seen in Fig. 5.3) and was directed by air traffic control to taxi to its gate via taxiways ZN and Z (called "Zulu November" and "Zulu," in aviation speak). The reason for taking ZN was that a tow truck with an aircraft was coming from the other direction. It had been cleared to use ZP (Zulu Papa) and then to turn right onto taxiway X (X-ray). But the 737 did not take ZN. To the horror of the tow truck driver, it carried on following ZP instead, almost straight into the tow truck. The pilots saw the truck in time, however, and managed to stop. The truck driver had to push his aircraft backward in order to clear up the jam. Did the pilots lose situation awareness? Was their situation awareness deficient? There were signs pointing out where taxiway ZN ran, and those could be seen from the cockpit. Why did the crew not take these cues into account when coming off the runway? Such questions consistently pull us toward the position of retrospective outsider, looking down onto the developing situation from a God's-eye point of view. From there we can see the mismatch grow between where people were and where they thought they were. From there we can easily draw the circles of the Venn diagram, pointing out a deficiency or a shortcoming in the awareness of the people in question. But none of that explains much. The mystery of the matter-mind problem is not going to go away just because we say that other people did not see what we now know they should have seen. The challenge is to try to understand why the crew of the 737 thought that they were right—that they were doing exactly what air-traffic control had told them to do: follow taxiway ZN to Z. The commitment of an antidualist position is to try to see the world through the eyes of the protagonists, as there is no other valid perspective. The challenge with navigational incidents, indeed, is not to point out that people were not in the spot they thought they were, but to explain why they thought they were right. The challenge is to begin to understand on the basis of what cues people (thought) they knew where they were. The first clue can be found in the response of the 737 crew after they had been reminded by the tower to follow ZN (they had now stopped, facing the tow truck head-on). "Yeah, it's the chart here that's a little strange," said one of the pilots (Statens Haverikommision, 2003, p. 8). If there was a mismatch, it was not between the actual world and the crew's model of that world. Rather, there was a mismatch between the chart in the cockpit and the actual airport layout. As can be seen in Fig. 5.3, the taxiway layout contained a little island, or roundabout, between taxiways Zulu and X-ray. ZN and ZP were the little bits going between X-ray and Zulu, around the roundabout. But the chart available in the cockpit had no little island on it. It showed no roundabout (see Fig. 5.4). Even here, no rules had been broken. The airport layout had recently changed (with the addition of the little roundabout) in connection with the construction of a new terminal pier. It takes time for the various charts to be updated, and this simply had not happened yet at the company of the crew in question. Still, how could the crew of the 737 have ended up on the wrong side of that area (ZP instead of ZN), whether there was an island shown on their charts or not?
FIG. 5.4. The chart available in the Boeing 737 cockpit at the time. It shows no little roundabout, or island, separating taxiways ZN from ZP (Statens Haverikommision, 2003).
FIG. 5.5. The roundabout as seen coming off runway 26. The Boeing 737 went left around the roundabout, instead of right (Statens Haverikommision, 2003).
Figure 5.5 contains more clues. It shows the roundabout from the height of a car (which is lower than a 737 cockpit), but from the same direction as an aircraft coming off of runway 26. The crew in question went left of the roundabout, where they should have gone right. The roundabout is covered in snow, which makes it inseparable from the other (real) islands separating taxiways Zulu and X-ray. These other islands consist of grass, whereas the roundabout, with a diameter of about 20 meters, is the same tarmac as that of the taxiways. Shuffle snow onto all of them, however, and they look indistinguishable. The roundabout is no longer a circle painted on the tarmac: It is an island like all others. Without a roundabout on the cockpit chart, there is only one plausible explanation for what the island in Fig. 5.5 ahead of the aircraft is: It must be the grassy island to the right of taxiway ZN. In other words, the crew knew where they were, based on the cues and indications available to them, and based on what these cues plausibly added up to. The signage, even though it breaks no rules either, does not help. Taxiway signs are among the most confusing directors in the world of aviation, and they are terribly hard to turn into a reasonable representation of the taxiway system they are supposed to help people navigate on. The sign visible from the direction of the Boeing 737 is enlarged in Fig. 5.6. The black part of the sign is the position part (this indicates what taxiway it is), and the yellow part is the direction part: This taxiway (ZN) will lead to taxiway Z, which happens to run at about a right angle across ZN. These signs are placed to the left of the taxiway they belong to. In other words, the ZN taxiway is on the right of the sign, not on the left. But put the sign in the context of Fig. 5.5, and things become more ambiguous. The black ZN part is now leaning toward the left side of the roundabout, not the right side. Yet the ZN part belongs to the piece of tarmac on the right of the roundabout.
FIG. 5.6. The sign on the roundabout that is visible for aircraft coming off runway 26 (Statens Haverikommision, 2003).
The crew never saw it as such. For them, given their chart, the roundabout was the island to the right of ZN. Why not swap the black ZN and the yellow Z parts? Current rules for taxiway signage will not allow it (rules can indeed stifle innovation and investments in safety). And not all airports comply with this religiously either. There may be exceptions when there is no room or when visibility of the sign would be obstructed if placed on the left side. To make things worse, regulations state that taxiway signs leading to a runway need to be placed on both sides of the taxiway. In those cases, the black parts of the signs are often actually adjacent to the taxiway, and not removed from it, as in Fig. 5.5. Against the background of such ambiguity, few pilots actually know or remember that taxiway signs are supposed to be on the left side of the taxiway they belong to. In fact, very little time in pilot training, if any, is used to get pilots to learn to navigate around airports. It is a peripheral activity, a small portion of mundane, pedestrian work that merely leads up to and concludes the real work: flying from A to B. When rolling off the runway, and going to taxiway Zulu and then to their gate, this crew knew where they were. Their indications (cockpit chart, snow-covered island, taxiway sign) compiled into a plausible story: ZN, their assigned route, was the one to the left of the island, and that was the one they were going to take. Until they ran into a tow truck, that is. But nobody in this case lost situation awareness. The pilots lost nothing. Based on the combination of cues and indications observable by them at the time, they had a plausible story of where they were. Even if a mismatch can be shown between how the pilots saw their situation and how retrospective, outside observers now see that situation, this has no bearing on understanding how the pilots made sense of their world at the time. Seeing situation awareness as a measure of the accuracy of correspondence between some outer world and an inner representation carries with it a number of irresolvable problems that have always been connected to such a dualist position. Taking the mind-matter problem apart by separating the two means that the theory needs to connect the two again. Theories of situation awareness typically rely on a combination of two schools in psychological thought to reconstitute this tie, to make this bridge. One is empiricism, a traditional school in psychology that makes claims about how knowledge is chiefly, if not uniquely, based on experience. The second is the information-processing school in cognitive psychology, still popular in large areas of human factors. Neither of these systems of thought, however, is particularly successful in solving the really hard questions about situation awareness, and both may in fact be misleading in certain respects. We look at both of them in turn here. Once that is done, the chapter briefly develops the counterposition on the mind-matter question: an antidualist one (as related to situation awareness). This position is worked out further in the rest of this chapter, using the Royal Majesty case as an example.
EMPIRICISM AND THE PERCEPTION OF ELEMENTS
Most theories of situation awareness actually leave the processes by which matter makes it into mind to the imagination.
A common denominator, however, appears to be the perception of elements in the environment. Elements are the starting point of perceptual and meaning-making processes. It is on the basis of these elements that we gradually build up an understanding of the situation, by processing such elementary stimulus information through multiple stages of consciousness or awareness ("levels of SA"). Theories of situation awareness borrow from empiricism (particularly British empiricism), which assumes that the organized character and the meaningfulness of our perceptual world are achieved by matching incoming stimuli with prior experience through a process called association. In other words, the world as we experience it is disjointed (consisting of elements) except when mediated by previously stored knowledge. Correspondence between mind and matter is made by linking incoming impressions through earlier associations. Empiricism in its pure form is nothing more than saying that the major source of knowledge is experience, that we do not know about the world except through making contact with that world with our sensory organs. Among Greek philosophers around the 5th century B.C., empiricism was accepted as a guide to epistemology, as a way of understanding the origin of knowledge. Questions already arose, however, on whether all psychic life could be reduced to sensations. Did the mind have a role to play at all in turning perceptual impressions into meaningful percepts? The studies of perception by Johannes Kepler (1571-1630) would come to suggest that the mind had a major role, even though he himself left the implications of his findings up to other theoreticians. Studying the eyeball, Kepler found that it actually projects an inverted image on the retina at the back. Descartes, himself dissecting the eye of a bull to see what image it would produce, saw the same thing. If the eye inverts the world, how can we see it the right way up? There was no choice but to appeal to mental processing.
  • 63. Not only is the image inverted, it is also two-dimensional, and it is cast onto the backs of two eyes, not one. How does all that get reconciled in a single coherent, upright percept? The experiments boosted the notion of impoverished, meaning-deprived stimuli entering our sensory organs, in need of some serious mental processing work from there on. Further credence to the perception of elements was given by the 19th-century discovery of photoreceptors in the human eye. This mosaic of retinal receptors appeared to chunk up any visual percept coming into the eyeball. The resulting fragmented neural signals had to be sent up the perceptual pathway in the brain for further processing and scene restoration. British empiricists such as John Locke (1632-1704) and George Berkeley (1685-1753), though not privy to 19th-century findings, were confronted with the same epistemological problems that their Greek predecessors had struggled with. Rather than knowledge being innate, or the chief result of reasoning (as claimed by rationalists of that time), what role did experience have in creating knowledge? Berkeley, for example, wrestled with the problem of depth perception (not a negligible problem when it comes to situation awareness). How do we know where we are in space, in relation to objects around us? Distance perception, to Berkeley, though created through experience, was itself not an immediate experience. Rather, distance and depth are additional aspects of visual data that we learn about through combinations of visual, auditory, and tactile experiences. We understand distance and depth in current scenes by associating incoming visual data with these earlier experiences. Berkeley reduced the problem of space perception to more primitive psychological experiences, decomposing the perception of distance and magnitude into constituent perceptual elements and processes (e.g., lenticular accommodation, blurring of focus) . Such deconstruction of complex, intertwined psychological processes into elementary stimuli turned out to be a useful tactic. It encouraged many after him, among them Wilhelm Wundt and latter-day situation-awareness theorists, to analyze other experiences in terms of elements as well. Interestingly, neither all prehistoric empiricists nor all British empiricists could be called dualists in the same way that situation-awareness theorists can be. Protagoras, a contemporary of Plato around 430 B.C., had already said that "man is the measure of all things." An individual's perception is true to him, and cannot be proven untrue (or inferior or superior) by some other individual. Today's theories of situation awareness, with their emphasis on the accuracy of the mapping between matter and mind, are very much into inferiority and superiority (deficient SA vs. good SA) as something that can be objectively judged. This would not have worked for some of the British empiricists either. To Berkeley, who disagreed with earlier characterizations of an inner versus an outer world, people can actually never know anything but their experiences. The world is a plausible but unproved hypothesis. In fact, it is a fundamentally untestable hypothesis, because we can only know our own experience. Like Protagoras before him, Berkeley would not have put much stock in claims of the possibility of superior or ideal situation awareness, as such a thing is logically impossible. There are no superlatives when it comes to knowledge through experience. 
For Berkeley too, this meant that even if there is an objective world out there (the large circle in the Venn diagram), we could never know it. It also meant that any characterization of such an objective world with the aim of understanding somebody's perception, somebody's situation awareness, would have been nonsense. Experiments, Empiricism, and Situation Awareness Wilhelm Wundt is credited with founding the first psychological laboratory in the world at the University of Leipzig in the late 1870s. The aim of his laboratory was to study mental functioning by deconstructing it into separate elements. These could then be combined to understand perceptions, ideas, and other associations. Wundt's argument was simple and compelling, and versions of it are still used in psychological method debates today. Although the empirical method had been developing all around psychology, it was still occupied with grand questions of consciousness, soul, and destiny, and it tried to gain access to these issues through introspection and rationalism. Wundt argued that these were questions that should perhaps be asked at the logical end point of psychology, but not at the beginning. Psychology should learn to crawl before trying to walk. This justified the appeal of the elementarist approach: chopping the mind and its stimuli up into minute components, and studying them one by one. But how to study them? Centuries before, Descartes had argued that mind and matter not only were entirely separate, but should be studied using different methods as well. Matter should be investigated using methods from natural science (i.e., the experiment), whereas mind should be examined through processes of meditation, or introspection.
Wundt did both. In fact, he combined the natural science tradition with the introspective one, molding them into a novel brand of psychological experimentation that still governs much of human factors research to this day. Relying on complicated sets of stimuli, Wundt investigated sensation and perception, attention, feeling, and association. Using intricate measurements of reaction times, the Leipzig laboratory hoped it would one day be able to achieve a chronometry of mind (which was not long thereafter dismissed as infeasible). Rather than just counting on quantitative experimental outcomes, Wundt asked his subjects to engage in introspection, to reflect on what had happened inside their minds during the trials. Wundt's introspection was significantly more evolved and demanding than the experimental "report" psychologists ask their subjects for today. Introspection was a skill that required serious preparation and expertise, because the criteria for gaining successful access to the elementary makeup of mind were set very high. As a result, Wundt mostly used his assistants.

Realizing that the contents of awareness are in constant flux, Wundt produced rigid rules for the proper application of introspection: (a) the observer, if at all possible, must be in a position to determine when the process is to be introduced; (b) he or she must be in a state of "strained attention"; (c) the observation must be capable of being repeated several times; (d) the conditions of the experiment must be such that they are capable of variation through introduction or elimination of certain stimuli and through variation of the strength and quality of the stimuli. Wundt thus imposed experimental rigor and control on introspection.

Similar introspective rigor, though different in some details and prescriptions, is applied in various methods for studying situation awareness today. Some techniques involve blanking or freezing of displays, with researchers then going in to elicit what participants remember about the scene. This requires active introspection. Wundt would have been fascinated, and he probably would have had a thing or two to say about the experimental protocol. If subjects are not allowed to say when the blanking or freezing is to be introduced, for example (Wundt's first rule), how does that compromise their ability to introspect?

In fact, the blanking of displays and handing out a situation-awareness questionnaire is more akin to the Würzburg school of experimental psychology that started to compete with Wundt in the late 19th century. The Würzburgers pursued "systematic experimental introspection" by having subjects carry out complex tasks that involved thinking, judging, and remembering. They would then have their subjects render a retrospective report of their experiences during the original operation. The whole experience had to be described time period by time period, thus chunking it up. In contrast to Wundt, and like situation-awareness research participants today, Würzburg subjects did not know in advance what they were going to have to introspect.

Others today disagree with Descartes' original exhortation and remain fearful of the subjectivist nature of introspection. They favor the use of clever scenarios in which the outcome, or behavioral performance of people, will reveal what they understood the situation to be. This is claimed to be more of a natural science approach that stays away from the need to introspect. It relies on objective performance indicators instead.
Such an approach to studying situation awareness could be construed as neobehaviorist, as it equates the study of behavior with the study of consciousness. Mental states are not themselves the object of investigation: Performance is. If desired, such performance can faintly hint at the contents of mind (situation awareness). But that itself is not the aim; it cannot be, because through such pursuits psychology (and human factors) would descend into subjectivism and ridicule. Watson, the great proponent of behaviorism, would himself have argued along these lines.

Additional arguments in favor of performance-oriented approaches include the assertion that introspection cannot possibly test the contents of awareness, as it necessarily appeals to a situation or stimulus from the past. The situation on which people are asked to reflect has already disappeared. Introspection thus simply probes people's memory. Indeed, if you want to study situation awareness, how can you take away the "situation" by blanking or freezing people's world, and still hope you have relevant "awareness" left to investigate by introspection? Wundt, as well as many of today's situation awareness researchers, may in part have been studying memory rather than the contents of consciousness.

Wundt was, and still remains, one of the chief representatives of the elementarist orientation, pioneered by Berkeley centuries before, and perpetuated in modern theories of situation awareness. But if we perceive elements, if the eyeball deals in two-dimensional, fragmented, inverted, meaning-deprived stimuli, then how does order in our perceptual experience come about? What theory can account for our ability to see coherent scenes and objects? The empiricist answer of association is one way of achieving such order, of creating such interelementary connections and meaning. Order is an end product; it is the output of mental or cognitive work.
This is also the essence of information processing, the school of thought in cognitive psychology that accompanied and all but colonized human factors since its inception during the closing days of the Second World War. Meaning and perceptual order are the end result of an internal trade in representations, representations that get increasingly filled out and meaningful as a result of processing in the mind.

INFORMATION PROCESSING

Information processing did not follow neatly on empiricism, nor did it accompany the initial surge in psychological experimentation. Wundt's introspection did not immediately fuel the development of theoretical substance to fill the gap between elementary matter and the mind's perception of it. Rather, it triggered an antisubjectivist response that would ban the study of mind and mental processes for decades to come, especially in North America. John Watson, a young psychologist, introduced psychology to the idea of behaviorism in 1913, and aimed to recast psychology as a purely objective branch of natural science. Introspection was to be disqualified, and any references to or investigations of consciousness were proscribed. The introspective method was seen as unreliable and unscientific, and psychologists had to turn their focus exclusively to phenomena that could be registered and described objectively by independent observers. This meant that introspection had to be replaced by tightly controlled experiments that varied subtle combinations of rewards and punishments in order to bait organisms (anything from mice to pigeons to humans) into particular behaviors over others. The outcome of such experiments was there for all to see, with no need for introspection.

Behaviorism became an early 20th-century embodiment of the Baconian ideal of universal control, this time reflected in a late Industrial Revolution obsession with manipulative technology and domination. It appealed enormously to an optimistic, pragmatic, rapidly developing, and result-oriented North America. Laws extracted from simple experimental settings were thought to carry over to more complex settings and to more experiential phenomena as well, including imagery, thinking, and emotions. Behaviorism was thus fundamentally nomothetic: deriving general laws thought to be applicable across people and settings. All human expressions, including art and religion, were reduced to no more than conditioned responses. Behaviorism turned psychology into something wonderfully Newtonian: a schedule of stimuli and responses, of mechanistic, predictable, and changeable couplings between inputs and outputs. The only legitimate characterization of psychology and mental life was one that conformed to the Newtonian framework of classical physics and abided by its laws of action and reaction.

Then came the Second World War, and the behaviorist bubble was deflated. No matter how clever a system of rewards and punishments psychologists set up, radar operators monitoring for German aircraft intruding into Britain across the Channel would still lose their vigilance over time. They would still have difficulty distinguishing signals from noise, independent of the possible penalties. Pilots would get controls mixed up, and radio operators were evidently limited in their ability to hold information in their heads while getting ready for the next transmission. Where was behaviorism? It had no answer to these new pragmatic appeals. Thus came the first cognitive revolution.
The cognitive revolution reintroduced mind as a legitimate object of study. Rather than manipulating the effect of stimuli on overt responses, it concerned itself with "meaning" as the central concept of psychology. Its aim was, as Bruner (1990) recalled, to discover and describe meanings that people created out of their encounters with the world, and then to propose hypotheses for what meaning-making processes were involved. The very metaphors, however, that legitimized the reintroduction of the study of mind also began to immediately corrupt it. The first cognitive revolution fragmented and became overly technicalized. The radio and the computer, two technologies accelerated by developments during the Second World War, quickly captured the imagination of those once again studying mental processes. These were formidable similes of mind, able to mechanistically fill the black box (which behaviorism had kept shut) between stimulus and response. The innards of a radio showed filters, channels, and limited capacities through which informa- tion flowed. Not much later, all those words appeared in cognitive psychology. Now the mind had filters, channels, and limited capacities too. The computer was even better, containing a working memory, a long-term memory, various forms of storage, input and output, and decision modules. It did not take long for these terms, too, to appear in the psychological lexicon. What seemed to matter most was the ability to quantify and compute mental functioning. Information theory, for example, could explain how elementary stimuli (bits) would flow through processing channels to produce responses. A processed stimulus was deemed informative if it reduced alternative choices, whether the stimulus had to do with Faust or with a digit from a statistical table.
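For readers who want the arithmetic behind that sense of "informative": in the Shannon measure these models borrowed, the information carried by an outcome depends only on how many equally likely alternatives it rules out, not on what the outcome is about. A standard illustration, not taken from the original text:

```latex
% Shannon's measure: information conveyed by an outcome of probability p
I = \log_2 \frac{1}{p} \ \text{bits}
% For one symbol chosen from 8 equally likely alternatives:
I = \log_2 8 = 3 \ \text{bits}
% -- the same 3 bits whether the symbol is a word from Faust
% or a digit from a statistical table.
```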
As Bruner (1990) recollected, computability became the necessary and sufficient criterion for cognitive theories. Mind was equated to program. Through these metaphors, the "construction of meaning" quickly became the "processing of information." Newton and Descartes simply would not let go. Once again, psychology was reduced to mechanical components and linkages, and exchanges of energy between and along them. Testing the various components (sensory store, memory, decision making) in endless series of fractionalized laboratory experiments, psychologists hoped, and many are still hoping, that more of the same will eventually add up to something different, that profound insight into the workings of the whole will magically emerge from the study of constituent components.

The Mechanization of Mind

Information processing has been a profoundly Newtonian-Cartesian answer to the mind-matter problem. It is the ultimate mechanization of mind. The basic idea is that the human (mind) is an information processor that takes in stimuli from the outside world, and gradually makes sense of those stimuli by combining them with things already stored in the mind. For example, I see the features of a face, but through coupling them to what I have in long-term memory, I recognize the face as that of my youngest son. Information processing is loyal to the biological-psychological model that sees the matter-mind connection as a physiologically identifiable flow of neuronal energy from periphery to center (from eyeball to cortex), along various nerve pathways. The information-processing pathway of typical models mimics this flow, by taking a stimulus and pushing it through various stages of processing, adding more meaning the entire time. Only once the processing system has understood what the stimulus means (or stimuli mean) can an appropriate response be generated (through a backflow from center to periphery, brain to limbs) that in turn creates more (new) stimuli to process.

The Newtonian as well as dualist intimations of the information-processing model are a heartening sight for those with Cartesian anxiety. Thanks to the biological model underneath it, the mind-matter problem is one of a Newtonian transfer (conversion as well as conservation) of energy at all kinds of levels (from photonic energy to nerve impulses, from chemical releases to electrical stimulation, from stimulus to response at the overall organismic level). Both Descartes and Newton can be recognized in the componential explanation of mental functioning (memory, e.g., is typically parsed into iconic memory, short-term memory, and long-term memory): end products of mental processing can be exhaustively explained on the basis of interactions between these and other components. Finally, of course, information processing is mentalist: it neatly separates res cogitans from res extensa by studying what happens in the mind entirely separately from what happens in the world. The world is a mere adjunct, truly a res extensa, employed solely to lob the next stimulus at the mind (which is where the really interesting processes take place).

The information-processing model works marvelously for the simple laboratory experiments that brought it to life. Laboratory studies of perception, decision making, and reaction time reduce stimuli to single snapshots, fired off at the human processing mechanism as one-stop triggers.
The Wundtian idea of awareness as a continually flowing phenomenon is artificially reduced, chunked up and frozen by the very stimuli that subjects are to become aware of. Such dehumanization of the settings in which perception takes place, as well as of the models by which such perception comes about, has given rise to considerable critique. If people are seen to be adding meaning to impoverished, elementary stimuli, then this is because they are given impoverished, elementary, meaningless stimuli in their laboratory tasks! None of that says anything about natural perception or the processes by which people perceive or construct meaning in actual settings. The information-processing model may be true (though even that is judged as unlikely by most), but only for the constrained, Spartan laboratory settings that keep cognition in captivity. If people are seen to struggle in their interpretation of elements, then this may have something to do with the elementary stimuli given to them. Even Wundt was not without detractors in this respect. The Gestalt movement was launched in part as a response or protest to Wundtian elementarism. Gestaltists claimed that we actually perceive meaningful wholes, that we immediately experience those wholes. We cannot help but see these patterns, these wholes. Max Wertheimer (1880-1934), one of the founding fathers of Gestaltism, illustrates this as such: "I am standing at the window and see a house, trees, sky. And now, for theoretical purposes, I could try to count and say: there are . . . 327 nuances of brightness [and hue]. Do I see '327'? No; I see sky, house, trees" (Wertheimer, cited in Woods et al., 2002, p. 28). The gestalts that Wertheimer sees (house, trees, sky) are primary to their parts (their elements), and they are more than the sum of their parts.
There is an immediate orderliness in experiencing the world. Wertheimer inverts the empiricist claim and the information-processing assumption: rather than meaning being the result of mental operations on elementary stimuli, it actually takes painstaking mental effort (counting 327 nuances of brightness and hue) to reduce primary sensations to their primitive elements. We do not perceive elements: we perceive meaning. Meaning comes effortlessly, prerationally. In contrast, it takes cognitive work to see elements. In the words of William James' senior Harvard colleague Chauncey Wright, there is no antecedent chaos that requires some intrapsychic glue to prevent percepts from falling apart.

New Horizons (Again)

Empiricism does not recognize the immediate orderliness of experience because it does not see relations as real aspects of immediate experience (Heft, 2001). Relations, according to empiricists, are a product of mental (information) processing. This is true for theories of situation awareness. For them, relations between elements are mental artifacts. They get imposed through stages of processing. Subsequent levels of SA add relationships to elements by linking those elements to current meanings and future projections. The problem of the relationship between matter and mind is not at all solved through empiricist responses. But perhaps engineers and designers, as well as many experimental psychologists, are happy to hear about elements (or 327 nuances of brightness and hue), for those can be manipulated in a design prototype and experimentally tested on subjects. Wundt would have done the same thing.

Not unlike Wundt 100 years before him, Ulrich Neisser warned in 1976 that psychology was not quite ready for grand questions about consciousness. Neisser feared that models of cognition would treat consciousness as if it were just a particular stage of processing in a mechanical flow of information. His fears were justified in the mid-1970s, as many psychological models did exactly that. Now they have done it again. Awareness, or consciousness, is equated to a stage of processing along an intrapsychic pathway (levels of SA). As Neisser pointed out, this is an old idea in psychology. The three levels of SA in vogue today were anticipated by Freud, who even supplied flowcharts and boxes in his Interpretation of Dreams to map the movements from unconscious (level 1) to preconscious (level 2) to conscious (level 3). The appeal of finding a home, a place, a structure for consciousness in the head is irresistible, said Neisser, as it allows psychology to nail down its most elusive target (consciousness) to a box in a flowchart. There is a huge cost, though. Along with the deconstruction and mechanization of mental phenomena comes their dehumanization.

Information-processing theories have lost much of their appeal and credibility. Many realize how they have corrupted the spirit of the postbehaviorist cognitive revolution by losing sight of humanity and meaning making. Empiricism (or British empiricism) has slipped into history as a school of thought at the beginning of psychological theorizing. Yet both have legitimate offshoots in current understandings of situation awareness. Notions similar to those of empiricism and information processing are reinvented under new guises, which reintroduces the same type of foundational problems, while leaving some of the really hard problems unaddressed.
The problem of the nature of stimuli is one of those, and associated with it is the problem of meaning making. How does the mind make sense of those stimuli? Is meaning the end product of a processing pathway that flows from periphery to center? These are enormous problems in the history of psychology, all of them problems of the relationship between mind and matter, and all essentially still unresolved. Perhaps they are fundamentally unsolvable within the dualist tradition that psychology inherited from Descartes.

Some movements in human factors are pulling away from the experimental psychological domination. The idea of distributed cognition has renewed the status of the environment as an active, constituent participant in cognitive processes, closing the gap between res cogitans and res extensa. Other people, artifacts, and even body parts are part of the res cogitans. How is it otherwise that a child learns to count on his hands, or a soccer player thinks with her feet? Concomitant interest in cognitive work analysis and cognitive systems engineering takes such joint cognitive systems as the unit of analysis, rather than the constituent human or machine components. Qualitative methods such as ethnography are increasingly legitimate, and critical for understanding distributed cognitive systems. These movements have triggered and embodied what has now become known as the second cognitive revolution, recapturing and rehabilitating the impulses that brought the first to life. How do people make meaning? In order to begin to answer such aboriginal questions, it is now increasingly held as justifiable and necessary to throw the human factors net wider than experimental psychology. Other forms of social inquiry can shed more light on how we are goal-driven creatures in actual, dynamic environments, not passive recipients of snapshot stimuli in a sterile laboratory.
  • 68. The concerns of these thinkers overlap with functionalist approaches, which formed yet another psychology of protest against Wundt's elementarism. The same protest works equally well against the mechanization of mind by the information-processing school. A century ago, functionalists (William James was one of their foremost exponents) pointed out how people are integrated, living organisms engaged in goal-directed activities, not passive element processors locked into laboratory headrests, buffeted about by one-shot stimuli from an experimental apparatus. The environment in which real activities play out helps shape the organism's responses. Psychological functioning is adaptive: It helps the organism survive and adjust, by incrementally modifying and tweaking its composition or its behavior to generate greater gains on whatever dimension is relevant. Such ecological thinking is now even beginning to seep into approaches to system safety, which has so far also been dominated by mechanistic, structuralist models (see chap. 2). James was not just a functionalist, however. In fact, he was one of the most all-round psychologists ever. His views on radical empiricism are one great way to access novel thinking about situation awareness and sensemaking, and only appropriate against a background of increasing interest in the role of ecological psychology in human factors. RADICAL EMPIRICISM Radical empiricism is one way of circumventing the insurmountable problems associated with psychologies based on dualistic traditions, and William James introduced it as such at the beginning of the 20th century. Radical empiricism rejects the notion of separate mental and material worlds; it rejects dualism. James adhered to an empiricist philosophy, which holds that our knowledge comes (largely) from our discoveries, our experience. But, as Heft (2001) pointed out, James' philosophy is radically empiricist. What is experienced, according to James, is not elements, but relations—meaningful relations. Experienced relations are what perception is made up of. Such a position can account for the orderliness of experience, as it does not rely on subsequent, or a posteriori mental processing. Orderliness is an aspect of the ecology, of our world as we experience it and act in it. The world as an ordered, structured universe is experienced, not constructed through mental work. James dealt with the matter-mind problem by letting the knower and the known coincide during the moment of perception (which itself is a constant, uninterrupted flow, rather than a moment). Ontologies (our being in the world) are characterized by continual transactions between knower and known. Order is not imposed on experience, but is itself experienced. Variations of this approach have always represented a popular countermove in the history of psychology of consciousness. Rather than containing consciousness in a box in the head, it is seen as an aspect of activity. Weick (1995) used the term "enactment" to indicate how people produce the environment they face and are aware of. By acting in the world, people continually create environments that in turn constrain their interpretations, and consequently constrain their next possible actions. 
This cyclical, ongoing nature of cognition and sensemaking has been recognized by many (see Neisser, 1976) and challenges common interpretations rooted in information-processing psychology, where stimuli precede meaning making and (only then) action, and where frozen snapshots of environmental status can be taken as legitimate input to the human processing system. Instead, the activities of individuals are only partially triggered by stimuli, because the stimulus itself is produced by the activity of the individual. This moved Weick (1995) to comment that sensemaking never starts, that people are always in the middle of things. Although we may look back on our own experience as consisting of discrete events, the only way to get this impression is to step out of that stream of experience and look down on it from the position of an outsider, or retrospective outsider. It is only possible, really, to pay direct attention to what already exists (that which has already passed): "Whatever is now, at the present moment, under way will determine the meaning of whatever has just occurred" (Weick, p. 27). Situation awareness is in part about constructing a plausible story of the process by which an outcome came about, and the reconstruction of immediate history probably plays a dominant role in this. Few theories of situation awareness acknowledge this role, actually, instead directing their analytics to the creation of meaning from elements and the future projection of that meaning.

Radical empiricism does not take the stimulus as its starting point, as does information processing, and neither does it need a posteriori processes (mental, representational) to impose orderliness on sensory impressions. We already experience orderliness and relationships through ongoing, goal-oriented transactions of acting and perceiving. Indeed, what we experience during perception is not some cognitive end product in the head. Neisser reminded us of this longstanding issue in 1976 too: can it be true that we see our own retinal images? The theoretical distance that then needs to be bridged is too large. For if we see that retinal image, who does the looking? Homunculus explanations were unavoidable (and often still are in information processing). Homunculi do not solve the problem of awareness; they simply relocate it.
Rather than a little man in our heads looking at what we are looking at, we ourselves are aware of the world, and of its structure, in the world. As Edwin Holt put it, "Consciousness, whenever it is localized at all in space, is not in the skull, but is 'out there' precisely where it appears to be" (cited in Heft, 2001, p. 59). James, and the entire ecological school after him, anticipated this. What is perceived, according to James, is not a replica, not a simile of something out there. What is perceived is already out there. There are no intermediaries between perceiver and percept; perceiving is direct. This position forms the groundwork of ecological approaches in psychology and human factors. If there is no separation between matter and mind, then there is no gap that needs bridging; there is no need for reconstructive processes in the mind that make sense of elementary stimuli. The Venn diagram with a little and a larger circle that depict actual and ideal situation awareness is superfluous too.

Radical empiricism allows human factors to stick closer to the anthropologist's ideal of describing and capturing insider accounts. If there is no separation between mind and matter, between actual and ideal situation awareness, then there is no risk of getting trapped in judging performance by exogenous criteria: criteria imported from outside the setting (informed by hindsight or some other source of omniscience about the situation that opens up that delta, or gap, between what the observer inside the situation knew and what the researcher knows). What the observer inside the situation knows must be seen as canonical; it must be understood in its own right, not in relation to some normative ideal. For the radical empiricist, there would not be two circles in the Venn diagram, but rather different rationalities, different understandings of the situation, none of them right or wrong or necessarily better or worse, but all of them coupled directly to the interests, expectations, knowledge, and goals of the respective observer.

DRIFTING OFF TRACK: REVISITING A CASE OF "LOST SITUATION AWARENESS"

Let us go back to the Royal Majesty. Traditionalist ideas about a lack of correspondence between a material and a mental world get a boost from this sort of case. A crew ended up 17 miles off track after a day and a half of sailing. How could this happen? As said before, hindsight makes it easy to see where people were, versus where they thought they were. In hindsight, it is easy to point to the cues and indications that these people should have picked up in order to update or correct or even form their understanding of the unfolding situation around them. Hindsight has a way of exposing those elements that people missed, and a way of amplifying or exaggerating their importance. The key question is not why people did not see what we now know was important. The key question is how they made sense of the situation the way they did. What must the crew in question have seen at the time? How could they, on the basis of their experiences, construct a story that was coherent and plausible? What were the processes by which they became sure that they were right about their position? Let us not question the accuracy of the insider view. Research into situation awareness already does enough of that. Instead, let us try to understand why that insider view was plausible for people at the time, why it was, in fact, the only possible view.
Departure From Bermuda

The Royal Majesty departed Bermuda, bound for Boston at 12:00 noon on the 9th of June 1995. The visibility was good, the winds light, and the sea calm. Before departure the navigator checked the navigation and communication equipment. He found it in "perfect operating condition." About half an hour after departure the harbor pilot disembarked and the course was set toward Boston. Just before 13:00 there was a cutoff in the signal from the GPS (Global Positioning System) antenna, routed on the fly bridge (the roof of the bridge), to the receiver—leaving the receiver without satellite signals. Postaccident examination showed that the antenna cable had separated from the antenna connection. When it lost satellite reception, the GPS promptly defaulted to dead reckoning (DR) mode. It sounded a brief aural alarm and displayed two codes on its tiny display: DR and SOL. These alarms and codes were not noticed. (DR means that the position is estimated, or deduced, hence "ded," or now "dead," reckoning. SOL means that satellite positions cannot be calculated.) The ship's autopilot would stay in DR mode for the remainder of the journey.

Why was there a DR mode in the GPS in the first place, and why was a default to that mode neither remarkable, nor displayed in a more prominent way on the bridge? When this particular GPS receiver was manufactured (during the 1980s), the GPS satellite system was not as reliable as it is today. The receiver could, when satellite data was unreliable, temporarily use a DR mode in which it estimated positions using an initial position, the gyrocompass for course input and a log for speed input. The GPS thus had two modes, normal and DR. It switched autonomously between the two depending on the accessibility of satellite signals.
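As an aside, the dead-reckoning estimate itself is simple arithmetic: advance the last known position along the gyrocompass course at the log speed. The sketch below is illustrative only; the course and speed figures are invented, and a flat-earth approximation stands in for the geodesy a real receiver would use.

```python
import math

def dead_reckon(lat, lon, course_deg, speed_kts, hours):
    """Advance a (lat, lon) position along a course at a given speed.
    Flat-earth approximation: 1 nautical mile = 1 minute of latitude."""
    distance_nm = speed_kts * hours
    d_lat = distance_nm * math.cos(math.radians(course_deg)) / 60.0
    d_lon = (distance_nm * math.sin(math.radians(course_deg))
             / (60.0 * math.cos(math.radians(lat))))
    return lat + d_lat, lon + d_lon

# Hypothetical example: from a last fix off Bermuda, hold a north-northwesterly
# course at 14 knots for one hour. Gyro and log keep feeding this estimate, so
# the computed track looks perfectly steady even without a single satellite fix.
print(dead_reckon(32.4, -64.7, 336.0, 14.0, 1.0))
```

The sketch also shows why the loss of satellite data produced no visible symptom: the estimate is driven entirely by course and speed through the water, so drift caused by wind or current never enters the calculation.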
  • 70. By 1995, however, GPS satellite coverage was pretty much complete, and had been working well for years. The crew did not expect anything out of the ordinary. The GPS antenna was moved in February, because parts of the superstructure occasionally would block the incoming signals, which caused temporary and short (a few minutes, according to the captain) periods of DR navigation. This was to a great extent remedied by the antenna move, as the Cruise Line's electronics technician testified. People on the bridge had come to rely on GPS position data and considered other systems to be backup systems. The only times the GPS positions could not be counted on for accuracy were during these brief, normal episodes of signal blockage. Thus, the whole bridge crew was aware of the DR-mode option and how it worked, but none of them ever imagined or were prepared for a sustained loss of satellite data caused by a cable break—no previous loss of satellite data had ever been so swift, so absolute, and so long lasting. When the GPS switched from normal to DR on this journey in June 1995, an aural alarm sounded and a tiny visual mode annunciation appeared on the display. The aural alarm sounded like that of a digital wristwatch and was less than a second long. The time of the mode change was a busy time (shortly after departure), with multiple tasks and distractors competing for the crew's attention. A departure involves complex maneuvering, there are several crew members on the bridge, and there is a great deal of communication. When a pilot disembarks, the operation is time constrained and risky. In such situations, the aural signal could easily have been drowned out. No one was expecting a reversion to DR mode, and thus the visual indications were not seen either. From the insider perspective, there was no alarm, as there was not going to be a mode default. There was neither a history, nor an expectation of its occurrence. Yet even if the initial alarm was missed, the mode indication was continuously available on the little GPS display. None of the bridge crew saw it, according to their testimonies. If they had seen it, they knew what it meant, literally translated dead reckoning means no satellite fixes. But as we saw before, there is a crucial difference between data that in hindsight can be shown to have been available and data that were observable at the time. The indications on the little display (DR and SOL) were placed between two rows of numbers (representing the ship's latitude and longitude) and were about one sixth the size of those numbers. There was no difference in the size and character of the position indications after the switch to DR. The size of the display screen was about 7.5 by 9 centimeters, and the receiver was placed at the aft part of the bridge on a chart table, behind a curtain. The location is reasonable, because it places the GPS, which supplies raw position data, next to the chart, which is normally placed on the chart table. Only in combination with a chart do the GPS data make sense, and furthermore the data were forwarded to the integrated bridge system and displayed there (quite a bit more prominently) as well. For the crew of the Royal Majesty, this meant that they would have to leave the forward console, actively look at the display, and expect to see more than large digits representing the latitude and longitude. 
Even then, if they had seen the two-letter code and translated it into the expected behavior of the ship, it is not a certainty that the immediate conclusion would have been "this ship is not heading towards Boston anymore," because temporary DR reversions in the past had never led to such dramatic departures from the planned route. When the officers did leave the forward console to plot a position on the chart, they looked at the display and saw a position, and nothing but a position, because that is what they were expecting to see. It is not a question of them not attending to the indications. They were attending to the indications, the position indications, because plotting the position is the professional thing to do. For them, the mode change did not exist.

But if the mode change was so nonobservable on the GPS display, why was it not shown more clearly somewhere else? How could one small failure have such an effect—were there no backup systems? The Royal Majesty had a modern integrated bridge system, of which the main component was the navigation and command system (NACOS). The NACOS consisted of two parts: an autopilot part to keep the ship on course, and a map construction part, where simple maps could be created and displayed on a radar screen. When the Royal Majesty was being built, the NACOS and the GPS receiver were delivered by different manufacturers, and they, in turn, used different versions of electronic communication standards. Due to these differing standards and versions, valid position data and invalid DR data sent from the GPS to the NACOS were both labeled with the same code (GP). The installers of the bridge equipment were not told, nor did they expect, that (GP-labeled) position data sent to the NACOS would be anything but valid position data. The designers of the NACOS expected that if invalid data were received, they would have another format. The GPS thus used the same data label for valid and invalid data, and the autopilot could not distinguish between them. Because the NACOS could not detect that the GPS data were invalid, the ship sailed on an autopilot that was using estimated positions until a few minutes before the grounding.
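The labeling problem can be sketched in a few lines. The message format and field names below are hypothetical, not those of the actual equipment; the point is only that a consumer keying on the label alone has no way to tell a satellite fix from a dead-reckoned estimate.

```python
from dataclasses import dataclass

@dataclass
class PositionMessage:
    label: str        # talker label on the sentence, e.g. "GP"
    lat: float
    lon: float
    estimated: bool   # True if the receiver dead-reckoned this position

def nacos_accepts_as_valid(msg: PositionMessage) -> bool:
    # The integrated bridge system expected invalid data to arrive in a
    # different format, so the label alone decided whether data were trusted.
    return msg.label == "GP"

satellite_fix = PositionMessage("GP", 41.5, -69.4, estimated=False)
dr_estimate   = PositionMessage("GP", 41.5, -69.4, estimated=True)

assert nacos_accepts_as_valid(satellite_fix)
assert nacos_accepts_as_valid(dr_estimate)   # accepted just the same
```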
A principal function of an integrated bridge system is to collect data such as depth, speed, and position from different sensors, and to show them on a centrally placed display that provides the officer of the watch with an overview of most of the relevant information. The NACOS on the Royal Majesty was placed at the forward part of the bridge, next to the radar screen. Current technological systems commonly have multiple levels of automation with multiple mode indications on many displays. One adaptation of work strategy is to collect these indications in the same place; another solution is to integrate data from many components onto the same display surface. This presents an integration problem for shipping in particular, where components are quite often delivered by different manufacturers. The centrality of the forward console in an integrated bridge system also sends the implicit message to the officer of the watch that navigation may have taken place at the chart table in times past, but the work is now performed at the console. The chart should still be used, to be sure, but only as a backup option and at regular intervals (customarily every half-hour or every hour). The forward console is perceived to be a clearinghouse for all the information needed to safely navigate the ship.

As mentioned, the NACOS consisted of two main parts. The GPS sent position data (via the radar) to the NACOS in order to keep the ship on track (the autopilot part) and to position the maps on the radar screen (the map part). The autopilot part had a number of modes that could be manually selected: NAV and COURSE. NAV mode kept the ship within a certain distance of a track, and corrected for drift caused by wind, sea, and current. COURSE mode was similar, but the drift was calculated in an alternative way. The NACOS also had a DR mode, in which the position was continuously estimated. This backup calculation was performed in order to compare the NACOS DR with the position received from the GPS. To calculate the NACOS DR position, data from the gyrocompass and Doppler log were used, but the initial position was regularly updated with GPS data. When the Royal Majesty left Bermuda, the navigation officer chose the NAV mode, with input from the GPS, as the crew had normally done during the 3 years the vessel had been in service.

If the ship had deviated from her course by more than a preset limit, or if the GPS position had differed from the DR position calculated by the autopilot, the NACOS would have sounded an aural alarm and clearly shown a visual one at the forward console (the position-fix alarm). There were no alarms, because the two DR positions calculated by the NACOS and the GPS were identical. The NACOS DR, which was the perceived backup, was refreshing its DR position at regular intervals with GPS data believed to be valid. But the GPS was itself sending DR data, estimated from log and gyro inputs and labeled as valid data. Thus, the radar chart and the autopilot were using the same inaccurate position information, and there was no display or warning of the fact that DR positions (from the GPS) were being used. Nowhere on the integrated display could the officer on watch confirm what mode the GPS was in, and what effect the mode of the GPS was having on the rest of the automated system, not to mention the ship. In addition to this, there were no immediate and perceivable effects on the ship, because the GPS calculated positions using the log and the gyrocompass.
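Schematically, the cross-check behind the position-fix alarm might look as follows. The names and the alarm threshold are invented for illustration, as is the coordinate example; what matters is that once the reference track is refreshed from the very source it is meant to check, the comparison can no longer fail.

```python
import math

def distance_nm(p1, p2):
    """Approximate separation in nautical miles between two (lat, lon) points."""
    dlat = (p2[0] - p1[0]) * 60.0
    dlon = (p2[1] - p1[1]) * 60.0 * math.cos(math.radians(p1[0]))
    return math.hypot(dlat, dlon)

POSITION_FIX_LIMIT_NM = 0.5   # assumed threshold, for illustration only

def position_fix_alarm(gps_pos, nacos_dr_pos):
    """Sound the alarm when the two position sources disagree."""
    return distance_nm(gps_pos, nacos_dr_pos) > POSITION_FIX_LIMIT_NM

# Both tracks were ultimately driven by the same gyro and log inputs, and the
# NACOS DR was regularly reset from the GPS output, so the two stayed identical:
gps_position = nacos_dr_position = (41.50, -69.40)   # illustrative coordinates
true_position = (41.45, -69.05)                      # the ship itself has drifted

print(position_fix_alarm(gps_position, nacos_dr_position))   # False: no alarm
print(distance_nm(gps_position, true_position))              # the unseen error (about 16 nm here)
```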
It cannot be expected that a crew should become suspicious of the fact that the ship is actually keeping her speed and course. The combination of a busy departure, an unprecedented event (the cable break) together with a nonevent (course keeping), and the change of the locus of navigation (including the intrasystem communication difficulties) shows that it made sense, in the situation and at the time, that the crew did not know that a mode change had occurred.

The Ocean Voyage

Even if the crew did not know about a mode change immediately after departure, there was still a long voyage at sea ahead. Why did none of the officers check the GPS position against another source, such as the Loran-C receiver that was placed close to the GPS? (Loran-C is a radio navigation system that relies on land-based transmitters.) Until the very last minutes before the grounding, the ship did not act strangely and gave no reason for suspecting that anything was amiss. It was a routine trip, the weather was good, and the watches and watch changes were uneventful. Several of the officers actually did check the displays of both the Loran and GPS receivers, but only used the GPS data (because those had been more reliable in their experience) to plot positions on the paper chart. It was virtually impossible to observe the implications of a difference between Loran and GPS numbers alone. Moreover, there were other kinds of cross-checking. Every hour, the position on the radar map was checked against the position on the paper chart, and cues in the world (e.g., the sighting of the first buoy) were matched with GPS data. Another subtle reassurance to the officers must have been that the master on a number of occasions spent several minutes checking the position and progress of the ship, and did not make any corrections.

Before the GPS antenna was moved, the short spells of signal degradation that led to DR mode also caused the radar map to jump around on the radar screen (the crew called it "chopping"), because the position would change erratically. The reason chopping was not observed on this particular occasion was that the position did not change erratically, but in a manner consistent with dead reckoning. It is entirely possible that the satellite signal was lost before the autopilot was switched on, thus causing no shift in position. The crew had developed a strategy to deal with this occurrence in the past.
When the position-fix alarm sounded, they first changed modes (from NAV to COURSE) on the autopilot and then they acknowledged the alarm. This had the effect of stabilizing the map on the radar screen so that it could be used until the GPS signal returned. It was an unreliable strategy, because the map was being used without knowing the extent of error in its positioning on the screen. It also led to the belief that, as mentioned earlier, the only time the GPS data were unreliable was during chopping. Chopping was more or less alleviated by moving the antenna, which means that by eliminating one problem a new pathway for accidents was created. The strategy of using the position-fix alarm as a safeguard no longer covered all or most of the instances of GPS unreliability. This locally efficient procedure would almost certainly not be found in any manuals, but gained legitimacy through successful repetition, becoming common practice over time. It may have sponsored the belief that a stable map is a good map, with the crew concentrating on the visible signs instead of being wary of the errors hidden below the surface. The chopping problem had been resolved for about 4 months, and trust in the automation grew.

First Buoy to Grounding

Looking at the unfolding sequence of events from the position of retrospective outsider, it is once again easy to point to indications missed by the crew. Especially toward the end of the journey, there appears to be a larger number of cues that could potentially have revealed the true nature of the situation. The first officer was unable to positively identify the first buoy that marked the entrance of the Boston sea lanes (such lanes form a separation scheme delineated on the chart to keep meeting and crossing traffic at a safe distance and to keep ships away from dangerous areas). A position error was still not suspected, even with the vessel close to the shore. The lookouts reported red lights and later blue and white water, but the second officer did not take any action. Smaller ships in the area broadcast warnings on the radio, but nobody on the bridge of the Royal Majesty interpreted those to concern their vessel. The second officer failed to see the second buoy along the sea lanes on the radar, but told the master that it had been sighted. In hindsight, there were numerous opportunities to avoid the grounding, which the crew consistently failed to recognize.

Such conclusions are based on a dualist interpretation of situation awareness. What matters to such an interpretation is the accuracy of the mapping between an external world that can be pieced together in hindsight (and that contains shopping bags full of epiphanies never opened by those who needed them most) and people's internal representation of that world. This internal representation (or situation awareness) can be shown to be clearly deficient, as falling far short of all the cues that were available. But making claims about the awareness of other people at another time and place requires us to put ourselves in their shoes and limit ourselves to what they knew. We have to find out why people thought they were in the right place, or had the right assessment of the situation around them. What made that so? Remember, the adequacy or accuracy of an insider's representation of the situation cannot be called into question: it is what counts for them, and it is what drives further action in that situation.
Why was it plausible for the crew to conclude that they were in the right place? What did their world look like to them (not: How does it look now to retrospective observers)? The first buoy ("BA") in the Boston traffic lanes was passed at 19:20 on the 10th of June, or so the chief officer thought (the buoy identified by the first officer as the BA later turned out to be the "AR" buoy located about 15 miles to the west-southwest of the BA). To the chief officer, there was a buoy on the radar, and it was where he expected it to be, it was where it should be. It made sense to the first officer to identify it as the correct buoy because the echo on the radar screen coincided with the mark on the radar map that signified the BA. Radar map and radar world matched. We now know that the overlap between radar map and radar return was a mere stochastic fit. The map showed the BA buoy, and the radar showed a buoy return. A fascinating coincidence was the sun glare on the ocean surface that made it impossible to visually identify the BA. But independent cross-checking had already occurred: The first officer probably verified his position by two independent means, the radar map and the buoy. The officer, however, was not alone in managing the situation, or in making sense of it. An interesting aspect of automated navigation systems in real workplaces is that several people typically use it, in partial overlap and consecutively, like the watch-keeping officers on a ship. At 20:00 the second officer took over the watch from the chief officer. The chief officer must have provided the vessel's assumed position, as is good watch-keeping practice. The second officer had no reason to doubt that this was a correct position. The chief officer had been at sea for 21 years, spending 30 of the last 36 months onboard the Royal Majesty. Shortly after the takeover, the second officer reduced the radar scale from 12 to 6 nautical miles. This is normal practice when vessels come closer to shore or other restricted waters. By reducing the scale, there is less clutter from the shore, and an increased likelihood of seeing anomalies and dangers. When the lookouts later reported lights, the second officer had no expectation that there was anything wrong. To him, the vessel was safely in the traffic lane. Moreover, lookouts are liable to report everything indiscriminately; it is always up to the officer of the watch to decide whether to take action.
  • 73. There is also a cultural and hierarchical gradient between the officer and the lookouts; they come from different nationalities and backgrounds. At this time, the master also visited the bridge and, just after he left, there was a radio call. This escalation of work may well have distracted the second officer from considering the lookouts' report, even if he had wanted to. After the accident investigation was concluded, it was discovered that two Portuguese fishing vessels had been trying to call the Royal Majesty on the radio to warn her of the imminent danger. The calls were made not long before the grounding, at which time the Royal Majesty was already 16.5 nautical miles from where the crew knew her to be. At 20:42, one of the fishing vessels called, "fishing vessel, fishing vessel call cruise boat," on channel 16 (an international distress channel for emergencies only). Immediately following this first call in English the two fishing vessels started talking to each other in Portuguese. One of the fishing vessels tried to call again a little later, giving the position of the ship he was calling. Calling on the radio without positively identifying the intended receiver can lead to mix-ups. In this case, if the second officer heard the first English call and the ensuing conversation, he most likely disregarded it since it seemed to be two other vessels talking to each other. Such an interpretation makes sense: If one ship calls without identifying the intended receiver, and another ship responds and consequently engages the first caller in conversation, the communication loop is closed. Also, as the officer was using the 6- mile scale, he could not see the fishing vessels on his radar. If he had heard the second call and checked the position, he might well have decided that the call was not for him, as it appeared that he was far from that position. Whomever the fishing ships were calling, it could not have been him, because he was not there. At about this time, the second buoy should have been seen and around 21:20 it should have been passed, but was not. The second officer assumed that the radar map was correct when it showed that they were on course. To him the buoy signified a position, a distance traveled in the traffic lane, and reporting that it had been passed may have amounted to the same thing as reporting that they had passed the position it was (supposed to have been) in. The second officer did not, at this time, experience an accumulation of anomalies, warning him that something was going wrong. In his view, this buoy, which was perhaps missing or not picked up by the radar, was the first anomaly, but not perceived as a significant one. The typical Bridge Procedures Guide says that a master should be called when (a) something unexpected happens, (b) when something expected does not happen (e.g., a buoy), and (c) at any other time of uncertainty. This is easier to write than it is to apply in practice, particularly in a case where crew members do not see what they expected to see. The NTSB report, in typical counterfactual style, lists at least five actions that the officer should have taken. He did not take any of these actions, because he was not missing opportunities to avoid the grounding. He was navigating the vessel safely to Boston. The master visited the bridge just before the radio call, telephoned the bridge about 1 hour after it, and made a second visit around 22:00. 
The times at which he chose to visit the bridge were calm and uneventful, and did not prompt the second officer to voice any concerns, nor did they trigger the master's interest in more closely examining the apparently safe handling of the ship. Five minutes before the grounding, a lookout reported blue and white water. For the second officer, these indications alone were no reason for taking action. They were no warnings of anything about to go amiss, because nothing was going to go amiss. The crew knew where they were. Nothing in their situation suggested to them that they were not doing enough or that they should question the accuracy of their awareness of the situation. At 22:20 the ship started to veer, which brought the captain to the bridge. The second officer, still certain that they were in the traffic lane, believed that there was something wrong with the steering. This interpretation would be consistent with his experiences of cues and indications during the trip so far. The master, however, came to the bridge and saw the situation differently, but it was too late to correct the situation. The Royal Majesty ran aground east of Nantucket at 22:25, at which time she was about 17 nautical miles from her planned and presumed course. None of the over 1,000 passengers were injured, but repairs and lost revenues cost the company $7 million. With a discrepancy of 17 miles at the premature end to the journey of the Royal Majesty, and a day and a half to discover the growing gap between actual and intended track, the case of loss of SA, or deficient SA, looks like it is made. But the supposed elements that make up all the cues and indications that the crew should have seen, and should have understood, are mostly products of hindsight, products of our ability to look at the unfolding sequence of events from the position of retrospective outsiders. In hindsight, we wonder how these repeated "opportunities to avoid the grounding," these repeated invitations to undergo some kind of epiphany about the real nature of the situation, were never experienced by the people who needed them most. But the revelatory nature of the cues, as well as the structure or coherence that they apparently have in retrospect, are not products of the situation itself or the actors in it.
They are retrospective imports. When looked at from the position of retrospective outsider, the deficient SA can look so very real, so compelling. They failed to notice, they did not know, they should have done this or that. But from the point of view of people inside the situation, as well as potential other observers, these deficiencies do not exist in and of themselves; they are artifacts of hindsight, elements removed retrospectively from a stream of action and experience. To people on the inside, it is often nothing more than normal work. If we want to begin to understand why it made sense for people to do what they did, we have to put ourselves in their shoes. What did they know? What was their understanding of the situation? Rather than construing the case as a loss of SA (which simply judges other people for not seeing what we, in our retrospective omniscience, would have seen), there is more explanatory leverage in seeing the crew's actions as normal processes of sensemaking—of transactions between goals, observations, and actions. As Weick (1995) pointed out, sensemaking is something that preserves plausibility and coherence, something that is reasonable and memorable, something that embodies past experience and expectations, something that resonates with other people, something that can be constructed retrospectively but also can be used prospectively, something that captures both feeling and thought ... In short, what is necessary in sensemaking is a good story. A good story holds disparate elements together long enough to energize and guide action, plausibly enough to allow people to make retrospective sense of whatever happens, and engagingly enough that others will contribute their own inputs in the interest of sensemaking. (p. 61) Even if one does make concessions to the existence of elements, as Weick does, it is only for the role they play in constructing a plausible story of what is going on, not for building an accurate mental simile of an external world somewhere "out there."
Chapter 6 Why Do Operators Become Complacent?
The introduction of powerful automation in a variety of transport applications has increased the emphasis on human cognitive work. Human operators on, for example, ship bridges or aircraft flight decks spend much time integrating data, planning activities, and managing a suite of machine resources in the conduct of their tasks. This shift has contributed to the utility of a concept such as situation awareness. One large term can capture the extent to which operators are in tune with relevant process data and can form a picture of the system and its progress in space and time. As the Royal Majesty example in chapter 5 showed, most high-tech settings are actually not characterized by a single human interacting with a machine. In almost all cases, multiple people—crews, or teams of operators—jointly interact with the automated system in pursuit of operational objectives. These crews or teams have to coordinate their activities with those of the system in order to achieve common goals. Despite the weight that crews (and human factors researchers) repeatedly attribute to having a shared understanding of their system state and problems to be solved, consensus in transportation human factors on a concept of crew situation awareness seems far off.
It appears that various labels are used interchangeably to refer to the same basic phenomenon, for example, group situation awareness, shared problem models, team situation awareness, mutual knowledge, shared mental models, joint situation awareness, and shared understanding. At the same time, results about what constitutes the phenomenon are fragmented and ideas on how to measure it remain divided. Methods to gain empirical access range from modified measures of practitioner expertise, to questionnaires interjected into suddenly frozen simulation scenarios, to implicit probes embedded in unfolding simulations of natural task behavior. Most critically, however, a common definition or model of crew situation awareness remains elusive. There is human factors research, for example, that claims to identify links between crew situation awareness and other parameters (such as planning or crew-member roles). But such research often does not mention a definition of the phenomenon. This renders empirical demonstrations of the phenomenon unverifiable and inconclusive. After all, how can a researcher claim that he or she saw something if that something was not defined? Perhaps there is no need to define the phenomenon, because everybody knows what it means. Indeed, situation awareness is what we call a folk model. It has come up from the practitioner community (fighter pilots in this case) to indicate the degree of coupling between human and environment. Folk models are highly useful because they can collapse complex, multidimensional problems into simple labels that everybody can relate to. But this is also where the risks lie, certainly when researchers pick up on a folk label and attempt to investigate and model it scientifically. Situation awareness is not alone in this. Human factors today has more concepts that aim to provide insight into the human performance issues that underlie complex behavioral sequences. It is often tempting to mistake the labels themselves for deeper insight—something that is becoming increasingly common in, for example, accident analyses. Thus loss of situation awareness, automation complacency and loss of effective crew resource management can now be found among the causal factors and conclusions in accident reports.
This happens without further specification of the psychological mechanisms responsible for the observed behavior—much less how such mechanisms or behavior could have forced the sequence of events toward its eventual outcome. The labels (modernist replacements of the old pilot error) are used to refer to concepts that are intuitively meaningful. Everyone is assumed to understand or implicitly agree on them, yet no effort is usually made to explicate or reach agreement on the underlying mechanisms or precise definitions. People may no longer dare to ask what these labels mean, lest others suspect they are not really initiated in the particulars of their business. Indeed, large labels that correspond roughly to mental phenomena we know from daily life are deemed sufficient—they need no further explanation. This is often accepted practice for psychological phenomena because as humans we all have privileged knowledge about how the mind works (because we all have one). However, a verifiable and detailed mapping between the context-specific (and measurable) particulars of a behavior on the one hand and a concept-dependent model on the other is not achieved—the jump from context specifics (somebody flying into a mountainside) to concept dependence (the operator must have lost SA) is immune to critique or verification. Folk models are not necessarily incorrect, but compared to articulated models they focus on descriptions rather than explanations, and they are very hard to prove wrong. Folk models are pervasive in the history of science. One well-known example of a folk model from modern times is Freud's psychodynamic model, which links observable behavior and emotions to nonobservable structures (id, ego, superego) and their interactions. One feature of folk models is that nonobservable constructs are endowed with the necessary causal power without much specification of the mechanism responsible for such causation. According to Kern (1998), for example, complacency can cause a loss of situation awareness. In other words, one folk problem causes another folk problem. Such assertions leave few people any wiser. Because both folk problems are constructs postulated by outside observers (and mostly post hoc), they cannot logically cause anything in the empirical world. Yet this is precisely what they are assumed to be capable of. In wrapping up a conference on situation awareness, Charles Billings warned against this danger in 1996: The most serious shortcoming of the situation awareness construct as we have thought about it to date, however, is that it's too neat, too holistic and too seductive. We heard here that deficient SA was a causal factor in many airline accidents associated with human error. We must avoid this trap: deficient situation awareness doesn't "cause" anything. Faulty spatial perception, diverted attention, inability to acquire data in the time available, deficient decision-making, perhaps, but not a deficient abstraction! (p. 3) What Billings did not mention is that "diverted attention" and "deficient decision-making" themselves are abstractions at some level (and post hoc ones at that). They are nevertheless less contentious because they provide a reasonable level of detail in their description of the psychological mechanisms that account for their causation.
Situation awareness is too "neat" and "holistic" in the sense that it lacks such a level of detail and thus fails to account for a psychological mechanism that would connect features of the sequence of events to the outcome. The folk model, however, was coined precisely because practitioners (pilots) wanted something "neat" and "holistic" that could capture critical but inexplicit aspects of their performance in complex, dynamic situations. We have to see their use of a folk model as legitimate. It can fulfill a useful function with respect to the concerns and goals of a user community. This does not mean that the concepts coined by users can be taken up and causally manipulated by scientists without serious foundational analysis and explication of their meaning. Resisting the temptation, however, can be difficult. After all, human factors is a discipline that lives by its applied usefulness. If the discipline does not generate anything of interest to applied communities, then why would they bother funding the work? In this sense, folk models can seem like a wonderfully convenient bridge between basic and applied worlds, between scientific and practitioner communities. Terms like situation awareness allow both camps to speak the same language. But such conceptual sharing risks selling out to superficial validity. It may not do human factors a lot of good in the long run, nor may it really benefit the practitioner consumers of research results. Another folk concept is complacency. Why does people's vigilance decline over time, especially when confronted with repetitive stimuli? Vigilance decrements have formed an interesting research problem ever since the birth of human factors during and just after the Second World War. The idea of complacency has always been related to vigilance problems. Although complacency connotes something motivational (people must ensure that they watch the process carefully), the human factors literature actually has little in the way of explanation or definition. What is complacency? Why does it occur? If you want answers to these questions, do not turn to the human factors literature. You will not find answers there.
Complacency is one of those constructs whose meaning is assumed to be known by everyone. This justifies taking it up in scientific discourse as something that can be manipulated or studied as an independent or dependent variable, without having to go through the bother of defining what it actually is or how it works. In other words, complacency makes a "neat" and "holistic" case for studying folk models.
DEFINITION BY SUBSTITUTION
The most evident characteristic of folk models is that they define their central constructs by substitution rather than decomposition. A folk concept is explained simply by referring to another phenomenon or construct that itself is in equal need of explanation. Substitution is not the same as decomposition: Substituting replaces one high-level label with another, whereas decomposition takes the analysis down into subsequent levels of greater detail, which transform the high-level concept into increasingly measurable context specifics. A good example of definition by substitution is the label complacency, in relation to the problems observed on automated flight decks. Most textbooks on aviation human factors talk about complacency and even endow it with causal power, but none really define (i.e., decompose) it:
• According to Wiener (1988, p. 452), "boredom and complacency are often mentioned" in connection with the out-of-the-loop issue in automated cockpits. But whether complacency causes an out-of-the-loop condition or whether it is the other way around is left unanswered.
• O'Hare and Roscoe (1990, p. 117) stated that "because autopilots have proved extremely reliable, pilots tend to become complacent and fail to monitor them." Complacency, in other words, is invoked to explain monitoring failures.
• Kern (1998, p. 240) maintained that "as pilots perform duties as system monitors they will be lulled into complacency, lose situational awareness, and not be prepared to react in a timely manner when the system fails." Thus, complacency can cause a loss of situational awareness. But how this occurs is left to the imagination.
• On the same page in their textbook, Campbell and Bagshaw (1991, p. 126) said that complacency is both a "trait that can lead to a reduced awareness of danger," and a "state of confidence plus contentment" (emphasis added). In other words, complacency is at the same time a long-lasting, enduring feature of personality (a trait) and a shorter lived, transient phase in performance (a state).
• For the purpose of categorizing incident reports, Parasuraman, Molloy, and Singh (1993, p. 3) defined complacency as "self-satisfaction which may result in non-vigilance based on an unjustified assumption of satisfactory system state." This is part definition but also part substitution: Self-satisfaction takes the place of complacency and is assumed to speak for itself. There is no need to make explicit by which psychological mechanism self-satisfaction arises or how it produces nonvigilance.
It is in fact difficult to find real content on complacency in the human factors literature. The phenomenon is often described or mentioned in relation to some deviation or diversion from official guidance (people should coordinate, double-check, look, but they do not), which is both normativist and judgmental. The "unjustified assumption of satisfactory system state" in Parasuraman et al.'s (1993) definition is emblematic for human factors' understanding of work by reference to externally dictated norms.
If we want to understand complacency, the whole point is to analyze why the assumption of satisfactory system state is justified (not unjustified) by those who are making that assumption. If it were unjustified, and they knew that, they would not make the assumption and would consequently not become complacent. Saying that an assumption of satisfactory system state is unjustified (but people still keep making it—they must be motivationally deficient) does not explain much at all. None of the above examples really provide a definition of complacency. Instead, complacency is treated as self-evident (everybody knows what it means, right?) and thus it can be defined by substituting one label for another. The human factors literature equates complacency with many different labels, including boredom, overconfidence, contentment, unwarranted faith, overreliance, self-satisfaction, and even a low index of suspicion. So if we were to ask, "What do you mean by 'complacency'?," and the reply is, "Well, it is self-satisfaction," we can be expected to say, "Oh, of course, now I understand what you mean." But do we really? Explanation by substitution actually raises more questions than it answers. By failing to propose an articulated psychological mechanism responsible for the behavior observed, we are left to wonder: How is it that complacency produces vigilance decrements, or how is it that complacency leads to a loss of situation awareness? The explanation could be a decay of neurological connections, fluctuations in learning and motivation, or a conscious trade-off between competing goals in a changing environment. Such definitions, which begin to operationalize the large concept of complacency, suggest possible probes that a researcher could use to monitor for the target effect. But because none of the descriptions of complacency available today offer any such roads to insight, claims that complacency was at the heart of a sequence of events are immune to critique and falsification.
IMMUNITY AGAINST FALSIFICATION
Most philosophies of science rely on the empirical world as touchstone or ultimate arbiter (a reality check) for postulated theories. Following Popper's rejection of the inductive method in the empirical sciences, theories and hypotheses can only be deductively validated by means of falsifiability. This usually involves some form of empirical testing to look for exceptions to the postulated hypothesis, where the absence of contradictory evidence becomes corroboration of the theory. Falsification deals with the central weakness of the inductive method of verification, which, as pointed out by David Hume, requires an infinite number of confirming empirical demonstrations. Falsification, on the other hand, can work on the basis of only one empirical instance, which proves the theory wrong. As seen in chapter 3, this is of course a highly idealized, almost clinical conceptualization of the scientific enterprise. Yet, regardless, theories that do not permit falsification at all are highly suspect. The resistance of folk models against falsification is known as immunization. Folk models leave assertions about empirical reality underspecified, without a trace for others to follow or critique. For example, a senior training captain once asserted that cockpit discipline is compromised when any of the following attitudes are prevalent: arrogance, complacency, and overconfidence. Nobody can disagree because the assertion is underspecified and therefore immune against falsification. This is similar to psychoanalysts claiming that obsessive-compulsive disorders are the result of overly harsh toilet training that fixated the individual in the anal stage. In the same vein, if the question of "Where are we headed?" from one pilot to the other is interpreted as a loss of situation awareness (Aeronautica Civil, 1996), this claim is immune against falsification. The journey from context-specific behavior (people asking questions) to the postulated psychological mechanism (loss of situation awareness) is made in one big leap, leaving no trace for others to follow or critique. Current theories of situation awareness are not sufficiently articulated to be able to explain why asking questions about direction represents a loss of situation awareness. Some theories may superficially appear to have the characteristics of good scientific models, yet just below the surface they lack an articulated mechanism that is amenable to falsification. Although falsifiability may at first seem like a self-defeating criterion for scientific progress, the opposite is true: The most falsifiable models are usually also the most informative ones, in the sense that they make stronger and more demonstrable claims about reality. In other words, falsifiability and informativeness are two sides of the same coin.
Folk Models Versus Young and Promising Models
One risk in rejecting folk models is that the baby is thrown out with the bath water. In other words, there is the risk of rejecting even those models that may be able to generate useful empirical results, if only given the time and opportunity to do so. Indeed, the more articulated human factors constructs (such as decision making, diagnosis) are distinguished from the less articulated ones (situation awareness, complacency) in part by their maturity, by how long they have been around in the discipline. What opportunity should the younger ones receive before being rejected as unproductive?
The answer to this question hinges, once again, on falsifiability. Ideal progress in science is described as the succession of theories, each of which is more falsifiable (and thus more informative) than the one before it. Yet when we assess loss of situation awareness or complacency as more novel explanations of phenomena that were previously covered by other explanations, it is easy to see that falsifiability has actually decreased, rather than increased. Take as an example an automation-related accident that occurred in 1973, when situation awareness and automation-induced complacency did not yet exist. The aircraft in question was on approach in rapidly changing weather conditions. It was equipped with a slightly deficient flight director (a device on the central instrument panel showing the pilot where to go, based on an unseen variety of sensory inputs), which the captain of the airplane distrusted. The airplane struck a seawall bounding Boston's Logan Airport about 1 kilometer short of the runway and slightly to the side of it, killing all 89 people onboard. In its comment on the crash, the National Transportation Safety Board explained how an accumulation of discrepancies, none critical in themselves, can rapidly deteriorate into a high-risk situation without positive flight management. The first officer, who was flying, was preoccupied with the information presented by his flight-director systems, to the detriment of his attention to altitude, heading and airspeed control (NTSB, 1974). Today, both automation-induced complacency of the first officer and a loss of situation awareness of the entire crew could likely be cited under the causes of this crash. (Actually, that the same set of empirical phenomena can comfortably be grouped under either label—complacency or loss of situation awareness—is additional testimony to the undifferentiated and underspecified nature of these concepts.)
These supposed explanations (complacency, loss of situation awareness) were obviously not needed in 1974 to deal with this accident. The analysis left us instead with more detailed, more falsifiable, and more traceable assertions that linked features of the situation (e.g., an accumulation of discrepancies) with measurable or demonstrable aspects of human performance (diversion of attention to the flight director vs. other sources of data). The decrease in falsifiability that complacency and situation awareness represent as hypothetical contenders in explaining this crash is the inverse of scientific progress, and therefore argues for the rejection of such novel concepts.
OVERGENERALIZATION
The lack of specificity of folk models and the inability to falsify them contribute to their overgeneralization. One famous example of overgeneralization in psychology is the inverted-U curve, also known as the Yerkes-Dodson law. Ubiquitous in human factors textbooks, the inverted-U curve couples arousal with performance (without clearly stating any units of either arousal or performance), where a person's best performance is claimed to occur between too much arousal (or stress) and too little, tracing a sort of hyperbola. The original experiments were, however, neither about performance nor about arousal (Yerkes & Dodson, 1908). They were not even about humans. Examining "the relation between stimulus strength and habit formation," the researchers subjected laboratory rats to electrical shocks to see how quickly they decided to take a particular pathway versus another. The conclusion was that rats learn best (that is, they form habits most rapidly) at any but the highest or lowest shock. The results approximated an inverted U only with a most generous curve fitting, the x axis was never defined in psychological terms but in terms of shock strength, and even this was confounded: Yerkes and Dodson used different levels of shock which were too poorly calibrated to know how different they really were. The subsequent overgeneralization of the Yerkes-Dodson results (through no fault of their own, incidentally) has confounded stress and arousal, and after a century there is still little evidence that any kind of inverted-U relationship holds for stress (or arousal) and human performance. Overgeneralizations take narrow laboratory findings and apply them uncritically to any broad situation where behavioral particulars bear some prima-facie resemblance to the phenomenon that was investigated under controlled circumstances. Other examples of overgeneralization and overapplication include perceptual tunneling (putatively championed by the crew of an airliner that descended into the Everglades after its autopilot was inadvertently switched off) and the loss of effective Crew Resource Management (CRM) as major explanations of accidents (e.g., Aeronautica Civil, 1996). A most frequently quoted sequence of events with respect to CRM is the flight of an iced-up airliner from Washington National Airport in the winter of 1982 that ended shortly after takeoff on the 14th Street bridge and in the Potomac River. The basic cause of the accident is said to be the copilot's unassertive remarks about an irregular engine instrument reading (despite the fact that the copilot was known for his assertiveness).
This supposed explanation hides many other factors which might be more relevant, including air traffic control pressures, the controversy surrounding rejected takeoffs close to decision speed, the sensitivity of the aircraft type to icing and its pitch-up tendency with even a little ice on the slats (devices on the wing's leading edge that help it fly at slow speeds), and ambiguous engineering language in the airplane manual to describe the conditions for use of engine anti-ice. In an effort to explain complex behavior, and still make a connection to the applied worlds to which it owes its existence, transportation human factors may be doing itself a disservice by inventing and uncritically using folk models. If we use models that do not articulate the performance measures that can be used in the particular contexts that we want to speak about, we can make no progress in better understanding the sources of success and failure in our operational environments.
Chapter 7 Why Don't They Follow the Procedures?
People do not always follow procedures. We can easily observe this when watching people at work, and managers, supervisors and regulators (or anybody else responsible for safe outcomes of work) often consider it to be a large practical problem. In chapter 6 we saw how complacency would be a very unsatisfactory label for explaining practical drift away from written guidance. But what lies behind it then? In hindsight, after a mishap, rule violations seem to play such a dominant causal role. If only they had followed the procedure! Studies keep returning the basic finding that procedure violations precede accidents. For example, an analysis carried out for an aircraft manufacturer identified "pilot
deviation from basic operational procedure" as a primary factor in almost 100 accidents (Lautman & Gallimore, 1987, p. 2). One methodological problem with such work is that it selects its cases on the dependent variable (the accident), thereby generating tautologies rather than findings. But performance variations, especially those at odds with written guidance, easily get overestimated for their role in the sequence of events: The interpretation of what happened may then be distorted by naturalistic biases to overestimate the possible causal role of unofficial action or procedural violation. . . . While it is possible to show that violations of procedures are involved in many safety events, many violations of procedures are not, and indeed some violations (strictly interpreted) appear to represent more effective ways of working. (McDonald, Corrigan, & Ward, 2002, pp. 3-5) As seen in chapter 4, hindsight turns complex, tangled histories laced with uncertainty and pressure into neat, linear anecdotes with obvious choices. What look like violations from the outside and in hindsight are often actions that make sense given the pressures and trade-offs that exist on the inside of real work. Finding procedure violations as causes or contributors to mishaps, in other words, says more about us, and the biases we introduce when looking back on a sequence of events, than it does about people who were doing actual work at the time. Yet if procedure violations are judged to be such a large ingredient of mishaps, then it can be tempting, in the wake of failure, to introduce even more procedures, or to change existing ones, or to enforce stricter compliance. For example, shortly after a fatal shootdown of two U.S. Black Hawk helicopters over Northern Iraq by U.S. fighter jets, "higher headquarters in Europe dispatched a sweeping set of rules in documents several inches thick to 'absolutely guarantee' that whatever caused this tragedy would never happen again" (Snook, 2000, p. 201). It is a common, but not typically satisfactory, reaction. Introducing more procedures does not necessarily avoid the next incident, nor do exhortations to follow rules more carefully necessarily increase compliance or enhance safety. In the end, a mismatch between procedures and practice is not unique to accident sequences. Not following procedures does not necessarily lead to trouble, and safe outcomes may be preceded by just as many procedural deviations as accidents are.
PROCEDURE APPLICATION AS RULE-FOLLOWING
When rules are violated, are these bad people ignoring the rules? Or are these bad rules, ill matched to the demands of real work? To be sure, procedures, with the aim of standardization, can play an important role in shaping safe practice. Commercial aviation is often held up as a prime example of the powerful effect of standardization on safety. But there is a deeper, more complex dynamic where real practice is continually adrift from official written guidance, settling at times, unsettled and shifting at others. There is a deeper, more complex interplay whereby practice sometimes precedes and defines the rules rather than being defined by them. In those cases, is a violation an expression of defiance, or an expression of compliance—people following practical rules rather than official, impractical ones? These possibilities lie between two opposing models of what procedures mean, and what they in turn mean for safety.
These models of procedures guide how organizations think about making progress on safety. The first model is based on the notion that not following procedures can lead to unsafe situations. These are its premises:
• Procedures represent the best thought-out, and thus the safest, way to carry out a job.
• Procedure following is mostly simple IF-THEN rule-based mental activity: IF this situation occurs, THEN this algorithm (e.g., checklist) applies.
• Safety results from people following procedures.
• For progress on safety, organizations must invest in people's knowledge of procedures and ensure that procedures are followed.
In this idea of procedures, those who violate them are often depicted as putting themselves above the law. These people may think that rules and procedures are made for others, but not for them, as they know how to really do the job. This idea of rules and procedures suggests that there is something exceptionalist or misguidedly elitist about those who choose not to follow the rules. After a maintenance-related mishap, for example, investigators found that "the engineers who carried out the flap change demonstrated a willingness to work around difficulties without reference to the design authority, including situations where compliance with the maintenance manual could not be achieved" (Joint Aviation Authorities, 2001). The engineers demonstrated a "willingness." Such terminology embodies notions of volition (the engineers had a free choice either to comply or not) and full rationality (they knew what they were doing). They violated willingly. Violators are wrong, because rules and procedures prescribe the best, safest way to do a job, independent of who does that job. Rules and procedures are for everyone. Such characterizations are naive at best, and always misleading. If you know where to look, daily practice is testimony to the ambiguity of procedures, and evidence that procedures are a rather problematic category of human work. First, real work takes place in a context of limited resources and multiple goals and pressures. Procedures assume that there is time to do them in, certainty (of what the situation is), and sufficient information available (e.g., about whether tasks are accomplished according to the procedure). This already keeps rules at a distance from actual tasks, because real work seldom meets those criteria. Work-to-rule strikes show how it can be impossible to follow the rules and get the job done at the same time. Aviation line maintenance is emblematic: A job-perception gap exists where supervisors are convinced that safety and success result from mechanics following procedures—a sign-off means that applicable procedures were followed. But mechanics may encounter problems for which the right tools or parts are not at hand; the aircraft may be parked far away from base. Or there may be too little time: Aircraft with a considerable number of problems may have to be turned around for the next flight within half an hour. Mechanics, consequently, see success as the result of their evolved skills at adapting, inventing, compromising, and improvising in the face of local pressures and challenges on the line—a sign-off means the job was accomplished in spite of resource limitations, organizational dilemmas, and pressures. Those mechanics who are most adept are valued for their productive capacity even by higher organizational levels. Unacknowledged by those levels, though, are the vast informal work systems that develop so mechanics can get work done, advance their skills at improvising and satisficing, impart them to one another, and condense them in unofficial, self-made documentation (McDonald et al., 2002). Seen from the outside, a defining characteristic of such informal work systems would be routine nonconformity. But from the inside, the same behavior is a mark of expertise, fueled by professional and interpeer pride.
And of course, informal work systems emerge and thrive in the first place because procedures are inadequate to cope with local challenges and surprises, and because procedures' conception of work collides with the scarcity, pressure and multiple goals of real work. Some of the safest complex, dynamic work not only occurs despite the procedures—such as aircraft line maintenance—but without procedures altogether. Rochlin et al. (1987, p. 79), commenting on the introduction of ever heavier and more capable aircraft onto naval aircraft carriers, noted that "there were no books on the integration of this new hardware into existing routines and no other place to practice it but at sea. Moreover, little of the process was written down, so that the ship in operation is the only reliable
manual." Work is "neither standardized across ships nor, in fact, written down systematically and formally anywhere." Yet naval aircraft carriers, with inherent high-risk operations, have a remarkable safety record, like other so-called high-reliability organizations (Rochlin, 1999; Rochlin, LaPorte, & Roberts, 1987). Documentation cannot present any close relationship to situated action because of the unlimited uncertainty and ambiguity involved in the activity. Especially where normal work mirrors the uncertainty and criticality of emergencies, rules emerge from practice and experience rather than preceding it. Procedures, in other words, end up following work instead of specifying action beforehand. Human factors has so far been unable to trace and model such coevolution of human and system, of work and rules. Instead, it has typically imposed a mechanistic, static view of one best practice from the top down. Procedure-following can also be antithetical to safety. In the 1949 U.S. Mann Gulch disaster, firefighters who perished were the ones sticking to the organizational mandate to carry their tools everywhere (Weick, 1993). In this case, as in others (e.g., Carley, 1999), people faced the choice between following the procedure or surviving.
Procedures Are Limited in Rationalizing Human Work
This, then, is the tension. Procedures are seen as an investment in safety—but it turns out that they not always are. Procedures are thought to be required to achieve safe practice—yet they are not always necessary, nor likely ever sufficient for creating safety. Procedures spell out how to do the job safely—yet following all the procedures can lead to an inability to get the job done. Though a considerable practical problem, such tensions are underreported and underanalyzed in the human factors literature. There is always a distance between a written rule and an actual task. This distance needs to be bridged; the gap must be closed, and the only thing that can close it is human interpretation and application. Ethnographer Ed Hutchins has pointed out how procedures are not just externalized cognitive tasks (Wright & McCarthy, 2003). Externalizing a cognitive task would transplant it from the head to the world, for example onto a checklist. Rather, following a procedure requires cognitive tasks that are not specified in the procedure; transforming the written procedure into activity requires cognitive work. Procedures are inevitably incomplete specifications of action: They contain abstract descriptions of objects and actions that relate only loosely to particular objects and actions that are encountered in the actual situation (Suchman, 1987). Take as an example the lubrication of the jackscrew on MD-80s from chapter 2—something that was done incompletely and at increasingly greater intervals before the crash of Alaska 261. This is part of the written procedure that describes how the lubrication work should be done (NTSB, 2002, pp. 29-30):
A. Open access doors 6307, 6308, 6306 and 6309
B. Lube per the following . . . 3. JACKSCREW Apply light coat of grease to threads, then operate mechanism through full range of travel to distribute lubricant over length of jackscrew.
C. Close doors 6307, 6308, 6306 and 6309
This leaves a lot to the imagination, or to the mechanic's initiative. How much is a "light" coat? Do you apply the grease with a brush (if a "light coat" is what you need), or do you pump it onto the parts directly with the grease gun?
How often should the mechanism (jackscrew plus nut) be operated through its full range of travel during the lubrication procedure? None of this is specified in the written guidance. It is little wonder that: Investigators observed that different methods were used by maintenance personnel to accomplish certain steps in the lubrication procedure, including the manner in which grease was applied to the acme nut fitting and the acme screw and the number of times the trim system was cycled to distribute the
grease immediately after its application. (NTSB, 2002, p. 116) In addition, actually carrying out the work is difficult enough. As noted in chapter 2, the access panels of the horizontal stabilizer were just large enough to allow a hand through, which would then block the view of anything that went on inside. As a mechanic, you can either look at what you have to do or what you have just done, or actually do it. You cannot do both at the same time, because the access doors are too small. This makes judgments about how well the work is being done rather difficult. The investigation discovered as much when they interviewed the mechanic responsible for the last lubrication of the accident airplane: "When asked how he determined whether the lubrication was being accomplished properly and when to stop pumping the grease gun, the mechanic responded, 'I don't' " (NTSB, 2002, p. 31). The time the lubrication procedure took was also unclear, as there was ambiguity about which steps were included in the procedure. Where does the procedure begin and where does it end, after access has been created to the area, or before? And is closing the panels part of it as well, as far as time estimates are concerned? Having heard that the entire lubrication process takes "a couple of hours," investigators learned from the mechanic of the accident airplane that: the lubrication task took "roughly. . . probably an hour" to accomplish. It was not entirely clear from his testimony whether he was including removal of the access panels in his estimate. When asked whether his 1-hour estimate included gaining access to the area, he replied, "No, that would probably take a little—well, you've got probably a dozen screws to take out of the one panel, so that's—I wouldn't think any more than an hour." The questioner then stated, "including access?" and the mechanic responded, "Yeah." (NTSB, 2002, p. 32) As the procedure for lubricating the MD-80 jackscrew indicates, and McDonald et al. (2002) remind us, formal documentation cannot be relied on, nor is it normally available in a way which supports a close relationship to action. There is a distinction between universalistic and particularistic rules: Universalistic rules are very general prescriptions (e.g., "Apply light coat of grease to threads"), but remain at a distance from their actual application. In fact, all universalistic rules or general prescriptions develop into particularistic rules as experience accumulates. With experience, people encounter the conditions under which universalistic rules need to be applied, and become increasingly able to specify those conditions. As a result, universalistic rules assume appropriate local expressions through practice. Wright and McCarthy (2003) have pointed out that procedures come out of the scientific management tradition, where their main purpose was a minimization of human variability, maximization of predictability, a rationalization of work. Aviation contains a strong heritage: Procedures in commercial aviation represent and allow a routinization that makes it possible to conduct safety-critical work with perfect strangers. Procedures are a substitute for knowing coworkers. The actions of a copilot are predictable not because the copilot is known (in fact, you may never have flown with him or her), but because the procedures make them predictable. Without such standardization it would be impossible to cooperate safely and smoothly with unknown people.
In the spirit of scientific management, human factors also assumes that order and stability in operational systems are achieved rationally, mechanistically, and that control is implemented vertically (e.g., through task analyses that produce prescriptions of work to be carried out). In addition, the strong influence of information-processing psychology on human factors has reinforced the idea of procedures as IF-THEN rule following, where procedures are akin to a program in a computer that in turn serves as input signals to the human information processor. The algorithm specified by the procedure becomes the software on which the human processor runs.
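To make that information-processing picture concrete, here is a minimal sketch, in Python, of what this first model imagines procedure following to be: a set of IF-THEN rules executed against the current state of the world. The sketch is not from the book or from any aircraft documentation; the checklist items, state variables, and threshold values are invented purely for illustration.

# A minimal sketch (not from the book) of the "procedure as program" view:
# a before-landing checklist reduced to IF-THEN rules that fire on
# predetermined conditions. Item names, thresholds, and state variables
# are hypothetical and chosen only to illustrate the model.

def before_landing_checklist(state):
    """Return the actions a naive rule-based model would demand right now."""
    actions = []
    # IF below 2,000 ft and descending, THEN the landing gear must come down.
    if state["altitude_ft"] < 2000 and state["descending"]:
        actions.append("extend landing gear")
    # IF airspeed is below the (hypothetical) flap-limit speed, THEN set landing flaps.
    if state["airspeed_kt"] < 170:
        actions.append("set landing flaps")
    # IF hydraulic pressure is low, THEN switch on the auxiliary pump.
    if state["hydraulic_psi"] < 2800:
        actions.append("select auxiliary hydraulic pump ON")
    return actions

# The rules fire mechanically, with no sense of timing, workload, or
# competing priorities on a particular approach.
print(before_landing_checklist(
    {"altitude_ft": 1500, "descending": True, "airspeed_kt": 160, "hydraulic_psi": 2600}
))

Written this way, checklist use looks trivially automatable; that is precisely the intuition the rest of the chapter questions.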
But it is not that simple. Following procedures in the sense of applying them in practice requires more intelligence. It requires additional cognitive work. This brings us to the second model of procedures and safety.
PROCEDURE APPLICATION AS SUBSTANTIVE COGNITIVE ACTIVITY
People at work must interpret procedures with respect to a collection of actions and circumstances that the procedures themselves can never fully specify (e.g., Suchman, 1987). In other words, procedures are not the work itself. Work, especially that in complex, dynamic workplaces, often requires subtle, local judgments with regard to timing of subtasks, relevance, importance, prioritization, and so forth. For example, there is no technical reason why a before-landing checklist in a commercial aircraft could not be automated. The kinds of items on such a checklist (e.g., hydraulic pumps, gear, flaps) are mostly mechanical and could be activated on the basis of predetermined logic without having to rely on, or constantly remind, a human to do so. Yet no before-landing checklist is fully automated today. The reason is that approaches for landing differ—they can differ in terms of timing, workload, or other priorities. Indeed, the reason is that the checklist is not the job itself. The checklist is, to repeat Suchman, a resource for action; it is one way for people to help structure activities across roughly similar yet subtly different situations. Variability in this is inevitable. Circumstances change, or are not as foreseen by those who designed the procedures. Safety, then, is not the result of rote rule following; it is the result of people's insight into the features of situations that demand certain actions, and people being skillful at finding and using a variety of resources (including written guidance) to accomplish their goals. This suggests a second model of procedures and safety:
• Procedures are resources for action. Procedures do not specify all circumstances to which they apply. Procedures cannot dictate their own application.
• Applying procedures successfully across situations can be a substantive and skillful cognitive activity.
• Procedures cannot, in themselves, guarantee safety. Safety results from people being skillful at judging when and how (and when not) to adapt procedures to local circumstances.
• For progress on safety, organizations must monitor and understand the reasons behind the gap between procedures and practice. Additionally, organizations must develop ways that support people's skill at judging when and how to adapt.
Procedures and Unusual Situations
Although there is always a distance between the logics dictated in written guidance and real actions to be taken in the world, prespecified guidance is especially inadequate in the face of novelty and uncertainty. Adapting procedures to fit unusual circumstances is a substantive cognitive activity. Take for instance the crash of a large passenger aircraft near Halifax, Nova Scotia, in 1998. After an uneventful departure, a burning smell was detected and, not much later, smoke was reported inside the cockpit. Carley (1999) characterized the two pilots as respective embodiments of the models of procedures and safety: The copilot preferred a rapid descent and suggested dumping fuel early so that the aircraft would not be too heavy to land. But the captain told the copilot, who was flying the plane, not to descend too fast, and insisted they cover applicable procedures (checklists) for dealing with smoke and fire.
The captain delayed a decision on dumping fuel. With the fire developing, the aircraft became uncontrollable and crashed into the sea, taking all 229 lives onboard with it. There were many good reasons for not immediately diverting to Halifax: Neither pilot was familiar with the airport, they would have to fly an approach procedure that they were not very proficient at, applicable charts and information on the airport were
not easily available, and an extensive meal service had just been started in the cabin. Yet, part of the example illustrates a fundamental double bind for those who encounter surprise and have to apply procedures in practice (Woods & Shattuck, 2000):
• If rote rule following persists in the face of cues that suggest procedures should be adapted, this may lead to unsafe outcomes. People can get blamed for their inflexibility, their application of rules without sensitivity to context.
• If adaptations to unanticipated conditions are attempted without complete knowledge of circumstance or certainty of outcome, unsafe results may occur too. In this case, people get blamed for their deviations, their nonadherence.
In other words, people can fail to adapt, or attempt adaptations that may fail. Rule following can become a desynchronized and increasingly irrelevant activity, decoupled from how events and breakdowns are really unfolding and multiplying throughout a system. In the Halifax crash, as is often the case, there was uncertainty about the very need for adaptations (How badly ailing was the aircraft, really?) as well as uncertainty about the effect and safety of adapting: How much time would the crew have to change their plans? Could they skip fuel dumping and still attempt a landing? Potential adaptations, and the ability to project their potential for success, were not necessarily supported by specific training or overall professional indoctrination. Civil aviation, after all, tends to emphasize the first model: Stick with procedures and you will most likely be safe (e.g., Lautman & Gallimore, 1987). Tightening procedural adherence, through threats of punishment or other supervisory interventions, does not remove the double bind. In fact, it may tighten the double bind—making it more difficult for people to develop judgment of how and when to adapt. Increasing the pressure to comply increases the probability of failures to adapt—compelling people to adopt a more conservative response criterion. People will require more evidence for the need to adapt, which takes time, and time may be scarce in cases that call for adaptation (as in the aforementioned case). Merely stressing the importance of following procedures can increase the number of cases in which people fail to adapt in the face of surprise. Letting people adapt without adequate skill or preparation, on the other hand, can increase the number of failed adaptations. One way out of the double bind is to develop people's skill at adapting. This means giving them the ability to balance the risks between the two possible types of failure: failing to adapt or attempting adaptations that may fail. It requires the development of judgment about local conditions and the opportunities and risks they present, as well as an awareness of larger goals and constraints that operate on the situation. Development of this skill could be construed, to paraphrase Rochlin (1999), as planning for surprise. Indeed, as Rochlin (p. 1549) observed, the culture of safety in high-reliability organizations anticipates and plans for possible failures in "the continuing expectation of future surprise." Progress on safety also hinges on how an organization responds in the wake of failure (or even the threat of failure). Post-mortems can quickly reveal a gap between procedures and local practice, and hindsight inflates the causal role played by unofficial action (McDonald et al., 2002).
The response, then, is often to try to forcibly close the gap between procedures and practice, by issuing more procedures or policing practice more closely. The role of informal patterns of behavior, and what they represent (e.g., resource constraints, organizational deficiencies or managerial ignorance, countervailing goals, peer pressure, professionalism and perhaps even better ways of working), all go misunderstood. Real practice, as done in the vast informal work systems, is driven and kept underground. Even though failures
offer each sociotechnical system an opportunity for critical self-examination, accident stories are developed in which procedural deviations play a major, evil role, and are branded as deviant and causal. The official reading of how the system works or is supposed to work is once again re-invented: Rules mean safety, and people should follow them. High-reliability organizations, in contrast, distinguish themselves by their constant investment in trying to monitor and understand the gap between procedures and practice. The common reflex is not to try to close the gap, but to understand why it exists. Such understanding provides insight into the grounds for informal patterns of activity and opens ways to improve safety by sensitivity to people's local operational context.
The Regulator: From Police to Partner
That there is always a tension between centralized guidance and local practice creates a clear dilemma for those tasked with regulating safety-critical industries. The dominant regulatory instrument consists of rules and checking that those rules are followed. But forcing operational people to stick to rules can lead to ineffective, unproductive or even unsafe local actions. For various jobs, following the rules and getting the task done are mutually exclusive. On the other hand, letting people adapt their local practice in the face of pragmatic demands can make them sacrifice global system goals or miss other constraints or vulnerabilities that operate on the system. Helping people solve this fundamental trade-off is not a matter of pushing the criterion one way or the other. Discouraging people's attempts at adaptation can increase the number of failures to adapt in situations where adaptation was necessary. Allowing procedural leeway without encouraging organizations to invest in people's skills at adapting, on the other hand, can increase the number of failed attempts at adaptation. This means that the gap between rule and task, between written procedure and actual job, needs to be bridged by the regulator as much as by the operator. Inspectors who work for regulators need to apply rules as well: find out what exactly the rules mean and what their implications are when imposed on a field of practice. The development from universalism to particularism applies to regulators too. This raises questions about the role that inspectors should play. Should they function as police—checking to what extent the market is abiding by the laws they are supposed to uphold? In that case, should they apply a black-and-white judgment (which would ground a number of companies immediately)? Or, if there is a gap between procedure and practice that inspectors and operators share and both need to bridge, can inspectors be partners in joint efforts toward progress on safety? The latter role is one that can only develop in good faith, though such good faith may be the very by-product of the development of a new kind of relationship, or partnership, towards progress on safety. Mismatches between rules and practice are no longer seen as the logical conclusion of an inspection, but rather as the starting point, the beginning of joint discoveries about real practice and the context in which it occurs. What are the systemic reasons (organizational, regulatory, resource related) that help create and sustain the mismatch?
The basic criticism of an inspector's role as partner is easy to anticipate: Regulators should not come too close to the ones they regulate, lest their relationship become too cozy and objective judgment of performance against safety criteria become impossible. But regulators need to come close to those they regulate in any case. Regulators (or their inspectors) need to be insiders in the sense of speaking the language of the organization they inspect, understanding the kind of business they are in, in order to gain the respect and credibility of the informants they need most. At the same time, regulators need to be outsiders—resisting getting integrated into the worldview of the one they regulate. Once on the inside of that system and its worldview, it may be increasingly difficult to discover the potential drift into failure. What is normal to the operator is normal to the inspector.
The tension between having to be an insider and an outsider at the same time is difficult to resolve. The conflictual, adversarial model of safety regulation has in many cases not proven productive. It leads to window dressing and posturing on the part of the operator during inspections, and secrecy and obfuscation of safety- and work-related information at all other times. As airline maintenance testifies, real practice is easily driven underground. Even for regulators who apply their power as police rather than as partner, the struggle of having to be insider and outsider at the same time is not automatically resolved. Issues of access to information (the relevant information about how people really do their work, even when the inspector is not there) and inspector credibility demand that there be a relationship between regulator and operator that allows such access and credibility to develop. Organizations (including regulators) who wish to make progress on safety with procedures need to:
• Monitor the gap between procedure and practice and try to understand why it exists (and resist trying to close it by simply telling people to comply).
• Help people develop skills to judge when and how to adapt (and resist only telling people they should follow procedures).
But many organizations or industries do neither. They may not even know, or want to know (or be able to afford to know) about the gap. Take aircraft maintenance again. A variety of workplace factors (communication problems, physical or hierarchical distance, industrial relations) obscure the gap. For example, continued safe outcomes of existing practice give supervisors no reason to question their assumptions about how work is done (if they are safe they must be following procedures down there). There is wider industry ignorance, however (McDonald et al., 2002). In the wake of failure, informal work systems typically retreat from view, gliding out of investigators' reach. What goes misunderstood, or unnoticed, is that informal work systems compensate for the organization's inability to provide the basic resources (e.g., time, tools, documentation with a close relationship to action) needed for task performance. Satisfied that violators got caught and that formal prescriptions of work were once again amplified, the organizational system changes little or nothing. It completes another cycle of stability, typified by a stagnation of organizational learning and no progress on safety (McDonald et al.).
GOAL CONFLICTS AND PROCEDURAL DEVIANCE
As discussed in chapter 2, a major engine behind routine divergence from written guidance is the need to pursue multiple goals simultaneously. Multiple goals mean goal conflicts. As Dorner (1989) remarked, "Contradictory goals are the rule, not the exception, in complex situations" (p. 65). In a study of flight dispatchers, for example, Smith (2001) illustrated the basic dilemma. Would bad weather hit a major hub airport or not? What should the dispatchers do with all the airplanes en route? Safety (by making aircraft divert widely around the weather) would be a pursuit that "tolerates a false alarm but deplores a miss" (p. 361). In other words, if safety is the major goal, then making all the airplanes divert even if the weather would not end up at the hub (a false alarm) is much better than not making them divert and sending them headlong into bad weather (a miss).
Efficiency, on the other hand, severely discourages the false alarm, whereas it can actually deal with a miss. As discussed in chapter 2, this is the essence of most operational systems. Though safety is a (stated) priority, these systems do not exist to be safe. They exist to provide a service or product, to achieve economic gain, to maximize capacity utilization. But still they have to be safe. One starting point, then, for understanding a driver behind routine deviations, is to look deeper into these goal interactions, these basic incompatibilities in what people need to strive for in their work.
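To make the asymmetry concrete, here is a minimal sketch of the dispatcher's dilemma as an expected-cost comparison. The numbers and the cost function are invented for illustration and are not taken from Smith (2001); they only show how the seemingly rational choice flips depending on which goal structure supplies the costs.

```python
# Illustrative only: invented costs, not data or a model from Smith (2001).
# Decision: divert en-route aircraft around forecast weather at a hub, or not.

def expected_cost(p_weather, cost_false_alarm, cost_miss):
    """Expected cost of each action, given the probability that the weather hits."""
    divert = (1 - p_weather) * cost_false_alarm  # diverted, but the weather never came
    dont_divert = p_weather * cost_miss          # did not divert, and the weather hit
    return divert, dont_divert

p = 0.3  # assumed probability that the storm actually closes the hub

# Safety-weighted costs: a miss (sending aircraft into the weather) is far worse
# than a false alarm (needless diversions).
print(expected_cost(p, cost_false_alarm=1, cost_miss=50))   # roughly (0.7, 15.0): divert

# Efficiency-weighted costs: false alarms (fuel, delays, missed connections) loom
# larger, and a miss is assumed to be recoverable.
print(expected_cost(p, cost_false_alarm=10, cost_miss=5))   # roughly (7.0, 1.5): do not divert
```

The same forecast thus yields opposite answers depending on which costs are in force, and operational people are asked to carry both sets of costs at once.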
Of particular interest is how people themselves view these conflicts from inside their operational reality, and how this contrasts with management (and regulator) views of the same activities. NASA's "Faster, Better, Cheaper" organizational philosophy in the late 1990s epitomized how multiple, contradictory goals are simultaneously present and active in complex systems. The losses of the Mars Climate Orbiter and the Mars Polar Lander in 1999 were ascribed in large part to the irreconcilability of the three goals (faster and better and cheaper), which drove down the cost of launches, made for shorter, aggressive mission schedules, eroded personnel skills and peer interaction, limited time, reduced the workforce, and lowered the level of checks and balances normally found (National Aeronautics and Space Administration, 2000). People argued that NASA should pick any two from the three goals. Faster and cheaper would not mean better. Better and cheaper would mean slower. Faster and better would be more expensive. Such reduction, however, obscures the actual reality facing operational personnel in safety-critical settings. These people are there to pursue all three goals simultaneously—fine-tuning their operation, as Starbuck and Milliken (1988) said, to "render it less redundant, more efficient, more profitable, cheaper, or more versatile" (p. 323), fine-tuning, in other words, to make it faster, better, cheaper. The 2003 Space Shuttle Columbia accident focused attention on the maintenance work that was done on the Shuttle's external fuel tank, once again revealing the differential pressures of having to be safe and getting the job done (better, but also faster and cheaper). A mechanic working for the contractor, whose task it was to apply the insulating foam to the external fuel tank, testified that it took just a couple of weeks to learn how to get the job done, thereby pleasing upper management and meeting production schedules. An older worker soon showed him how he could mix the base chemicals of the foam in a cup and brush it over scratches and gouges in the insulation, without reporting the repair. The mechanic soon found himself doing this hundreds of times, each time without filling out the required paperwork. Scratches and gouges that were brushed over with the mixture from the cup basically did not exist as far as the organization was concerned. And those that did not exist could not hold up the production schedule for the external fuel tanks. Inspectors often did not check. A company program that once had paid workers hundreds of dollars for finding defects had been watered down, virtually inverted by incentives for getting the job done now. Goal interactions are critical in such experiences, which contain all the ingredients of procedural fluidity, maintenance pressure, the meaning of incidents worth reporting, and their connections to drift into failure. As in most operational work, the distance between formal, externally dictated logics of action and actual work is bridged with the help of those who have been there before, who have learned how to get the job done (without apparent safety consequences), and who are proud to share their professional experience with younger, newer workers. Actual practice by newcomers settles at a distance from the formal description of the job. Deviance becomes routinized.
This is part of the vast informal networks characterizing much maintenance work, including informal hierarchies of teachers and apprentices, informal documentation of how to actually get work done, informal procedures and tasks, and informal teaching practices. Inspectors did not check, did not know, or did not report. Managers were happy that production schedules were met and happy that fewer defects were being discovered—normal people doing normal work in a normal organization. Or that is how it seemed to everybody at the time. Once again, the notion of an incident, of something that was worthy of reporting (a defect), got blurred against a background of routine nonconformity. What was normal versus what was deviant was no longer so clear. Goal conflicts between safer, better, and cheaper were reconciled by doing the work more cheaply, superficially better (brushing over gouges), and apparently without cost to safety. As long as orbiters kept coming
back safely, the contractor must have been doing something right. Understanding the potential side effects was very difficult given the historical mission success rate. Lack of failures was seen as a validation that current strategies to prevent hazards were sufficient. Could anyone foresee, in a vastly complex system, how local actions as trivial as brushing chemicals from a cup could one day align with other factors to push the system over the edge? Recall from chapter 2: What cannot be believed cannot be seen. Past success was taken as guarantee of continued safety.
The Internalization of External Pressure
Some organizations pass on their goal conflicts to individual practitioners quite openly. Some airlines, for example, pay their crews a bonus for on-time performance. An aviation publication commented on one of those operators (a new airline called Excel, flying from England to holiday destinations): "As part of its punctuality drive, Excel has introduced a bonus scheme to give employees a bonus should they reach the agreed target for the year. The aim of this is to focus everyone's attention on keeping the aircraft on schedule" (Airliner World, 2001, p. 79). Such plain acknowledgment of goal priorities, however, is not common. Most important goal conflicts are never made so explicit, arising rather from multiple irreconcilable directives from different levels and sources, from subtle and tacit pressures, from management or customer reactions to particular trade-offs. Organizations often resort to "conceptual integration, or plainly put, doublespeak" (Dorner, 1989, p. 68). For example, the operating manual of another airline opens by stating that "(1) our flights shall be safe; (2) our flights shall be punctual; (3) our customers will find value for money." Conceptually, this is Dorner's (1989) doublespeak, documentary integration of incompatibles. It is impossible, in principle, to do all three simultaneously, as with NASA's faster, better, cheaper. Whereas incompatible goals arise at the level of an organization and its interaction with its environment, the actual managing of goal conflicts under uncertainty gets pushed down into local operating units—control rooms, cockpits, and the like. There the conflicts are to be negotiated and resolved in the form of thousands of little and larger daily decisions and trade-offs. These are no longer decisions and trade-offs made by the organization, but by individual operators or crews. It is in this insidious delegation, this hand-over, that the internalization of external pressure takes place. Crews of one airline describe their ability to negotiate these multiple goals while under the pressure of limited resources as "the blue feeling" (referring to the dominant color of their fleet). This feeling represents the willingness and ability to put in the work to actually deliver on all three goals simultaneously (safety, punctuality, and value for money). This would confirm that practitioners do pursue incompatible goals of faster, better, and cheaper all at the same time and are aware of it too. In fact, practitioners take their ability to reconcile the irreconcilable as a source of considerable professional pride. It is seen as a strong sign of their expertise and competence. The internalization of external pressure, this usurpation of organizational goal conflicts by individual crews or operators, is not well described or modeled yet.
This, again, is a question about the dynamics of the macro-micro connection that we saw in chapter 2. How is it that a global tension between efficiency and safety seeps into local decisions and trade-offs by individual people or groups? These macrostructural forces, which operate on an entire company, find their most prominent expression in how local work groups make assessments about opportunities and risks (see also Vaughan, 1996). Institutional pressures are reproduced, or perhaps really manifested, in what individual people do, not by the organization as a whole. But how does this connection work? Where do external pressures become internal? When do the problems and interests of an organization under pressure of resource scarcity and competition become
the problems and interests of individual actors at several levels within that organization? The connection between external pressure and its internalization is relatively easy to demonstrate when an organization explicitly advertises how operators' pursuit of one goal will lead to individual rewards (a bonus scheme to keep everybody focused on the priority of schedule). But such cases are probably rare, and it is doubtful whether they represent actual internalization of a goal conflict. It becomes more difficult when the connection and the conflicts are more deeply buried in how operators transpose global organizational aims onto individual decisions. For example, the blue feeling signals aircrews' strong identification with their organization (which flies blue aircraft) and what it and its brand stand for (safety, reliability, value for money). Yet it is a feeling that only individuals or crews can have, a feeling because it is internalized. Insiders point out how some crews or commanders have the blue feeling whereas others do not. It is a personal attribute, not an organizational property. Those who do not have the blue feeling are marked by their peers—seldom supervisors—for their insensitivity to, or disinterest in, the multiplicity of goals, for their unwillingness to do the substantive cognitive work necessary to reconcile the irreconcilable. These practitioners do not reflect the corps' professional pride because they will always make the easiest goal win over the others (e.g., "Don't worry about customer service or capacity utilization, it's not my job"), choosing the path of least resistance and least work in the eyes of their peers. In the same airline, those who try to adhere to minute rules and regulations are called "Operating Manual worshippers"—a clear signal that their way of dealing with goal contradictions is not only perceived as cognitively cheap (just go back to the book, it will tell you what to do), but as hampering the collective ability to actually get the job done, diluting the blue feeling. The blue feeling, then, is also not just a personal attribute, but an interpeer commodity that affords comparisons, categorizations, and competition among members of the peer group, independent of other layers or levels in the organization. Similar interpeer pride and perception operate as a subtle engine behind the negotiation among different goals in other professions too, for example flight dispatchers, air-traffic controllers, or aircraft maintenance workers (McDonald et al., 2002). The latter group (aircraft maintenance) has incorporated even more internal mechanisms to deal with goal interactions. The demand to meet technical requirements clashes routinely with resource constraints such as inadequate time, personnel, tools, parts, or functional work environment (McDonald et al., 2002). The vast internal, sub-surface networks of routines, illegal documentation, and shortcuts, which from the outside would be seen as massive infringement of existing procedures, are a result of the pressure to reconcile and compromise. Actual work practices constitute the basis for technicians' strong professional pride and sense of responsibility for delivering safe work that exceeds even technical requirements. Seen from the inside, it is the role of the technician to apply judgment founded on his or her knowledge, experience, and skill—not on formal procedure.
Those most adept at this are highly valued for their productive capacity even by higher organizational levels. Yet upon formal scrutiny (e.g., an accident inquiry), informal networks and practices often retreat from view, yielding only a bare-bones version of work in which the nature of goal compromises and informal activities is never explicit, acknowledged, understood, or valued. Similar to the British Army on the Somme, management in some maintenance organizations occasionally decides (or pretends) that there is no local confusion, that there are no contradictions or surprises. In their official understanding, there are rules and people who follow the rules, and safe outcomes as a result. People who do not follow the rules are more prone to causing accidents, as the hindsight bias inevitably points out. To people on the work floor, in contrast, management
does not even understand the fluctuating pressures on their work, let alone the strategies necessary to accommodate those (McDonald et al.). Both cases (the blue feeling and maintenance work) challenge human factors' traditional reading of violations as deviant behavior. Human factors wants work to mirror prescriptive task analyses or rules, and violations breach vertical control implemented through such managerial or design directives. Seen from the inside of people's own work, however, violations become compliant behavior. Cultural understandings (e.g., expressed in notions of a blue feeling) affect interpretative work, so that even if people's behavior is objectively deviant, they will see their own conduct as conforming (Vaughan, 1999). Their behavior is compliant with the emerging, local, internalized ways to accommodate multiple goals important to the organization (maximizing capacity utilization but doing so safely, meeting technical requirements, but also deadlines). It is compliant, also, with a complex of peer pressures and professional expectations in which unofficial action yields better, quicker ways to do the job; in which unofficial action is a sign of competence and expertise; where unofficial action can override or outsmart hierarchical control and compensate for higher level organizational deficiencies or ignorance.
ROUTINE NONCONFORMITY
The gap between procedures and practice is not constant. After the creation of new work (e.g., through the introduction of new technology), time can go by before applied practice stabilizes, likely at a distance from the rules as written for the system on the shelf. Social science has characterized this migration from tightly coupled rules to more loosely coupled practice variously as "fine-tuning" (Starbuck & Milliken, 1988) or "practical drift" (Snook, 2000). Through this shift, applied practice becomes the pragmatic imperative; it settles into a system as normative. Deviance (from the original rules) becomes normalized; nonconformity becomes routine (Vaughan, 1996). The literature has identified important ingredients in the normalization of deviance, which can help organizations understand the nature of the gap between procedures and practice:
• Rules that are overdesigned (written for tightly coupled situations, for the worst case) do not match actual work most of the time. In real work, there is slack: time to recover, opportunity to reschedule and get the job done better or more smartly (Starbuck & Milliken). This mismatch creates an inherently unstable situation that generates pressure for change (Snook).
• Emphasis on local efficiency or cost effectiveness pushes operational people to achieve or prioritize one goal or a limited set of goals (e.g., customer service, punctuality, capacity utilization). Such goals are typically easily measurable (e.g., customer satisfaction, on-time performance), whereas it is much more difficult to measure how much is borrowed from safety.
• Past success is taken as guarantee of future safety. Each operational success achieved at incremental distances from the formal, original rules can establish a new norm. From here a subsequent departure is once again only a small incremental step (Vaughan). From the outside, such fine-tuning constitutes incremental experimentation in uncontrolled settings (Starbuck & Milliken)—on the inside, incremental nonconformity is an adaptive response to scarce resources, multiple goals, and often competition.
• Departures from the routine become routine. Seen from the inside of people's own work, violations become compliant behavior. They are compliant with the emerging, local ways to accommodate multiple goals important to the organization (maximizing capacity utilization but doing so safely; meeting technical requirements, but also deadlines). They are compliant, also, with a complex of peer pressures and professional expectations in which unofficial action yields better, quicker ways to do the job; in which unofficial action is a sign of competence and expertise; where unofficial action can override or outsmart hierarchical control and compensate for
higher level organizational deficiencies or ignorance.
Although a gap between procedures and practice always exists, there are different interpretations of what this gap means and what to do about it. As pointed out in chapter 6, human factors may see the gap between procedures and practice as a sign of complacency—operators' self-satisfaction with how safe their practice or their system is, or of a lack of discipline. Psychologists may see routine nonconformity as expressing a fundamental tension between multiple goals (production and safety) that pull workers in opposite directions: getting the job done but also staying safe. Others highlight the disconnect that exists between distant supervision or preparation of the work (as laid down in formal rules) on the one hand, and local, situated action on the other. Sociologists may see in the gap a political lever applied on management by the work floor, overriding or outsmarting hierarchical control and compensating for higher level organizational deficiencies or ignorance. To the ethnographer, routine nonconformity would be interesting not just because of what it says about the work or the work context, but because of what it says about what the work means to the operator. The distance between procedures and practice can create widely divergent images of work. Is routine nonconformity an expression of elitist operators who consider themselves to be above the law, of people who demonstrate a willingness to ignore the rules? Work in that case is about individual choices, supposedly informed choices between doing that work well or badly, between following the rules or not. Or is routine nonconformity a systematic by-product of the social organization of work, where it emerges from the interactions between organizational environment (scarcity and competition), internalized pressures, and the underspecified nature of written guidance? In that case, work is seen as fundamentally contextualized, constrained by environmental uncertainty and organizational characteristics, and influenced only to a small extent by individual choice. People's ability to balance these various pressures and influences on procedure following depends in large part on their history and experience. And, as Wright and McCarthy (2003) pointed out, there are currently very few ways in which this experience can be given a legitimate voice in the design of procedures. As chapter 8 shows, a more common way of responding to what is seen as human unreliability is to introduce more automation. Automation has no trouble following algorithms. In fact, it could not run without any. Yet such literalism can be a mixed blessing.
Chapter 8
Can We Automate Human Error Out of the System?
If people cannot be counted on to follow procedures, should we not simply marginalize human work? Can automation get rid of human unreliability and error? Automation extends our capabilities in many, if not all, transportation modes. In fact, automation is often presented and implemented precisely because it helps systems and people perform better. It may even make operational lives easier: reducing task load, increasing access to information, helping the prioritization of attention, providing reminders, doing work for us where we cannot. What about reducing human error? Many indeed have the expectation that automation will help reduce human error.
Just look at some of the evidence: All kinds of transport achieve higher navigation accuracy with satellite guidance; pilots are now able to circumvent pitfalls such as thunderstorms, windshear, mountains, and collisions with other aircraft; and situation awareness improves dramatically with the introduction of moving map displays. So with these benefits, can we automate human error out of the system? The thought behind the question is simple. If we automate part of a task, then the human does not carry out that part. And if the human does not carry out that part, there is no possibility of human error. As a result of this
logic, there was a time (and in some quarters there perhaps still is) that automating everything we technically could was considered the best idea. The Air Transport Association of America (ATA) observed, for example, that "during the 1970's and early 1980's . . . the concept of automating as much as possible was considered appropriate" (ATA, 1989, p. 4). It would lead to greater safety, greater capabilities, and other benefits.
NEW CAPABILITIES, NEW COMPLEXITIES
But, really, can we automate human error out of the system? There are problems. With new capabilities come new complexities. We cannot just automate part of a task and assume that the human-machine relationship remains unchanged. Though it may have shifted (with the human doing less and the machine doing more), there is still an interface between humans and technology. And the work that goes on at that interface has likely changed drastically. Increasing automation transforms hands-on operators into supervisory controllers, into managers of a suite of automated and other human resources. With their new work come new vulnerabilities, new error opportunities. With new interfaces (from pointers to pictures, from single-parameter gauges to computer displays) come new pathways to human-machine coordination breakdown. Transportation has witnessed the transformation of work by automation first-hand, and documented its consequences widely. Automation does not do away with what we typically call human error, just as (or precisely because) it does not do away with human work. There is still work to do for people. It is not that the same kinds of errors occur in automated systems as in manual systems. Automation changes the expression of expertise and error; it changes how people can perform well and changes how their performance breaks down, if and when it does. Automation also changes opportunities for error recovery (often not for the better) and in many cases delays the visible consequences of errors. New forms of coordination breakdowns and accidents have emerged as a result.
Data Overload
Automation does not replace human work. Instead, it changes the work it is designed to support. And with these changes come new burdens. Take system monitoring, for example. There are concerns that automation can create data overload. Rather than taking away cognitive burdens from people, automation introduces new ones, creating new types of monitoring and memory tasks. Because automation does so much, it also can show much (and indeed, there is much to show). If there is much to show, data overload can occur, especially in pressurized, high-workload, or unusual situations. Our ability to make sense of all the data generated by automation has not kept pace with systems' ability to collect, transmit, transform, and present data. But data overload is a pretty complex phenomenon, and there are different ways of looking at it (see Woods, Patterson, & Roth, 2002). For example, we can see it as a workload bottleneck problem. When people experience data overload, it is because of fundamental limits in their internal information-processing capabilities. If this is the characterization, then the solution lies in even more automation. More automation, after all, will take work away from people. And taking work away will reduce workload. One area where the workload-reduction solution to the data-overload problem has been applied is in the design of warning systems.
It is there that fears of data overload are often most prominent. Incidents in aviation and other transportation modes keep stressing the need for better support of human problem solving during dynamic fault scenarios. People complain of too much data, of illogical presentations, of warnings that interfere with other work, of a lack of order, and of no rhyme or reason to the way in which warnings are presented. Workload reduction during dynamic fault
management is so important because problem solvers in dynamic domains need to diagnose malfunctions while maintaining process integrity. Not only must failures be managed while keeping the process running (e.g., keeping the aircraft flying); their implications for the ability to keep the process running in the first place need to be understood and acted on. Keeping the process intact and diagnosing failures are interwoven cognitive demands in which timely understanding and intervention are often crucial. A fault in a dynamic process typically produces a cascade of disturbances or failures. Modern airliners and high-speed vessels have their systems tightly packed together because there is not much room onboard. Systems are also cross-linked in many intricate ways, with electronic interconnections increasingly common as a result of automation and computerization. This means that failures in one system quickly affect other systems, perhaps even along nonfunctional propagation paths. Failure crossover can occur simply because systems are located next to one another, not because they have anything functional in common. This may defy operator logic or knowledge. The status of single components or systems, then, may not be that interesting for an operator. In fact, it may be highly confusing. Rather, the operator must see, through a forest of seemingly disconnected failures, the structure of the problem so that a solution or countermeasure becomes evident. Also, given the dynamic process being managed, which issue should be addressed first? What are the postconditions of these failures for the remainder of operations (i.e., what is still operational, how far can I go, what do I need to reconfigure)? Is there any trend? Are there noteworthy events and changes in the monitored process right now? Will any of this get worse? These are the types of questions that are critical to answer in successful dynamic fault management. Current warning systems in commercial aircraft do not go far in answering these questions, something that is confirmed by pilots' assessments of these systems. For example, pilots comment on too much data, particularly all kinds of secondary and tertiary failures, with no logical order, and primary faults (root causes) that are rarely, if ever, highlighted. The representation is limited to message lists, something that we know hampers operators' visualization of the state of their system during dynamic failure scenarios. Yet not all warning systems are the same. Current warning systems show a range of automated support, from not doing much at all, through prioritizing and sorting warnings, to doing something about the failures, to doing most of the fault management and not showing much at all anymore. Which works best? Is there any merit to seeing data overload as a workload bottleneck problem, and do automated solutions help? An example of a warning system that basically shows everything that goes wrong inside an aircraft's systems, much in order of appearance, is that of the Boeing 767. Messages are presented chronologically (which may mean the primary fault appears somewhere in the middle or even at the bottom of the list) and failure severity is coded through color. A warning system that departs slightly from this baseline is for example the Saab 2000, which sorts the warnings by inhibiting messages that do not require pilot actions. It displays the remaining warnings chronologically.
The primary fault (if known) is placed at the top, however, and if a failure results in an automatic system reconfiguration, then this is shown too. The result is a shorter list than the Boeing's, with a primary fault at the top. Next as an example comes the Airbus A320, which has a fully defined logic for warning-message prioritization. Only one failure is shown at a time, together with immediate action items required of the pilot. Subsystem information can be displayed on demand. Primary faults are thus highlighted, together with guidance on how to deal with them. Finally, there is the MD-11, which has the highest degree of autonomy and can respond to failures without asking the pilot to do so. The only exceptions are nonreversible actions (e.g., an engine shutdown). For most failures, the system informs the pilot of system reconfiguration and presents system status. In addition, the system recognizes combinations of failures and gives a common name to these higher order failures (e.g., Dual Generator).
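The four designs just described can be read as successive amounts of filtering, ordering, and interpretation applied to the same raw stream of fault messages before it reaches the pilot. The sketch below is only a hypothetical illustration of that idea; the message format and the rules are invented for this example and do not reproduce any manufacturer's actual warning logic.

```python
# Hypothetical message records and display policies, meant only to illustrate the
# range of warning-system philosophies described above.
from dataclasses import dataclass, field
from typing import List

@dataclass
class FaultMessage:
    text: str
    severity: int                      # higher = more urgent
    primary: bool = False              # root cause of the cascade, if known
    needs_action: bool = True          # does the crew have to do anything?
    actions: List[str] = field(default_factory=list)

def chronological(msgs):
    """Baseline policy: show everything in order of appearance."""
    return list(msgs)

def inhibit_and_promote(msgs):
    """Suppress no-action messages and put the known primary fault on top."""
    shown = [m for m in msgs if m.needs_action]
    return sorted(shown, key=lambda m: not m.primary)  # primary first, rest keep order

def one_at_a_time(msgs):
    """Show only the single most important failure, plus its action items."""
    top = max(msgs, key=lambda m: (m.primary, m.severity))
    return [top], top.actions

# Invented cascade: a primary fault followed by secondary consequences.
cascade = [
    FaultMessage("GEN 1 FAULT", severity=3, primary=True, actions=["GEN 1 OFF"]),
    FaultMessage("AC BUS 1 LOST", severity=2),
    FaultMessage("GALLEY SHED", severity=1, needs_action=False),
]
print([m.text for m in chronological(cascade)])        # everything, in order received
print([m.text for m in inhibit_and_promote(cascade)])  # shorter list, primary on top
print(one_at_a_time(cascade)[1])                       # just the primary fault's action items
```

The point of the comparison that follows is not that more filtering is automatically better, but that the representation handed to the pilot, rather than the sheer number of messages, shapes how fast and how accurately the primary fault is found.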
As could be expected, response latency on the Boeing 767-type warning system is longest (Singer & Dekker, 2000). It takes a while for pilots to sort through the messages and figure out what to do. Interestingly, they also get it wrong more often on this type of system. That is, they misdiagnose the primary failure more often than on any of the other systems. A nonprioritized list of chronological messages about failures seems to defeat even the speed-accuracy trade-off: Longer dwell times on the display do not help people get it right. This is because the production of speed and accuracy is cognitive: Making sense of what is going wrong inside an aircraft's systems is a demanding cognitive task, where problem representation has a profound influence on people's ability to do it successfully (meaning fast and correct). Modest performance gains (faster responses and fewer misdiagnoses) can be seen on a system like that of the Saab 2000, but the Airbus A320 and MD-11 solutions to the workload bottleneck problem really seem to pay off. Performance benefits really accrue with a system that sorts through the failures, shows them selectively, and guides the pilot in what to do next. In our study, pilots were quickest to identify the primary fault in the failure scenario with such a system, and made no misdiagnoses in assessing what it was (Singer & Dekker). Similarly, a warning system that itself contains or counteracts many of the failures and shows mainly what is left to the pilot seems to help people in quickly identifying the primary fault. These results, however, should not be seen as justification for simply automating more of the failure-management task. Human performance difficulties associated with high-automation participation in difficult or novel circumstances are well known, such as brittle procedure following where operators follow heuristic cues from the automation rather than actively seeking and dealing with information related to the disturbance chain. Instead, these results indicate how progress can be made by changing the representational quality of warning systems altogether, not just by automating more of the human task portion. If guidance is beneficial, and if knowing what is left is useful, then the results of this study tell designers of warning systems to shift to another view of referents (the thing in the process that the symbol on the display refers to). Warning-system designers would have to get away from relying on single systems and their status as referents to show on the display, and move toward referents that fix on higher order variables that carry more meaning relative to the dynamic fault-management task. Referents could integrate current status with future predictions, for example, or cut across single parameters and individual systems to reveal the structure behind individual failures and show consequences in terms that are operationally immediately meaningful (e.g., loss of pressure, loss of thrust). Another way of looking at data overload is as a clutter problem—there is simply too much on the display for people to cope with. The solution to data overload as a clutter problem is to remove stuff from the display.
In warning-system design, for example, this may result in guidelines that stress that no more than a certain number of lines must be filled up on a warning screen. Seeing data overload as clutter, however, is completely insensitive to context. What seems clutter in one situation may be highly valuable, or even crucial, in another situation. The crash of an Airbus A330 during a test flight at the factory field in Toulouse, France in 1994 provides a good demonstration of this (see Billings, 1997). The aircraft was on a certification test flight to study various pitch-transition control laws and how they worked during an engine failure at low altitude, in a lightweight aircraft with a rearward center of gravity (CG). The flight crew included a highly experienced test pilot, a copilot, a flight-test engineer, and three passengers. Given the
lightweight and rearward CG, the aircraft got off the runway quickly and easily and climbed rapidly, with a pitch angle of almost 25° nose-up. The autopilot was engaged 6 seconds after takeoff. Immediately after a short climb, the left engine was brought to idle power and one hydraulic system was shut down in preparation for the flight test. Now the autopilot had to simultaneously manage a very low speed, an extremely high angle of attack, and asymmetrical engine thrust. After the captain disconnected the autopilot (this was only 19 seconds after takeoff) and reduced power on the right engine to regain control of the aircraft, even more airspeed was lost. The aircraft stalled, lost altitude rapidly, and crashed 36 seconds after takeoff. When the airplane reached a 25° pitch angle, autopilot and flight-director mode information were automatically removed from the primary flight display in front of the pilots. This is a sort of declutter mode. It was found that, because of the high rate of ascent, the autopilot had gone into altitude-acquisition mode (called ALT* in the Airbus) shortly after takeoff. In this mode there is no maximum pitch protection in the autoflight system software (the nose can go as high as the autopilot commands it to go, until the laws of aerodynamics intervene). In this case, at low speed, the autopilot was still trying to acquire the altitude commanded (2,000 feet), pitching up to it, and sacrificing airspeed in the process. But ALT* was not shown to the pilots because of the declutter function. So the lack of pitch protection was not announced, and may not have been known to them. Declutter has not been a fruitful or successful way of trying to solve data overload (see Woods et al., 2002), precisely because of the context problem. Reducing data elements on one display calls for that knowledge to be represented or retrieved elsewhere (people may need to pull it from memory instead), lest it be altogether unavailable. Merely seeing data overload as a workload or clutter problem is based on false assumptions about how human perception and cognition work. Questions about maximum human data-processing rates are misguided because this maximum, if there is one at all, is highly dependent on many factors, including people's experience, goals, history, and directed attention. As alluded to earlier in the book, people are not passive recipients of observed data; they are active participants in the intertwined processes of observation, action, and sense making. People employ all kinds of strategies to help manage data, and impose meaning on it. For example, they redistribute cognitive work (to other people, to artifacts in the world), and they re-represent problems so that solutions or countermeasures become more obvious. Clutter and workload characterizations treat data as a unitary input phenomenon, but people are not interested in data, they are interested in meaning. And what is meaningful in one situation may not be meaningful in the next. Declutter functions are context insensitive, as are workload-reduction measures. What is interesting, or meaningful, depends on context. This makes designing a warning or display system highly challenging. How can a designer know what the interesting, meaningful or relevant pieces of data will be in a particular context? This takes a deep understanding of the work as it is done, and especially as it will be done once the new technology has been implemented.
Recent advances in cognitive work analysis (Vicente, 1999) and cognitive task design (Hollnagel, 2003) present ways forward, and more is said about such envisioning of future work toward the end of this chapter.
Adapting to Automation, Adapting the Automation
In addition to knowing what (automated) systems are doing, humans are also required to provide the automation with data about the world. They need to input things. In fact, one role for people in automated systems is to bridge the context gap. Computers are dumb and dutiful: They will do what they are programmed to do, but their access to context, to a wider environment, is limited—limited, in fact, to what has been predesigned or preprogrammed
into them. They are literalist in how they work. This means that people have to jump in to fill a gap: They have to bridge the gulf between what the automation knows (or can know) and what really is happening or relevant out there in the world. The automation, for example, will calculate an optimal descent profile in order to save as much fuel as possible. But the resulting descent may be too steep for crew (and passenger) taste, so pilots program in an extra tailwind, tricking the computers into descending earlier and eventually more shallowly (because the tailwind is fictitious). The automation does not know about this context (preference for certain descent rates over others), so the human has to bridge the gap. Such tailoring of tools is a very human thing to do: People will shape tools to fit the exact task they must fulfill. But tailoring is not risk- or problem-free. It can create additional memory burdens, impose cognitive load when people cannot afford it, and open up new error opportunities and pathways to coordination breakdowns between human and machine. Automation changes the task for which it was designed. Automation, though introducing new capabilities, can increase task demands and create new complexities. Many of these effects are in fact unintended by the designers. Also, many of these side effects remain buried in actual practice and are hardly visible to those who only look for the successes of new machinery. Operators who are responsible for (safe) outcomes of their work are known to adapt technology so that it fits their actual task demands. Operators are known to tailor their working strategies so as to insulate themselves from the potential hazards associated with using the technology. This means that the real effects of technology change can remain hidden beneath a smooth layer of adaptive performance. Operational people will make it work, no matter how recalcitrant or ill suited to the domain the automation, and its operating procedures, really may be. Of course, the occasional breakthroughs in the form of surprising accidents provide a window onto the real nature of automation and its operational consequences. But such potential lessons quickly glide out of view under the pressure of the fundamental surprise fallacy. Apparently successful adaptation by people in automated systems, though adaptation in unanticipated ways, can be seen elsewhere in how pilots deal with automated cockpits. One important issue on high-tech flight decks is knowing what mode the automation is in (this goes for other applications such as ship's bridges too: Recall the Royal Majesty from chap. 5). Mode confusion can lie at the root of automation surprises, with people thinking that they told the automation to do one thing whereas it was actually doing another. How do pilots keep track of modes in an automated cockpit? The formal instrument for tracking and checking mode changes and status is the FMA, or flight-mode annunciator, a small strip that displays contractions or abbreviations of modes (e.g., Heading Select mode is shown as HDG or HDG SEL) in various colors, depending on whether the mode is armed (i.e., about to become engaged) or engaged. Most airline procedures require pilots to call out the mode changes they see on the FMA. One study monitored flight crews during a dozen return flights between Amsterdam and London on a full flight simulator (Bjorklund, Alfredsson, & Dekker, 2003).
Where both pilots were looking, and for how long, was measured with EPOG (eye-point-of-gaze) equipment, which uses techniques ranging from laser beams to the measurement and calibration of saccades (eye jumps) to track the exact focal point of a pilot's eyes within a defined visual field (see Fig. 8.1). Pilots do not look at the FMA much at all. And they talk about it even less. Very few call-outs are made the way they should be (according to the procedures). Yet this does not seem to have an effect on automation-mode awareness, nor on the airplane's flight path. Without looking or talking, most pilots apparently still know what is going on inside the automation. In this one study, 521 mode changes occurred during the 12 flights. About
60% of these were pilot induced (i.e., because of the pilot changing a setting in the automation); the rest were automation induced. Two out of five mode changes were never visually verified (meaning neither pilot looked at their FMA during 40% of all mode changes). The pilot flying checked a little less than the pilot not flying, which could be a natural reflection of the role division: Pilots who are flying the aircraft have other sources of flight-related data they need to look at, whereas the pilot not flying can oversee the entire process, thereby engaging more often in checks of what the automation modes are. There are differences between captains and first officers as well (even after you correct for pilot-flying vs. pilot-not-flying roles). Captains visually verified the transitions in 72% of the cases, versus 47% for first officers. This may mirror the ultimate responsibility that captains have for safety of flight, yet there was no expectation that this would translate into such concrete differences in automation monitoring. Amount of experience on automated aircraft types was ruled out as being responsible for the difference.
FIG. 8.1. Example of pilot EPOG (eye-point-of-gaze) fixations on a primary flight display (PFD) and map display in an automated cockpit. The top part of the PFD is the flight-mode annunciator (FMA; Bjorklund et al., 2003).
Of 512 mode changes, 146 were called out. If that does not seem like much, consider this: Only 32 mode changes (that is, about 6%) were called out after the pilot looked at the FMA. The remaining call-outs came either before looking at the FMA, or without looking at the FMA at all. Such a disconnect between seeing and saying suggests that there are other cues that pilots use to establish what the automation is doing. The FMA does not serve as a major trigger for getting pilots to call out modes. Two out of five mode transitions on the FMA are never even seen by entire flight crews. In contrast to instrument monitoring in nonglass-cockpit aircraft, monitoring for mode transitions is based more on a pilot's mental model of the automation (which drives expectations of where and when to look) and an understanding of what the current situation calls for. Such models are often incomplete and buggy, and it is not surprising that many mode transitions are neither visually nor verbally verified by flight crews. At the same time, a substantial number of mode transitions are actually anticipated correctly by flight crews. In those cases where pilots do call out a mode change, four out of five visual identifications of those mode changes are accompanied or preceded by a verbalization of their occurrence. This suggests that there are multiple, underinvestigated resources that pilots rely on for anticipating and tracking automation-mode behavior (including pilot mental models). The FMA, designed as the main source of knowledge about automation status, actually does not provide a lot of that knowledge. It triggers a mere one out of five call-outs, and gets ignored altogether by entire crews for a whole 40% of all mode transitions.
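The categories behind these percentages (looked or not, called out before or after looking, or neither) can be sketched as a simple classification over logged events. This is a hypothetical reconstruction for illustration only: the event fields, the verification window, and the category labels are assumptions, not the coding scheme actually used by Bjorklund et al. (2003).

```python
# Hypothetical reconstruction of the bookkeeping behind the reported percentages.
# Field names and the 10-second window are assumptions made for this sketch.
from dataclasses import dataclass
from typing import Optional, Iterable
from collections import Counter

@dataclass
class ModeChange:
    t: float                           # time of the mode transition (s)
    fma_fixation_t: Optional[float]    # first FMA fixation by either pilot after t, if any
    callout_t: Optional[float]         # time of a verbal call-out, if any

def classify(mc: ModeChange, window: float = 10.0) -> str:
    looked = mc.fma_fixation_t is not None and mc.fma_fixation_t - mc.t <= window
    called = mc.callout_t is not None
    if called and looked and mc.callout_t >= mc.fma_fixation_t:
        return "called out after looking"          # the procedurally prescribed order
    if called:
        return "called out before or without looking"
    if looked:
        return "looked, no call-out"
    return "neither looked nor called out"

def summarize(changes: Iterable[ModeChange]) -> dict:
    counts = Counter(classify(mc) for mc in changes)
    total = sum(counts.values())
    return {k: round(100 * v / total, 1) for k, v in counts.items()}
```

Run over event logs coded this way, the reported pattern would appear as a small "called out after looking" bucket next to a much larger "neither looked nor called out" one, even though the flight paths gave the crews no surprises.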
Proposals for new regulations are unfortunately taking shape around the same old display concepts. For example, Joint Advisory Circular ACJ 25.1329 (Joint Aviation Authorities, 2003, p. 28) said that: "The transition from an armed mode to an engaged mode should provide an additional attention-getting feature, such as boxing and flashing on an electronic display (per AMJ25-11) for a suitable, but brief, period (e.g., ten seconds) to assist in flight crew awareness." But flight-mode annunciators are not at all attention-getting, whether there is boxing or flashing or not. Indeed, empirical data show (as they have before, see Mumaw, Sarter, & Wickens, 2001) that the FMA does not "assist in flight crew awareness" in any dominant or relevant way. If design really is to capture crew's attention about automation status and behavior, it will have to do radically better than annunciating abstruse codes in various hues and boxing or flashing times. The call-out procedure appears to be miscalibrated with respect to real work in a real cockpit, because pilots basically do not follow formal verification
and call-out procedures at all. Forcing pilots to visually verify the FMA first and then call out what they see bears no similarity to how actual work is done, nor does it have much sensitivity to the conditions under which such work occurs. Call-outs may well be the first task to go out the window when workload goes up, which is also confirmed by this type of research. In addition to the few formal call-outs that do occur, pilots communicate implicitly and informally about mode changes. Implicit communication surrounding altitude capture could for example be "Coming up to one-three-zero, (capture)" (referring to flight level 130). There appear to be many different strategies to support mode awareness, and very few of them actually overlap with formal procedures for visual verification and call-outs. Even during the 12 flights of the Bjorklund et al. (2003) study, there were at least 18 different strategies that mixed checks, timing, and participation. These strategies seem to work as well as, or even better than, the official procedure, as crew communications on the 12 flights revealed no automation surprises that could be traced to a lack of mode awareness. Perhaps mode awareness does not matter that much for safety after all. There is an interesting experimental side effect here: If mode awareness is measured mainly by visual verification and verbal call-outs, and crews neither look nor talk, then are they unaware of modes, or are the researchers unaware of pilots' awareness? This poses a puzzle: Crews who neither talk nor look can still be aware of the mode their automation is in, and this, indeed, seems to be the case. But how, in that case, is the researcher (or your company, or line-check pilot) to know? The situation is one answer. By simply looking at where the aircraft is going, and whether this overlaps with the pilots' intentions, an observer can get to know something about apparent pilot awareness. It will show whether pilots missed something or not. In the research reported here, however, pilots missed nothing: There were no unexpected aircraft behaviors from their perspective (Bjorklund et al., 2003). This can still mean that the crews were either not aware of the modes and it did not matter, or they were aware but the research did not capture it. Both may be true.
MABA-MABA OR ABRACADABRA
The diversity of experiences and research results from automated cockpits shows that automation creates new capabilities and complexities in ways that may be difficult to anticipate. People adapt to automation in many different ways, many of which have little resemblance to formally established procedures for interacting with the automation. Can automation, in a very Cartesian, dualistic sense, replace human work, thereby reducing human error? Or is there a more complex coevolution of people and technology? Engineers and others involved in automation development are often led to believe that there is a simple answer, and in fact a simple way of getting the answer. MABA-MABA lists, or "Men-Are-Better-At, Machines-Are-Better-At" lists, have appeared over the decades in various guises. What these lists basically do is try to enumerate the areas of machine and human strengths and weaknesses, in order to provide engineers with some guidance on which functions to automate and which ones to give to the human.
The process of function allocation as guided by such lists sounds straightforward, but is actually fraught with difficulty and often unexamined assumptions. One problem is that the level of granularity of functions to be considered for function allocation is arbitrary. For example, it depends on the model of information processing on which the MABA-MABA method is based (Hollnagel, 1999). In Parasuraman, Sheridan, and Wickens (2000), four stages of information processing (acquisition, analysis, selection, response) form the guiding principle for deciding which functions should be kept and which given away, but this is an essentially arbitrary decomposition based on a notion of a human-machine ensemble that resembles a linear input-output device. In cases where it is not a model of information processing that determines the categories of functions to be swapped between human and
machine, the technology itself often determines it (Hollnagel, 1999). MABA-MABA attributes are then cast in mechanistic terms, derived from technological metaphors. For example, Fitts (1951) applied terms such as information capacity and computation in his list of attributes for both the human and the machine. If the technology gets to pick the battlefield (i.e., determine the language of attributes) it will win most of them back for itself. This results in human-uncentered systems where typically heuristic and adaptive human abilities such as not focusing on irrelevant data, scheduling and reallocating activities to meet current constraints, anticipating events, making generalizations and inferences, learning from past experience, and collaborating (Hollnagel) easily fall by the wayside. Moreover, MABA-MABA lists rely on a presumption of fixed human and machine strengths and weaknesses. The idea is that, if you get rid of the (human) weaknesses and capitalize on the (machine) strengths, you will end up with a safer system. This is what Hollnagel (1999) called "function allocation by substitution." The idea is that automation can be introduced as a straightforward substitution of machines for people—preserving the basic system while improving some of its output measures (lower workload, better economy, fewer errors, higher accuracy, etc.). Indeed, Parasuraman et al. (2000) recently defined automation in this sense: "Automation refers to the full or partial replacement of a function previously carried out by the human operator" (p. 287). But automation is more than replacement (although perhaps automation is about replacement from the perspective of the engineer). The really interesting issues from a human performance standpoint emerge after such replacement has taken place. Behind the idea of substitution lies the idea that people and computers (or any other machines) have fixed strengths and weaknesses and that the point of automation is to capitalize on the strengths while eliminating or compensating for the weaknesses. The problem is that capitalizing on some strength of computers does not replace a human weakness. It creates new human strengths and weaknesses—often in unanticipated ways (Bainbridge, 1987). For instance, the automation strength to carry out long sequences of action in predetermined ways without performance degradation amplifies classic human vigilance problems. It also exacerbates the system's reliance on the human strength to deal with the parametrization problem, or literalism (automation does not have access to all relevant world parameters for accurate problem solving in all possible contexts). As we have seen, however, human efforts to deal with automation literalism, by bridging the context gap, may be difficult because computer systems can be hard to direct (How do I get it to understand? How do I get it to do what I want?). In addition, allocating a particular function does not absorb this function into the system without further consequences. It creates new functions for the other partner in the human-machine equation—functions that did not exist before, for example, typing, or searching for the right display page, or remembering entry codes. The quest for a priori function allocation, in other words, is intractable (Hollnagel & Woods, 1983), and not only this: Such new kinds of work create new error opportunities (What was that code again? Why can't I find the right page?).
TRANSFORMATION AND ADAPTATION
Automation produces qualitative shifts. Automating something is not just a matter of changing a single variable in an otherwise stable system (Woods & Dekker, 2001). Automation transforms people's practice and forces them to adapt in novel ways: "It alters what is already going on—the everyday practices and concerns of a community of people—and leads to a resettling into new practices" (Flores, Graves, Hartfield, & Winograd, 1988, p. 154). Unanticipated consequences are the result of these much more profound, qualitative shifts. For example, during the Gulf War in the early 1990s, "almost without exception, technology did not meet the goal of unencumbering the personnel operating the equipment. Systems often required exceptional human expertise, commitment, and endurance" (Cordesman & Wagner, 1996, p. 25).
Where automation is introduced, new human roles emerge. Engineers, given their professional focus, may believe that automation transforms the tools available to people, who will then have to adapt to these new tools. In chapter 9 we see how, according to some researchers, the removal of paper flight-progress strips in air-traffic control represents a transformation of the workplace, to which controllers only need to adapt (they will compensate for the lack of flight-progress strips). In reality, however, people's practice gets transformed by the introduction of new tools. New technology, in turn, gets adapted by people in locally pragmatic ways so that it will fit the constraints and demands of actual practice. For example, controlling without flight-progress strips (relying more on the indications presented on the radar screen) asks controllers to develop and refine new ways of managing airspace complexity and dynamics. In other words, it is not the technology that gets transformed and the people who adapt. Rather, people's practice gets transformed and they in turn adapt the technology to fit their local demands and constraints.
The key is to accept that automation will transform people's practice and to be prepared to learn from these transformations as they happen. This is by now a common (but not often successful) starting point in contextual design. Here the main focus of system design is not the creation of artifacts per se, but getting to understand the nature of human practice in a particular domain, and changing those work practices rather than just adding new technology or replacing human work with machine work. This recognizes that:
• Design concepts represent hypotheses or beliefs about the relationship between technology and human cognition and collaboration.
• Designers need to subject these beliefs to empirical jeopardy by a search for disconfirming and confirming evidence.
• These beliefs about what would be useful have to be tentative and open to revision as designers learn more about the mutual shaping that goes on between artifacts and actors in a field of practice.
Subjecting design concepts to such scrutiny can be difficult. Traditional validation and verification techniques applied to design prototypes may turn up nothing, but not necessarily because there is nothing that could turn up. Validation and verification studies typically try to capture small, narrow outcomes by subjecting a limited version of a system to a limited test. The results can be informative, but hardly about the processes of transformation (different work, new cognitive and coordination demands) and adaptation (novel work strategies, tailoring of the technology) that will determine the sources of a system's success and potential for failure once it has been fielded. Another problem is that validation and verification studies need a reasonably ready design in order to carry any meaning. This presents a dilemma: By the time results are available, so much commitment (financial, psychological, organizational, political) has been sunk into the particular design that any changes quickly become unfeasible. Such constraints through commitment can be avoided if human factors can say meaningful things early on in a design process. What if the system of interest has not been designed or fielded yet?
Are there ways in which we can anticipate whether automation, and the human role changes it implies, will create new error problems rather than simply solving old ones? This has been described as Newell's catch: In order for human factors to say meaningful things about a new design, the design needs to be all but finished. Although data can then be generated, they are no longer of use, because the design is basically locked. No changes as a result of the insight created by human factors data are possible anymore. Are there ways around this catch? Can human factors say meaningful things about a design that is nowhere near finished? One way that has been developed is future incident studies, and the concept they have been tested on is exception management.
AUTOMATION AND EXCEPTION MANAGEMENT
One role that may fit the human well is that of exception manager. Introducing automation to turn people into exception managers can sound like a good idea. In ever busier systems, where operators are vulnerable to problems of data overload, turning humans into exception managers is a powerfully attractive concept. It has, for example, been practiced in the dark cockpit design that essentially keeps the human operator out of the loop (all the annunciator lights are out in normal operating conditions) until something interesting happens, which may then be the time for the human to intervene. This same envisioned role, of exception manager, dominates recent ideas about how to effectively let humans control ever-increasing air-traffic loads. Perhaps, the thought goes, controllers should no longer be in charge of all the parameters of every flight in their sector. A core argument is that the human controller is a limiting factor in traffic growth. Too many aircraft under one single controller leads to memory overload and the risk of human error. Decoupling controllers from all individual flights in their sectors, through greater computerization and automation on the ground and greater autonomy in the air, is assumed to be the way around this limit.
The reason we may think that human controllers will make good exception managers is that humans can handle the unpredictable situations that machines cannot. In fact, this is often a reason why humans are still to be found in automated systems in the first place (see Bainbridge, 1987). Following this logic, controllers would be very useful in the role of traffic manager, waiting for problems to occur in a kind of standby mode. The view of controller practice is one of passive observer, ready to act when necessary. But intervening effectively from a position of disinvolvement has proven to be difficult—particularly in air-traffic control. For example, Endsley, Mogford, Allendoerfer, and Stein (1997) pointed out, in a study of direct routings that allowed aircraft deviations without negotiations, that with more freedom of action being granted to individual aircraft, it became more difficult for controllers to keep up with traffic. Controllers were less able to predict how traffic patterns would evolve over a foreseeable timeframe. In other studies too, passive monitors of traffic seemed to have trouble maintaining a sufficient understanding of the traffic under their control (Galster, Duley, Masolanis, & Parasuraman, 1999), and were more likely to overlook separation infringements (Metzger & Parasuraman, 1999). In one study, controllers effectively gave up control over an aircraft with communication problems, leaving it to other aircraft and their collision-avoidance systems to sort it out among themselves (Dekker & Woods, 1999). This turned out to be the controllers' only route out of a fundamental double bind: If they intervened early they would create a lot of workload problems for themselves (suddenly a large number of previously autonomous aircraft would be under their control). Yet if they waited on intervention (in order to gather more evidence on the aircraft's intentions), they would also end up with an unmanageable workload and very little time to solve anything in.
Controller disinvolvement can create more work rather than less, and produce a greater error potential. This brings out one problem of envisioning practice, of anticipating how automation will create new human roles and what the performance consequences of those roles will be. Just saying "manager of exceptions" is insufficient: It does not make explicit what it means to practice. What work does an exception manager do? What cues does he or she base decisions on? The downside of underspecification is the risk of remaining trapped in a disconnected, shallow, unrealistic view of work. And when our view of (future) practice is disconnected from many of the pressures, challenges, and constraints operating in that world, our view of practice is distorted from the beginning. It misses how operational people's strategies are often intricately adapted to deal effectively with these constraints and pressures. There is an upside to underspecification, however, and that is the freedom to explore new possibilities and new ways to relax and recombine the multiple constraints, all in order to innovate and improve.
Will automation help you get rid of human error? With air-traffic controllers as exception managers, it is interesting to think about how the various designable objects would be able to support them in exception management. For example, visions of future air-traffic control systems typically include data linking as an advance that avoids the narrow bandwidth problem of voice communications—thus enhancing system capacity. In one study (Dekker & Woods, 1999), a communications failure affected an aircraft that had also suffered problems with its altitude reporting (equipment that tells controllers how high it is and whether it is climbing or descending). At the same time, this aircraft was headed for streams of crossing air traffic. Nobody knew exactly how datalink, another piece of technology not connected to altitude-encoding equipment, would be implemented (its envisioned use was, and is, to an extent underspecified). One controller, involved in the study, had the freedom to suggest that air-traffic control should contact the airline's dispatch or maintenance office to see whether the aircraft was climbing or descending or level. After all, data link could be used by maintenance and dispatch personnel to monitor the operational and mechanical status of an aircraft, so "if dispatch monitors power settings, they could tell us," the controller suggested. Others objected because of the coordination overheads this would create. The ensuing discussion showed that, in thinking about future systems and their consequences for human error, we can capitalize on underspecification if we look for the so-called leverage points (in the example: data link and other resources in the system) and retain a sensitivity to the fact that envisioned objects only become tools through use—imagined or real (data links to dispatch become a backup air-traffic control tool).
Anticipating the consequences of automation on human roles is also difficult because—without a concrete system to test—there are always multiple versions of how the proposed changes will affect the field of practice in the future. Different stakeholders (in air-traffic control this would be air carriers, pilots, dispatchers, air-traffic controllers, supervisors, flow controllers) have different perspectives on the impact of new technology on the nature of practice. The downside of this plurality is a kind of parochialism where people mistake their partial, narrow view for the dominant view of the future of practice, and are unaware of the plurality of views across stakeholders. For example, one pilot claimed that greater autonomy for airspace users is "safe, period" (Baiada, 1995). The upside of plurality is the triangulation that is possible when the multiple views are brought together. In examining the relationships, overlaps, and gaps across multiple perspectives, we are better able to cope with the inherent uncertainty built into looking into the future.
A number of future incident studies (see Dekker & Woods, 1999) examined controllers' anomaly response in future air-traffic control worlds precisely by capitalizing on this plurality. To study anomaly response under envisioned conditions, groups of practitioners (controllers, pilots, and dispatchers) were trained on proposed future rules. They were brought together to try to apply these rules in solving difficult future airspace problems that were presented to them in several scenarios. These included aircraft decompression and emergency descents, clear air turbulence, frontal thunderstorms, corner-post overloading (too many aircraft going to one entry point for the airport area), and priority air-to-air refueling and consequent airspace restrictions and communication failures. These challenges, interestingly, were largely rule- or technology-independent: They can happen in airspace systems of any generation. The point was not to test the anomaly response performance of one group against that of another, but to use triangulation of multiple stakeholder viewpoints—anchored in the task details of a concrete problem—to discover where the envisioned system would crack, where it would break down. Validity in such studies derives from: (a) the extent to which problems to be solved in the test situation represent the vulnerabilities and challenges that exist in the target world, and (b) the way in which real problem-solving expertise is brought to bear by the study participants.
Developers of future air-traffic control architectures have been envisioning a number of predefined situations that call for controller intervention, a kind of reasoning that is typical for engineering-driven decisions about automated systems. In air-traffic management, for example, potentially dangerous aircraft maneuvers, local traffic density (which would require some density index), or other conditions that compromise safety would make it necessary for a controller to intervene. Such rules, however, do not reduce uncertainty about whether to intervene. They are all a form of threshold crossing—intervention is called for when a certain dynamic density has been reached or a number of separation miles has been transgressed. But threshold-crossing alarms are very hard to get right—they come either too early or too late. If too early, a controller will lose interest in them: The alarm will be deemed alarmist. If the alarm comes too late, its contribution to flagging or solving the problem will be useless and it will be deemed incompetent. The way in which problems in complex, dynamic worlds grow and escalate, and the nature of collaborative interactions, indicate that recognizing exceptions in how others (either machines or people) are handling anomalies is complex. The disappointing history of automating problem diagnosis inspires little further hope. Threshold-crossing alarms cannot make up for disinvolvement—they can only make a controller acutely aware of those situations in which it would have been nice to have been involved from the start.
Future incident studies allow us to extend the empirical and theoretical base on automation and human performance. For example, supervisory-control literature makes no distinction between anomalies and exceptions. This indistinction results from the source of supervisory-control work: How do people control processes over physical distances (time lag, lack of access, etc.)? However, air-traffic control augments the issue of supervisory control with a cognitive distance: Airspace participants have some system knowledge and operational perspective, as do controllers, but there are only partial overlaps and many gaps. Studies on exception management in future air-traffic control force us to make a distinction between anomalies in the process, and exceptions from the point of view of the supervisor (controller). Exceptions can arise in cases where airspace participants are dealing with anomalies (e.g., an aircraft with pressurization or communications problems) in a way that forces the controller to intervene. An exception is a judgement about how well others are handling or going to handle disturbances in the process. Are airspace participants handling things well? Are they going to get themselves in trouble in the future?
Judging whether airspace users are going to get in trouble in their dealings with a process disturbance would require a controller to recognize and trace a situation over time—contradicting arguments that human controllers make good standby interveners.
Will Only the Predicted Consequences Occur?
In developing new systems, it is easy for us to become miscalibrated. It is easy for us to become overconfident that if our envisioned system can be realized, the predicted consequences, and only the predicted consequences, will occur. We lose sight of the fact that our views of the future are tentative hypotheses and that we would actually need to remain open to revision, that we need to continually subject these hypotheses to empirical jeopardy.
One way to fool ourselves into thinking that only the predicted consequences will occur when we introduce automation is to stick with the substitutional practice of function allocation. Substitution assumes a fundamentally uncooperative system architecture in which the interface between human and machine has been reduced to a straightforward "you do this, I do that" trade. If that is what it is, of course we should be able to predict the consequences. But it is not that simple. The question for successful automation is not who has control over what or how much. That only looks at the first parts, the engineering parts. We need to look beyond this and start asking humans and automation the question: "How do we get along together?"
Indeed, where we really need guidance today is in how to support the coordination between people and automation. In complex, dynamic, nondeterministic worlds, people will continue to be involved in the operation of highly automated systems. The key to a successful future of these systems lies in how they support cooperation with their human operators—not only in foreseeable standard situations, but also under novel, unexpected circumstances. One way to frame the question is how to turn automated systems into effective team players (Sarter & Woods, 1997). Good team players make their activities observable to fellow team players, and are easy to direct. To be observable, automation activities should be presented in ways that capitalize on well-documented human strengths (our perceptual system's acuity to contrast, change and events, our ability to recognize patterns and know how to act on the basis of this recognition, e.g., Klein). For example:
• Event based: Representations need to highlight changes and events in ways that the current generation of state-oriented displays does not.
• Future oriented: In addition to historical information, human operators in dynamic systems need support for anticipating changes and knowing what to expect and where to look next.
• Pattern based: Operators must be able to quickly scan displays and pick up possible abnormalities without having to engage in difficult cognitive work (calculations, integrations, extrapolations of disparate pieces of data). By relying on pattern- or form-based representations, automation has an enormous potential to convert arduous mental tasks into straightforward perceptual ones.
Team players are directable when the human operator can easily and efficiently tell them what to do. Designers could borrow inspiration from how practitioners successfully direct other practitioners to take over work. These are intermediate, cooperative modes of system operation that allow human supervisors to delegate suitable subproblems to the automation, just as they would be delegated to human crew members. The point is not to make automation into a passive adjunct to the human operator who then needs to micromanage the system each step of the way. This would be a waste of resources, both human and machine. Human operators must be allowed to preserve their strategic role in managing system resources as they see fit, given the circumstances.
Chapter 9
Will the System Be Safe?
How do you know whether a new system will be safe? As chapter 8 showed, automating parts of human work may make a system safer, but it may not.
The Alaska Airlines 261 accident discussed in chapter 2 illustrates how difficult it is to know whether a system is going to be safe during its operational lifetime. In the case of the DC-9 trim system, bridging the gap between producing a system and running it proved quite difficult. Certifying that the system was safe, or airworthy, when it rolled out of the factory with zero flying hours was one thing. Certifying that it would stay safe during a projected lifetime proved to be quite another. Alaska 261 shows how large the gulf between making a system and maintaining it can be.
The same is true for sociotechnical systems. Take the issue of flight-progress strips in air-traffic control. The flight strip is a small paper slip with flight-plan data about each controlled aircraft's route, speed, altitude, times over waypoints, and other characteristics (see Fig. 9.1). It is used by air-traffic controllers in conjunction with a radar representation of air traffic.
FIG. 9.1. Example of a flight strip (field values: SKA 9337, 7351, 320, 310, OSY, FUR, TEH, 1936, 1946, 1952). From left to right it shows the airplane's flight number and transponder code, the entry altitude of the aircraft into the controller's sector (FL320), the exit level (FL310), and what times it is expected to fly across particular waypoints along its route in the controller's sector.
A number of control centers around the world are doing away with these paper strips, to replace them with automated flight-tracking systems. Each of these efforts requires, in principle, a rigorous certification process. Different teams of people look at color coding, letter size and legibility, issues of human-computer interaction, software reliability and stability, seating arrangements, button sensitivities, and so forth, and can spend a decade following the footsteps of a design process to probe and poke it with methods and forms and questionnaires and tests and checklists and tools and guidelines—all in an effort to ensure that local human factors or ergonomics standards have been met. But such static snapshots may mean little. A lineup of microcertificates of usability does not guarantee safety. As soon as they hit the field of practice, systems start to drift. A year (or a month) after its inception, no sociotechnical system is the same as it was in the beginning. As soon as a new technology is introduced, the human, operational, organizational system that is supposed to make the technology work forces it into locally practical adaptations. Practices (procedures, rules) adapt around the new technology, and the technology in turn is reworked, revised, and amended in response to the emergence of practical experience.
THE LIMITS OF SAFETY CERTIFICATION
System safety is more than the sum of the certified parts. A redundant torque tube inside of a jackscrew, for example, does nothing to maintain the integrity of a DC-9 trim system without a maintenance program that guarantees continued operability. But ensuring the existence of such a maintenance system is nothing like understanding how the local rationality of such a system can be sustained (we're doing the right thing, the safe thing) while safety standards are in fact continually being eroded (e.g., from 350- to 2,550-hour lubrication interval). The redundant components may have been built and certified. The maintenance program (with 2,550-hour lubrication intervals—certified) may be in place. But safe parts do not guarantee system safety. Certification processes do not typically take lifetime wear of parts into account when judging an aircraft airworthy, even if such wear will render an aircraft, like Alaska 261, quite unworthy of flying. Certification processes certainly do not know how to take sociotechnical adaptation of new equipment, and the consequent potential for drift into failure, into account when looking at nascent technologies. Systemic adaptation or wear is not a criterion in certification decisions, nor is there a requirement to put in place an organization to prevent or cover for anticipated wear rates or pragmatic adaptation, or fine-tuning.
As a certification engineer from the regulator testified, "Wear is not considered as a mode of failure for either a system safety analysis or for structural considerations" (NTSB, 2002, p. 24). Because how do you take wear into account? How can you even predict with any accuracy how much wear will occur? McDonnell-Douglas surely had it wrong when it anticipated wear rates on the trim jackscrew assembly of its DC-9. Originally, the assembly was designed for a service life of 30,000 flight hours without any periodic inspections for wear. But within a year, excessive wear had been discovered nonetheless, prompting a reconsideration.
The problem of certifying a system as safe to use can become even more complicated if the system to be certified is sociotechnical and thereby even less calculable. What does wear mean when the system is sociotechnical rather than consisting of pieces of hardware? In both cases, safety certification should be a lifetime effort, not a still assessment of decomposed system status at the dawn of a nascent technology. Safety certification should be sensitive to the coevolution of technology and its use, its adaptation. Using the growing knowledge base on technology and organizational failure, safety certification could aim for a better understanding of the ecology in which technology is released—the pressures, resource constraints, uncertainties, emerging uses, fine-tuning, and indeed lifetime wear. Safety certification is not just about seeing whether components meet criteria, even if that is what it often practically boils down to. Safety certification is about anticipating the future. Safety certification is about bridging the gap between a piece of gleaming new technology in the hand now, and its adapted, coevolved, grimy, greased-down wear and use further down the line.
But we are not very good at anticipating the future. Certification practices and techniques oriented toward assessing the standard of current components do not translate well into understanding total system behavior in the future. Making claims about the future, then, often hangs on things other than proving the worthiness of individual parts. Take the trim system of the DC-9 again. The jackscrew in the trim assembly had been classified as a "structure" in the 1960s, leading to different certification requirements from when it would have been seen as a system. The same piece of hardware, in other words, could be looked at as two entirely different things: a system, or a structure. In being judged a structure, it did not have to undergo the required system safety analysis (which may, in the end, still not have picked up on the problem of wear and the risks it implied). The distinction, this partition of a single piece of hardware into different lexical labels, however, shows that airworthiness is not a rational product of engineering calculation. Certification can have much more to do with localized engineering judgments, with argument and persuasion, with discourse and renaming, with the translation of numbers into opinion, and opinion into numbers—all of it based on uncertain knowledge.
As a result, airworthiness is an artificially binary black-or-white verdict (a jet is either airworthy or it is not) that gets imposed on a very grey, vague, uncertain world—a world where the effects of releasing a new technology into actual operational life are surprisingly unpredictable and incalculable. Dichotomous, hard yes or no meets squishy reality and never quite gets a genuine grip. A jet that was judged airworthy, or certified as safe, may or may not be in actual fact. It may be a little bit unairworthy. Is it still airworthy with an end-play check of .0042 inches, the set limit? But "set" on the basis of what? Engineering judgment? Argument? Best guess? Calculations? What if a following end-play check is more favorable? The end-play check itself is not very reliable. The jet may be airworthy today, but no longer tomorrow (when the jackscrew snaps). But who would know?
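To see how a hard pass/fail limit interacts with an unreliable measurement, here is a small illustrative simulation. Only the .0042-inch limit is taken from the text; the true end play, the gauge scatter, and the number of repeated checks are hypothetical:

# Illustrative only: a binary airworthiness verdict applied to a noisy,
# continuous wear measurement. The 0.0042-inch limit is the figure quoted
# in the text; the "true" end play and the gauge noise are hypothetical.
import random

LIMIT_IN = 0.0042           # set end-play limit (from the text)
TRUE_END_PLAY_IN = 0.0041   # hypothetical actual wear, just under the limit
GAUGE_SIGMA_IN = 0.0003     # hypothetical measurement scatter of the check

def end_play_check(true_value, sigma):
    """One end-play check: the true wear plus random measurement error."""
    return random.gauss(true_value, sigma)

def verdict(measured):
    """The binary call the certification logic forces: airworthy or not."""
    return "airworthy" if measured <= LIMIT_IN else "not airworthy"

if __name__ == "__main__":
    for i in range(1, 11):
        m = end_play_check(TRUE_END_PLAY_IN, GAUGE_SIGMA_IN)
        print(f"check {i:2d}: {m:.4f} in -> {verdict(m)}")
    # The verdict can flip from one check to the next even though nothing
    # about the jet has changed: a hard yes or no imposed on a grey quantity.

Whether a given check passes is then partly a property of the gauge noise rather than of the jet, which is the sense in which a dichotomous yes or no never quite gets a grip on the underlying greyness.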
The pursuit of answers to such questions can precede or accompany certification efforts. Research, that putatively objective scientific encounter with empirical reality, can assist in the creation of knowledge about the future, as shown in chapter 8. So what about working without paper flight strips? The research community has come to no consensus on whether air-traffic control can actually do without them, and if it does, how it succeeds in keeping air traffic under control. Research results are inconclusive. Some literature suggested that flight strips are expendable without consequences for safety (e.g., Albright, Truitt, Barile, Vortac, & Manning, 1996), whereas others argued that air-traffic control is basically impossible without them (e.g., Hughes, Randall, & Shapiro, 1993). Certification guidance that could be extracted from the research base can go either way: It is either safe or unsafe to do away with the flight strips, depending on whom you listen to. What matters most for credibility is whether the researcher can make statements about human work that a certifier can apply to the coming, future use of a system. In this, researchers appeared to rely on argument and rhetoric, as much as on method, to justify that the results they found are applicable to the future.
LEIPZIG AS LEGITIMATE
For human factors, the traditionally legitimate way of verifying the safety of new technology is to conduct experiments in the laboratory. Say that researchers want to test whether operators can safely use voice-input systems, or whether their interpretation of some target is better on three-dimensional displays. The typical strategy is to build microversions of the future system and expose a limited number of participants to various conditions, some or all of which may contain partial representations of a target system. Through its controlled settings, laboratory research already makes some sort of verifiable step into the future. Empirical contact with a world to be designed is ensured because some version of that future world has been prefabricated in the lab.
This also leads to problems. Experimental steps into the future are necessarily narrow, which affects the generalizability of research findings. The mapping between test and target situations may miss several important factors. In part as a result of a restricted integration of context, laboratory studies can yield divergent and eventually inconclusive results. Laboratory research on decision making (Sanders & McCormick, 1997), for example, has found several biases in how decision makers deal with information presented to them. Can new technology circumvent the detrimental aspects of such biases, which, according to some views, would lead to human error and safety problems? One bias is that humans are generally conservative and do not extract as much information from sources as they optimally should. Another bias, derived from the same experimental research, is that people have a tendency to seek far more information than they can absorb adequately. Such biases would seem to be in direct opposition to each other. It means that reliable predictions of human performance in a future system may be difficult to make on the basis of such research. Indeed, laboratory findings often come with qualifying labels that limit their applicability. Sanders and McCormick (1997), for example, advised: "When interpreting the . . . findings and conclusions, keep in mind that much of the literature is comprised of laboratory studies using young, healthy males doing relatively unmotivating tasks. The extent to which we can generalize to the general working population is open to question" (p. 572).
Whether the question remains open does not seem to matter. Experimental human factors research in the laboratory holds a special appeal because it makes mind measurable, and it even allows mathematics to be applied to the results. Quantitativism is good: It helps equate psychology with natural science, shielding it from the unreliable wanderings through mental life using dubious methods like introspection. The large-scale university laboratories that are now a mainstay of many human factors departments were a 19th-century European invention, pioneered by scientists such as the chemist Justus Liebig.
Wundt of course started the trend in psychology with his Leipzig laboratory (see chap. 5). Leipzig did psychology a great service: Psychophysics and its methods of inquiry introduced psychology as a serious science, as something realist, with numbers, calculations, and equations. The systematization, mechanization, and quantification of psychological research in Leipzig, however, must be seen as an antimovement against earlier introspection and rationalism.
Echoes of Leipzig still sound loudly today. A quantitativist preference remains strong in human factors. Empiricist appeals (the pursuit of real measurable facts through experiment) and a strong reliance on Cartesian-Newtonian interpretations of natural science equal to those of, say, physics, may help human factors retain credibility in a world of constructed hardware and engineering science, where it alone dabbles in the fuzziness of psychology. In a way, then, quantitativist human factors or engineering psychology is still largely the sort of antimovement that Wundt formed with his Leipzig laboratory. It finds its expression in a pursuit of numbers and statistics, lest engineering consumers of the research results (and their government or other sponsors) suspect the results to be subjective and untrustworthy. The quantification and mechanization of mind and method in human factors are good only because they are not something else (i.e., foggy rationalism or unreliable introspection), not because they are inherently good or epistemologically automatically justifiable. The experimental method is good for what it is not, not for what it is. One can see this in the fact that quantitative research in mainstream human factors never has to justify its method (that method is good because at least it is not that other, vague stuff). Qualitative research, on the other hand, is routinely dismissed as insufficiently empirical and will always be required to justify its method. Anything perceived to be sliding toward rationalism, subjectivism, and nonsystematic introspection is highly suspicious, not because it is, but because of what it evokes: a fear that human factors will be branded unscientific. Now these fears are nothing new. They have inspired many a split or departure in the history of psychology. Recall Watson's main concern when launching behaviorism. It was to rescue psychology from vague subjectivist introspection (by which he even meant Wundt's systematic, experimental laboratory research) and plant it firmly within the natural science tradition. Ever since Newton read the riot act on what "scientific" was to be, psychology and human factors have struggled to find an acceptance and an acceptability within that conceptualization.
Misconceptions About the Qualitative-Quantitative Relationship
Whether quantitative or qualitative research can make more valid claims about the future (thereby helping in the certification of a system as safe to use) is contested. At first sight, qualitative, or field, studies are about the present (otherwise there is no field to study). Quantitative research may test actual future systems, but the setting is typically so contrived and limited that its relationship to a real future is tenuous. As many have pointed out, the difference between quantitative and qualitative research is actually not so great (e.g., Woods, 1993; Xiao & Vicente, 2000). Claims of epistemological privilege by either are counterproductive, and difficult to substantiate. A method becomes superior only if it better helps researchers answer the question they are pursuing, and in this sense, of course, the differences between qualitative and quantitative research can be real. But dismissing qualitative work as subjective misses the point of quantitative work. Squeezing numbers out of an experimental encounter with reality, and then closing the gap to a concept-dependent conclusion on what you just saw, requires generous helpings of interpretation. As we see in the following discussion, there is a great deal of subjectivism in endowing numbers with meaning.
Moreover, seeing qualitative inquiry as a mere protoscientific prelude to real quantitative research misconstrues the relationship and overestimates quantitative work. A common notion is that qualitative work should precede quantitative research by generating hypotheses that can then be tested in more restricted settings. This may be one relationship. But often quantitative work only reveals the how or what (or how much) of a particular phenomenon. Numbers in themselves can have a hard time revealing the why of the phenomenon. In this case, quantitative work is the prelude to real qualitative research: It is experimental number crunching that precedes and triggers the study of meaning.
Finally, a common claim is that qualitative work is high in external validity and low in internal validity. Quantitative research, on the other hand, is thought to be low in external validity and high in internal validity. This is often used as justification for either approach and it must rank among the most misconstrued arguments in scientific method. The idea is that internal validity is high because experimental laboratory research allows an investigator almost full control over the conditions in which data are gathered. If the experimenter did not make it happen, either it did not happen, or the experimenter knows about it, so that it can be dealt with as a confound. But the degree of control in research is often overestimated. Laboratory settings are simply another kind of contextualized setting, in which all kinds of subtle influences (social expectations, people's life histories) enter and influence performance just like they would in any other contextualized setting. The degree of control in qualitative research, on the other hand, is often simply assumed to be low. And much qualitative work indeed adds to that image. But rigor and control are definitely possible in qualitative work: There are many ways in which a researcher can become confident about systematic relationships between different factors. Subjectivism in interpretation is not more necessary in qualitative than in quantitative research. Qualitative work, on the other hand, is not automatically externally valid simply because it takes place in a field (applied) setting. Each encounter with empirical reality, whether qualitative or quantitative, generates context-specific data—data from that time and place, from those people, in that language—that are by definition nonexportable to other settings. The researcher has to engage in analysis of those data in order to bring them up to a concept-dependent level, from which terms and conclusions can be taken to other settings.
The examples that follow play out these issues. But the account is about more than the real or imagined opposition between qualitative and quantitative work. The question is how human factors research, quantitative or qualitative, can contribute to knowing whether a system will be safe to use.
EXPERIMENTAL HUMAN FACTORS RESEARCH ON FLIGHT STRIPS: AN EXAMPLE
One way to find out if controllers can control air traffic without the aid of flight strips is to test it in an experimental setting. You take a limited number of controllers and put them through a short range of tasks to see how they do. In their experiments, Albright et al. (1996) deployed a wide array of measurements to find out if controllers perform just as well in a condition with no strips as in a condition with strips. The work they performed was part of an effort by the U.S. Federal Aviation Administration, a regulator (and ultimately the certifier of any future air-traffic control system in the U.S.).
In their study, the existing air-traffic control system was retained, but to compare stripped versus stripless control, the researchers removed the flight strips in one condition:
The first set of measurements consisted of the following: total time watching the PVD [plan view display, or radar screen], number of FPR [flight plan requests], number of route displays, number of J-rings used, number of conflict alerts activated, mean time to grant pilot requests, number of unable requests, number of requests ignored, number of controller-to-pilot requests, number of controller-to-center requests, and total actions remaining to complete at the end of the scenario. (Albright et al., p. 6)
The assumption that drives most experimental research is that reality (in this case about the use and usefulness of flight strips) is objective and that it can be discovered by the researcher wielding the right measuring instruments. This is consistent with the structuralism and realism of human factors. The more measurements, the better; the more numbers, the more you know. This is assumed to be valid even when an underlying model that would couple the various measurements together into a coherent account of expert performance is often lacking (as it is in Albright et al., 1996, but also in many folk models in human factors). In experimental work, the number and diversity of measurements can become the proxy indicator of the accuracy of the findings, and of the strength of the epistemological claim (Q: So how do you know what you know? A: Well, we measured this, and this, and this, and that, and . . .). The assumption is that, with enough quantifiable data, knowledge can eventually be offered that produces an accurate and definitive account of a particular system. More of the same will eventually lead to something different. The strong influence that engineering has had on human factors (Batteau, 2001) makes this appear as just common sense. In engineering, technical debates are closed by amassing results from tests and experience; the essence of the craft is to convert uncertainty into certainty. Degrees of freedom are closed through numbers; ambiguity is worked out through numbers; uncertainty is reduced through numbers (Vaughan, 1996).
Independent of the number of measurements, each empirical encounter is of necessity limited, in both place and time. In the case of Albright et al. (1996), 20 air-traffic controllers participated in two simulated airspace conditions (one with strips and one without strips) for 25 minutes each. One of the results was that controllers took longer to grant pilot requests when they did not have access to flight strips, presumably because they had to assemble the basis for a decision on the request from other information sources. The finding is anomalous compared to other results, which showed no significant difference in workload and ability to keep control over the traffic situation across the strip/no-strip conditions, leading to the conclusion that "the presence or absence of strips had no effect on either performance or perceived workload. Apparently, the compensatory behaviors were sufficient to maintain effective control at what controllers perceived to be a comparable workload" (Albright et al., p. 11). Albright et al. explained the anomaly as follows: "Since the scenarios were only 25 minutes in length, controllers may not have had the opportunity to formulate strategies about how to work without flight strips, possibly contributing to the delay" (p. 11). At a different level, this explanation of an anomalous datum implies that the correspondence between the experimental setting and a future system and setting may be weak. Lacking a real chance to learn how to formulate strategies for controlling traffic without flight strips, it would be interesting to pursue the question of how controllers in fact remained in control over the traffic situation and kept their workload down. It is not clear how this lack of a developed strategy can affect the number of requests granted but not the perceived workload or control performance. Certifiers may, or perhaps should, wonder what 25 minutes of undocumented struggle tells them about a future system that will replace decades of accumulated practice. The emergence of new work and establishment of new strategies is a fundamental accompaniment to the introduction of new technology, representing a transformation of tasks, roles, and responsibilities. These shifts are not something that could easily be noticed within the confines of an experimental study, even if controllers were studied for much longer than 25 minutes.
Albright et al. (1996) resolved this by placing the findings of control performance and workload earlier in their text: "Neither performance nor perceived workload (as we measured them in this study) was affected when the strips were removed" (p. 8). The qualification that pulled the authority of the results back into the limited time and place of the experimental encounter (how we measured them in this study) was presented parenthetically and thus accorded less central importance (Golden-Biddle & Locke, 1993). The resulting qualification suggests that comparable performance and workload may be mere artifacts of the way the study was conducted, of how these things were measured at that time and place, with those tools, by those researchers. The qualification, however, was in the middle of the paper, in the middle of a paragraph, and surrounded by other paragraphs adorned with statistical allusions. Nothing of the qualification remained at the end of the paper, where the conclusions presented these localized findings as universally applicable truths. Rhetoric, in other words, is enlisted to deal with problematic areas of epistemological substance. The transition from localized findings (in this study the researchers found no difference in workload or performance the way they measured them with these 20 controllers) to generalizable principles (we can do away with flight strips) essentially represents a leap of faith. As such, central points of the argument were left unsaid or were difficult for the reader to track, follow, or verify. By bracketing doubt this way, Albright et al. (1996) communicated that there was nothing, really, to doubt. Authority (i.e., true or accurate knowledge) derives from the replicable, quantifiable experimental approach. As Xiao and Vicente (2000) argued, it is very common for quantitative human factors research not to spend much time on the epistemological foundation of its work. Most often it moves unreflectively from a particular context (e.g., an experiment) to concepts (not having strips is safe), from data to conclusions, or from the modeled to the model.
The ultimate resolution of the fundamental constraint on empirical work (i.e., each empirical encounter is limited to a time and place) is that more research is always necessary. This is regarded as a highly reasonable conclusion of most quantitative human factors, or indeed any, experimental work. For example, in the Albright et al. study, one constraint was the 25-minute time limit on the scenarios played. Does flight-strip removal actually change controller strategies in ways that were not captured by the present study? This would seem to be a key question. But again, the reservation was bracketed. Whether or not the study answered this question does not in the end weaken the study's main conclusion: "(Additional research is necessary to determine if there are more substantial long term effects to strip removal)" (p. 12). In addition, the empirical encounter of the Albright et al. (1996) study was limited because it only explored one group of controllers (upper airspace). The argument for more research was drafted into service for legitimizing (not calling into question) results of the study: "Additional studies should be conducted with field controllers responsible for other types of sectors (e.g., low altitude arrival, or non-radar) to determine when, or if, controllers can compensate as successfully as they were able to in the current investigation" (p. 12). The idea is that more of the same, eventually, will lead to something different, that a series of similar studies over time will produce a knowledge increment useful to the literature and useful to the consumers of the research (certifiers in this case). This, once again, is largely taken for granted in the human factors community. Findings will invariably get better next time, and such successive, incremental enhancement is a legitimate route to the logical human factors end point: the discovery of an objective truth about a particular human-machine system and, through this, the revelation of whether it will be safe to use or not.
Experimental work relies on the production of quantifiable data. Some of this quantification (with statistical ornaments such as F-values and standard deviations) was achieved in Albright et al. (1996) by converting tickmarks on lines of a questionnaire (called the "PEQ," or post-experimental questionnaire) into an ordinal series of digits:
The form listed all factors with a 9.6 centimeter horizontal line next to each. The line was marked low on the left end and high on the right end. In addition, a vertical mark in the center of the line signified the halfway mark. The controllers were instructed to place an X on the line adjacent to the factor to indicate a response. . . . The PEQ scales were scored by measuring distance from the right anchor to the mark placed by the controller on a horizontal line (in centimeters). . . . Individual repeated measures ANOVAs [were then conducted]. (pp. 5-8)
The veneration of numbers in this case, however, went a step too far.
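Before turning to why that was a step too far, a brief aside: if the tickmark distances are treated as merely ordinal, a nonparametric paired test is one conventional alternative to a repeated-measures ANOVA. The sketch below is purely illustrative; the ratings are invented and are not data from Albright et al. (1996):

# Hypothetical example: comparing "usefulness" ratings for the strip and
# no-strip conditions from the same controllers, treating the tickmark
# distances as ordinal rather than interval data. All values are invented.
from scipy.stats import wilcoxon

strip    = [7.9, 6.5, 8.2, 5.4, 7.1, 6.8, 8.8, 5.9, 7.4, 6.2]  # cm along the line
no_strip = [6.1, 6.4, 7.0, 5.6, 6.2, 5.9, 7.5, 5.1, 6.8, 6.0]

# The Wilcoxon signed-rank test works on the ranks of the paired differences,
# so it does not assume the 9.6-cm scale has interval properties.
statistic, p_value = wilcoxon(strip, no_strip)
print(f"Wilcoxon signed-rank: W = {statistic:.1f}, p = {p_value:.3f}")

A different test does nothing, of course, about the deeper problem taken up next: whether usefulness meant the same thing to every controller in the first place.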
ANOVAs cannot be used for the kind of data gathered through PEQ scales. The PEQ is made up of so-called ordinal scales. In ordinal scales, data categories are mutually exclusive (a tickmark cannot be at two distances at the same time), they have some logical order, and they are scored according to the amount of a particular characteristic they possess (in this case, distance in centimeters from the left anchor). Ordinal scales, however, do not represent equal differences (a distance of 2 cm does not represent twice as much of the category measured as a distance of 1 cm), as interval and ratio scales do. Besides, reducing complex categories such as "usefulness" or "likeability" to distances along a few lines probably misses out on an interesting ideographic reality beneath all of the tickmarks. Put in experimental terms, the operationalization of usefulness as the distance from a tickmark along a line is not particularly high on internal validity. How can the researcher be sure that usefulness means the same thing to all responding controllers? If different respondents have different ideas of what usefulness meant during their particular experimental scenario, and if different respondents have different ideas of how much usefulness a tickmark, say, in the middle of the line represents, then the whole affair is deeply confounded. Researchers do not know what they are asking and do not know what they are getting in reply. Further numeric analysis is dealing with apples and oranges. This is one of the greater risks of folk modeling in human factors. It assumes that everybody understands what usefulness means, and that everybody has the same definition. But these are generous and untested assumptions. It was only with qualitative inquiry that researchers could ensure that there was some consensus on understandings of usefulness with respect to the controlling task with or without strips. Or they could discover that there was no consensus and then control for it. This would be one way to deal with the confound. It may not matter, and it may not have been noticed. Numbers are good.
Also, the linear, predictable format of research writing, as well as the use of abbreviated statistical curios throughout the results section, represent a rhetoric that endows the experimental approach with its authority—authority in the sense of privileged access to a particular layer or slice of empirical reality that others outside the laboratory setting do not have admittance to. Other rhetoric invented particularly for the study (e.g., PEQ scales for questions presented to participants after their trials in Albright et al., 1996) certifies the researchers' unique knowledge of this slice of reality. It validates the researcher's competence to tell readers what is really going on there. It may dissuade second-guessing. Empirical results are deemed accurate by virtue of a controlled encounter, a standard reporting format that shows logical progress to objective truths and statements (introduction, method, results, discussion, and summary), and an authoritative dialect intelligible only to certified insiders.
Closing the Gap to the Future
Because of some limited correspondence between the experiment and the system to be designed, quantitative research seemingly automatically closes the gap to the future. The stripless condition in the research (even if contrived by simply leaving out one artifact [the flight strip] from the present) is a model of the future. It is an impoverished model to be sure, and one that offers only a partial window onto what future practice and performance may be like (despite the epistemological reservations about the authenticity of that future discussed earlier). The message from Albright et al.'s (1996) encounter with the future is that controllers can compensate for the lack of flight strips. Take flight strips away, and controllers compensate for the lack of information by seeking information elsewhere (the radar screen, flight-plan readouts, controller-to-pilot requests). Someone might point out that Albright et al. prejudged the use and usefulness of flight strips in the first few sentences of their introduction, that they did not see their data as an opportunity to seek alternative interpretations: "Currently, en route control of high altitude flights between airports depends on two primary tools: the computer-augmented radar information available on the Plan View Display (PVD) and the flight information available on the Flight Progress Strip" (p. 1). This is not really an enabling of knowledge; it is the imposition of it. Here, flight strips are not seen as a problematic core category of controller work, whose use and usefulness would be open to negotiation, disagreement, or multiple interpretations. Instead, flight strips function as information-retrieval devices. Framed as such, the data and the argument can really only go one way: By removing one source of information, controllers will redirect their information-retrieving strategies onto other devices and sources. This displacement is possible, it may even be desirable, and it is probably safe: "Complete removal of the strip information and its accompanying strip marking responsibilities resulted in controllers compensating by retrieving information from the computer" (Albright et al., p. 11). For a certifier, this closes a gap to the future: Removing one source of information will result in people finding the information elsewhere (while showing no decrement in performance or increment in workload). The road to automation is open and people will adapt successfully, for that has been scientifically proven. Therefore, doing away with the flight strips is (probably) safe, and certifiable as such.
If flight strips are removed, then what other sources of information should remain available? Albright et al. (1996) inquired about what kind of information controllers would minimally like to preserve: Route of flight scored high, as did altitude information and aircraft call sign. Naming these categories gives developers the opportunity to envision an automated version of the flight strip that presents the same data in digital format, one that substitutes a computer-based format for the paper-based one, without any consequences for controller performance. Such a substitution, however, may overlook critical factors associated with flight strips that contribute to safe practice, and that would not be incorporated or possible in a computerized version (Mackay, 2000). Any signs of potential ambiguity or ambivalence about what else flight strips may mean to those working with them were not given further consideration beyond a brief mention in the experimental research write-up—not because these signs were actively, consciously stifled, but because they were inevitably deleted as Albright et al. (1996) carried out and wrote up their encounter with empirical reality. Albright et al. explicitly solicited qualitative, richer data from their participants by asking if controllers themselves felt that the lack of strips impaired their performance. Various controllers indicated how strips help them preplan and that, without strips, they cannot preplan. The researchers, however, never unpacked the notion of preplanning or investigated the role of flight strips in it. Again, such notions (e.g., preplanning) are assumed to speak for themselves, taken to be self-evident. They require no deconstruction, no further interpretive work. Paying more attention to these qualitative responses could create noise that confounds experimental accuracy.
Comments that preplanning without strips was impossible hinted at flight strips as a deeper, problematic category of controller work. But if strips mean different things to different controllers, or worse, if preplanning with strips means different things to different controllers, then the experimental bedrock of comparing comparable people across comparable conditions would disappear. This challenges in a profound way the nomothetic averaging out of individual differences. Where individual differences are the nemesis of experimental research, interpretive ambiguity can call into question the legitimacy of the objective scientific enterprise.
QUALITATIVE RESEARCH ON FLIGHT STRIPS: AN EXAMPLE
Rather than looking at people's work from the outside in (as do quantitative
experiments), qualitative research tries to understand people's work from the inside out. When taking the perspective of the one doing the work, how does the world look through his or her eyes? What role do tools play for people themselves in the accomplishment of their tasks; how do tools affect their expression of expertise? An interpretive perspective is based on the assumption that people give meaning to their work and that they can express those meanings through language and action. Qualitative research interprets the ways in which people make sense of their work experiences by examining the meanings that people use and construct in light of their situation (Golden-Biddle & Locke, 1993).
The criteria and end points for good qualitative research are different from those in quantitative research. As a research goal, accuracy is practically and theoretically unobtainable. Qualitative research is relentlessly empirical, but it rarely achieves finality in its findings. Not that quantitative research ever achieves finality (remember that virtually every experimental report finishes with the exhortation that more research is necessary). But qualitative researchers admit that there is never one accurate description or analysis of a system in question, no definitive account—only versions. What flight strips exactly do for controllers is forever subject to interpretation; it will never be answered objectively or finitely, never be closed to further inquiry. What makes a version good, though, or credible, or worth paying attention to by a certifier, is its authenticity. The researcher has to not only convince the certifier of a genuine field experience in writing up the research account, but also make intelligible what went on there. Validation from outside the field emerges from an engagement with the literature (What have others said about similar contexts?) and from interpretation (How well are theory and evidence used to make sense of this particular context?). Field research, though critical to the ethnographic community as a stamp of authenticity, is not necessarily the only legitimate way to generate qualitative data. Surveys of user populations can also be tools that support qualitative inquiry.
Find Out What the Users Think
The reason that qualitative research may appeal to certifiers is that it lets the informants, the users, speak—not through the lens of an experiment, but on the users' terms and initiative. Yet this is also where a central problem lies. Simply letting users speak can be of little use. Qualitative research is not (or should not be) plain conversational mappings—a direct transfer from field setting to research account. If human factors continues to practice and think about ethnography in these terms, doubts about both the method and the data it yields will continue to surface. What certifiers, as consumers of human factors research, care about is not what users say in raw, unpacked form, but what their remarks mean for work, and especially for future work. As Hughes et al. (1993) put it: "It is not that users cannot talk about what it is they know, how things are done, but it needs bringing out and directing toward the concerns of the design itself" (p. 138). Within the human factors community, qualitative research seldom takes this extra step.
What human factors requires is a strong ethnography, one that actually makes the hard analytical move from user statements to a design language targeted at the future.
A massive qualitative undertaking related to flight strips was the Lancaster University project (Hughes et al., 1993). Many man-months were spent (an index of the authenticity of the research) observing and documenting air-traffic control with flight strips. During this time the researchers developed an understanding of flight strips as an artifact whose functions derive from the controlling work itself. Both information and annotations on the strip and the active organization of strips among and between controllers were essential: "The strip is a public document for the members of the (controlling) team; a working representation of an aircraft's control history and a work site of controlling. Moving the strips is to organize the information
in terms of work activities and, through this, accomplishing the work of organizing the traffic" (Hughes et al., pp. 132-133). Terms such as working representation and organizing traffic are concepts, or categories, that were abstracted well away from the masses of deeply context-specific field notes and observations gathered in the months of research. Few controllers would themselves use the term working representation to explain what flight strips mean to them. This is good. Conceptual abstraction allows a researcher to reach a level of greater generality and increased generalizability (see Woods, 1993; Xiao & Vicente, 2000). Indeed, working representation may be a category that can lead to the future, where a designer would be looking to computerize a working representation of flight information, and a certifier would be evaluating whether such a computerized tool is safe to use.
But such higher order interpretive work is seldom found in human factors research. It would separate ethnography and ethnographic argument from research that simply makes claims based on authenticity. Even Hughes et al. (1993) relied on authenticity alone when they told of the various annotations made on flight strips, and did little more than parrot their informants:
Amendments may be done by the controller, by the chief, or less often, by one of the "wings." "Attention-getting" information may also be written on the strips, such as arrows indicating unusual routes, symbols designating "crossers, joiners and leavers" (that is, aircraft crossing, leaving or joining the major traffic streams), circles around unusual destinations, and so on. (p. 132)
Though serving as evidence of socialization, of familiarity and intimacy, speaking insider language is not enough. By itself it is not helpful to certifiers who may be struggling with evaluating a version of air-traffic control without paper flight strips. Appeals to authenticity ("Look, I was there, and I understand what the users say") and appeals to future relevance ("Look, this is what you should pay attention to in the future system") can thus pull in opposite directions: the former toward the more context-specific, which is hardly generalizable, the latter toward abstracted categories of work that can be mapped onto yet-to-be-fielded future systems and conceptions of work. The burden to resolve the tension should not be on the certifier or the designer of the system; it should be on the researcher. Hughes et al. (1993) agreed that this bridge-building role should be the researcher's:
Ethnography can serve as another bridge between the users and the designers. In our case, controllers have advised on the design of the display tool with the ethnographer, as someone knowledgeable about but distanced from the work, and, on the one hand able to appreciate the significance of the controllers' remarks for their design implications and, on the other hand, familiar enough with the design problems to relate them to the controllers' experiences and comments. (p. 138)
Hostage to the Present, Mute About the Future
Hughes et al.'s (1993) research account actually missed the "significance of controller remarks for their design implications" (p. 138). No safety implications were extracted. Instead the researchers used insider language to forward insider opinions, leaving user statements unpacked and largely underanalyzed.
Ethnography essentially gets confused with what informants say, and consumers of the research are left to pick and choose among the statements. This is a particularly naive form of ethnography, where what informants can tell researchers is equated or confused with what strong, analytical ethnography (and ethnographic argument) could reveal. Hughes et al. relied on informant statements to the extent they did because of a common belief that the work their informants did, and the foundational categories that informed it, are for the most part self-evident, close to what we would regard as common sense. As such, they require little, if any, analytic effort to discover. It is an ethnography reduced to a kind of mediated
user show-and-tell for certifiers—not a thorough analysis of the foundational categories of work. For example, Hughes et al. concluded that "(flight strips) are an essential feature of 'getting the picture,' 'organising the traffic,' which is the means of achieving the orderliness of the traffic" (p. 133). So flight strips help controllers get the picture. This kind of statement is obvious to controllers and merely repeats what everyone already knows. If ethnographic analysis cannot go beyond common sense, it merely privileges the status quo. As such, it offers certifiers no way out: A system without flight strips would not be safe, so forget it. There is no way for a certifier to circumvent the logical conclusion of Hughes et al. (1993): "The importance of the strip to the controlling process is difficult to overestimate" (p. 133).
So is it safe? Going back to Hughes et al.: "For us, such questions were not easily answerable by reference to work which is as subtle and complex as our ethnographic analysis had shown controlling to be" (p. 135). Such surrender to the complexity and intricacy of a particular phenomenon is consistent with what Dawkins (1986, p. 38) called the "argument from personal incredulity." When faced with highly complicated machinery or phenomena, it is easy to take cover behind our own sense of extreme wonder, and resist efforts at explanation. In the case of Hughes et al. (1993), it recalls an earlier reservation: "The rich, highly detailed, highly textured, but nevertheless partial and selective descriptions associated with ethnography would seem to contribute little to resolving the designer's problem where the objective is to determine what should be designed and how" (p. 127). Such justification ("It really is too complex and subtle to communicate to you") maneuvers the entire ethnographic enterprise out of the certifier's view as something not particularly helpful. Synthesizing the complexity and subtlety of a setting should not be the burden of the certifier. Instead, this is the role of the researcher; it is the essence of strong ethnography. That a phenomenon is remarkable does not mean it is inexplicable; so if we are unable to explain it, "we should hesitate to draw any grandiose conclusions from the fact of our own inability" (Dawkins, 1986, p. 39). Informant remarks such as "Flight strips help me get the mental picture" should serve as a starting point for qualitative research, not as its conclusion.
But how can researchers move from native category to analytic sense? Qualitative work should be hermeneutic and circular in nature: not aiming for a definitive description of the target system, but rather a continuous reinterpretation and reproblematization of the successive layers of data mined from the field. Data demand analysis. Analysis in turn guides the search for more data, which in turn demand further analysis: Categories are continually revised to capture the researcher's (and, hand in hand, the practitioner's) evolving understanding of work. There is a constant interplay between data, concepts, and theory. The analysis and revision of categories is a hallmark of strong ethnography, and Ross's (1995) study of flight-progress strips in Australia serves as an interesting example. Qualitative in nature, Ross's research relied on surveys of controllers using flight strips in their current work.
Surveys are often derided by qualitative researchers for imposing the researcher's understanding of the work onto the data, instead of the other way around (Hughes et al., 1993). Demonstrating that it is not just the empirical encounter or rhetorical appeals to authenticity that matter (through large numbers of experimental probes or months of close observation), the survey results Ross gathered were analyzed, coded, categorized, recoded and recategorized until the inchoate masses of context-specific controller remarks began to form sensible, generalizable wholes that could meaningfully speak to certifiers. Following previous categorizations of flight-strip work (Della Rocco, Manning, & Wing, 1990), Ross (1995) moves down from these conceptual
descriptions of controller work and up again from the context-specific details, leaving several layers of intermediate steps. In line with characterizations of epistemological analysis through abstraction hierarchies (see Xiao & Vicente, 2000), each step from the bottom up is more abstract than the previous one; each is cast less in domain-bound terms and more in concept-dependent terms than the one before. Beyer and Holtzblatt (1998) referred to this process as induction: reasoning from the particular to the general. One example from Ross (p. 27) concerns domain-specific controller activities such as "entering a pilot report; composing a flight plan amendment." These lower level, context-specific data are of course not without semantic load themselves: it is always possible to ask further questions and descend deeper into the world of meanings that these simple, routine activities have for the people who carry them out. Indeed, we have to ask if we can only go up from the context-specific level—maintained in human factors as the most atomistic, basic, low-level data set (see Woods, 1993). In Ross's data, researchers should still question the common sense behind the otherwise taken-for-granted entering of a pilot report: What does a pilot report mean for the controller in a particular context (e.g., weather related), what does entering this report mean for the controller's ability to manage other traffic issues in the near future (e.g., avoiding sending aircraft into severe turbulence)?
While alluding to even more fine-grained details and questions later, these types of activities also point to an intentional strategy at a higher level of analysis (Della Rocco et al., 1990): that of the "transformation or translation of information for entry into the system," which, at an even higher level of analysis, could be grouped under a label coding, together with other such strategies (Ross, 1995, p. 27). Part of this coding is symbolic, in that it uses highly condensed markings on flight strips (underlining, black circles, strike-throughs) to denote and represent, for controllers, what is going on. The highly intricate nature of even one flight (where it crosses vs. where it had planned to cross a sector boundary, what height it will be leaving when, whether it has yet contacted another frequency, etc.) can be collapsed or amortized by simple symbolic notation—one line or circle around a code on the strip that stands for a complex, multidimensional problematic that other controllers can easily recognize. Unable to keep all the details of what a flight would do stable in the head, the controller compresses complexity, or amortizes it, as Hollan, Hutchins, and Kirsh (2000) would say, by letting one symbol stand for complex concepts and interrelationships, some even temporal. Similarly, "recognizing a symbol for a hand-off" (on a flight strip), though allowing further unpacking (e.g., what do you mean "recognize"?), is an instance of a tactic that "transforms or translates information received," which in turn represents a larger controller competency of "decoding," which in its turn is also part of a strategy to use symbolic notation to collapse or amortize complexity (Ross, 1995, p. 27). From recognizing a symbol for a hand-off to the collapsing of complexity, there are four steps, each more abstract and less in domain terms than the one before.
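To make the shape of this induction concrete, the following sketch (in Python, and purely illustrative: the lookup tables and the induce() helper are my own assumption, not part of Ross's method) renders the four steps as a small ladder and walks a single observation up it. The labels are paraphrased from the material quoted above.

# Illustrative only: a toy rendering of bottom-up induction from a
# context-specific observation to progressively more abstract categories.
# Labels are paraphrased from Ross (1995) as quoted in the text; the
# structure and function names are assumptions made for this sketch.

OBSERVATION_TO_TACTIC = {
    "recognizing a symbol for a hand-off on a flight strip":
        "transforming or translating information received",
}
TACTIC_TO_COMPETENCY = {
    "transforming or translating information received": "decoding",
}
COMPETENCY_TO_STRATEGY = {
    "decoding": "using symbolic notation to collapse or amortize complexity",
}


def induce(observation):
    """Walk one observation up the abstraction ladder, step by step."""
    steps = [observation]
    steps.append(OBSERVATION_TO_TACTIC[steps[-1]])   # tactic
    steps.append(TACTIC_TO_COMPETENCY[steps[-1]])    # competency
    steps.append(COMPETENCY_TO_STRATEGY[steps[-1]])  # strategy
    return steps


for level, label in enumerate(induce(
        "recognizing a symbol for a hand-off on a flight strip")):
    print("level", level, ":", label)

What the sketch cannot show, of course, is the interpretive work that justifies each mapping; that is precisely the part of the analysis that matters most.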
Not only do these steps allow others to assess the analytical work for its worth, but the destination of such induction is actually a description of work that can be used for guiding the evaluation of a future system. Inspired by Ross's analysis, we can surmise that controllers rely on flight strips for:
• Amortizing or collapsing complexity (what symbolic notation conveys).
• Supporting coordination (who gets which flight strip next from whom).
• Anticipating dynamics (how much is to come, from where, when, in what order).
These (no longer so large) jumps to the highest level of abstraction can
now be made—identifying the role the flight strip has in making sense of workplace and task complexity. Although not so much a leap of faith any longer (because there are various layers of abstraction in between), the final step, up to the highest-level conceptual description, still appears to hold a certain amount of creative magic. Ross (1995) revealed little of the mechanisms that actually drive his analysis. There is no extensive record that tracks the transformation of survey data into conceptual understandings of work. Perhaps these transformations are taken for granted too: The mystery is left unpacked because it is assumed to be no mystery. The very process by which the researcher manages to migrate from user-language descriptions of daily activities to conceptual languages less anchored in the present remains largely hidden from view. No ethnographic literature guides specifically the kinds of inferences that can be drawn up to the highest level of conceptual understanding. At this point, a lot of leeway is given (and reliance placed on) the researcher and his or her (keenly) developed insight into what activities in the field really mean or do for people who carry them out.
The problems of this final step are known and acknowledged in the qualitative research community. Vaughan (1996) and other sociologists referred to it as making the macro-micro connection: locating general meaning systems (e.g., symbolic notation, off-loading) in local contexts (placing a circle around a set of digits on the flight strip). Geertz (1973) noted how inferences that try to make the macro-micro connection often resemble "perfected impressionism" in which "much has been touched but little grasped" (p. 312). Such inferences tend to be evocative, resting on suggestion and insinuation more than on analysis (Vaughan, 1996). In qualitative research, lower levels of analysis or understanding always underconstrain the inferences that can be drawn further on the way to higher levels (see Hollan et al., 2000). At each step, alternative interpretations are possible. Qualitative work does not arrive at a finite description of the system or phenomenon studied (nor does quantitative research, really). But qualitative work does not even aim or pretend to do so (Batteau, 2001). Results are forever open to further interpretation, forever subject to increased problematization. The main criterion, therefore, to which we should hold the inferences drawn is not accuracy (Golden-Biddle & Locke, 1993), but plausibility: Does the conceptual description make sense—especially to the informants, to the people who actually do the work? This also motivates the continuous, circular nature of qualitative analysis: reinterpreting results that have been interpreted once already, gradually developing a theory—a theory of why flight strips help controllers know what is going on that is anchored in the researcher's continually evolving understanding of the informants' work and their world.
Closing the Gap to the Future
The three high-level categories of controller (flight-strip) work tell certifiers that air-traffic controllers have developed strategies for dealing with the communication of complexity to other controllers, for predicting workload, and for planning future work. Flight strips play a central, but not necessarily exclusive, role.
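Purely as an illustration of how such categories might be handed to a certifier (the question wording and the check() helper below are hypothetical, not drawn from the research discussed here), the three categories can be phrased as questions to ask of any candidate replacement for paper strips:

# Hypothetical sketch: turning the three abstracted flight-strip functions
# into certifier-facing questions about a candidate replacement tool.
# Category names come from the text above; questions and code are assumed.

FLIGHT_STRIP_FUNCTIONS = {
    "amortizing complexity":
        "Can one compact notation still stand for a multidimensional "
        "traffic problem that colleagues recognize at a glance?",
    "supporting coordination":
        "Is it still visible who hands which flight to whom, and when?",
    "anticipating dynamics":
        "Can controllers still see how much traffic is coming, from "
        "where, and in what order?",
}


def check(candidate_tool, answers):
    """Report which abstracted functions the candidate tool preserves."""
    for function, question in FLIGHT_STRIP_FUNCTIONS.items():
        verdict = "supported" if answers.get(function) else "open question"
        print(candidate_tool, "|", function, ":", verdict, "-", question)


check("electronic strip replacement", {
    "amortizing complexity": True,
    "supporting coordination": False,
    "anticipating dynamics": True,
})

The point is not the code but the shape of the output: the certifier receives a short list of work functions to probe, rather than raw informant statements.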
The research account is written up in such a way that the status quo does not get the prerogative: Tools other than flight strips could conceivably help controllers deal with complexity, dynamics, and coordination issues. Complexity and dynamics, as well as coordination, are critical to what makes air-traffic control what it is, including what makes it difficult. Whatever certifiers want to brand as safe to use, they would do well to take into account that controllers use their artifact(s) to help them deal with complexity, to help them anticipate dynamic futures, and to support their coordination with other controllers. This resembles a set of human factors requirements that could provide a certifier with meaningful input.
CERTIFYING UNDER UNCERTAINTY
One role of human factors is to help developers and certifiers judge whether a technology is safe for future use. But quantitative and qualitative
human factors communities both risk taking the authority of their findings for granted and regarding the translation to the future, and claims about the future being either safe or unsafe, as essentially nonproblematic. At least, both literatures are relatively silent on this fundamental issue. Yet neither the legitimacy of findings nor the translation to claims about the future is in fact easily achieved, or should be taken for granted. More work needs to be done to produce findings that make sense for those who have to certify a system as safe to use.
Experimental human factors research can claim empirical legitimacy by virtue of the authority vested in the laboratory researcher and the control over the method used to get data. Such research can speak meaningfully to future use because it tests microversions of a future system. Researchers, however, should explicitly indicate where the versions of the future they tested are impoverished, and what subtle effects of context on their experimental settings could produce findings that diverge from what future users will encounter.
Qualitative research in human factors can claim legitimacy, and relevance to those who need to certify the next system, because of its authentic encounters with the field where people actually carry out the work. Validation emerges from the literature (what others have said about the same and similar contexts) and from interpretation (how theory and evidence make sense of this particular context). Such research can speak meaningfully to certification issues because it allows users to express their preferences, choices and apprehensions. Qualitative human factors research, however, must not stop at recording and replaying informant statements. It must deconfound informant understandings with understandings informed by concepts, theory, analysis and literature.
Human factors work, of whatever kind, can help bridge the gap from research findings to future systems. Research accounts need to be both convincing as science and cast in a language that allows a certifier to look ahead to the future: looking ahead to work and a co-evolution of people and technology in a system that does not yet exist.
Chapter 10
Should We Hold People Accountable for Their Mistakes?
Transportation human factors has made enormous progress over the past decades. It would be easy to claim that transportation systems have become safer in part through human factors efforts. As a result of such work, progress on safety has become synonymous with:
• Taking a systems perspective: Accidents are not caused by failures of individuals, but emerge from the conflux or alignment of multiple contributory system factors, each necessary and only jointly sufficient. The source of accidents is the system, not its component parts.
• Moving beyond blame: Blame focuses on the supposed defects of individual operators and denies the import of systemic contributions. In addition, blame has all kinds of negative side effects. It typically leads to defensive posturing, obfuscation of information, protectionism, polarization, and mute reporting systems.
Progress on safety coincides with learning from failure. This makes punishment and learning two mutually exclusive activities: Organizations can either learn from an accident or punish the individuals involved in it, but hardly do both at the same time.
The reason is that punishment of individuals can protect false beliefs about basically safe systems, where humans are the least reliable components. Learning challenges and potentially changes the belief about what creates safety. Moreover, punishment emphasizes that failures are deviant, that they do not naturally belong in the organization. Learning means that failures are seen as normal, as resulting from the inherent pursuit of success in resource-constrained, uncertain environments. Punishment turns the culprits into unique and necessary ingredients for
the failure to happen. Punishment, rather than helping people avoid or better manage conditions that are conducive to error, actually conditions people not to get caught when errors do occur. This stifles learning. Finally, punishment is about the search for closure, about moving beyond and away from the adverse event. Learning is about continuous improvement, about closely integrating the event in what the system knows about itself.
Making these ideas stick, however, is not proving as easy as it was to develop them. In the aftermath of several recent accidents and incidents, the operators involved (pilots or air-traffic controllers in these cases) were charged with criminal offenses (e.g., professional negligence, manslaughter). In some accidents even organizational management has been held criminally liable. Criminal charges differ from civil lawsuits in many respects. Most obviously, the target is not an organization, but individuals (air-traffic controllers, flight crew, maintenance technicians). Punishment consists of possible incarceration or some putatively rehabilitative alternative—not (just) financial compensation. Unlike organizations covered against civil suits, few operators or managers themselves have insurance to pay for legal defense against criminal charges that arise from doing their jobs.
Some maintain that criminally pursuing operators or managers for erring on the job is morally unproblematic. The greater good befalls the greater number of people (i.e., all potential passengers) by protecting them from unreliable operators. A lot of people win, only a few outcasts lose. To human factors, however, this may be utilitarianism inverted. Everybody loses when human error gets criminalized: Upon the threat of criminal charges, operators stop sending in safety-related information; incident reporting grinds to a halt. Criminal charges against individual operators also polarize industrial relations. If the organization wants to limit civil liability, then official blame on the operator could deflect attention from upstream organizational issues related to training, management, supervision, and design decisions. Blaming such organizational issues, in contrast, can be a powerful ingredient in an individual operator's criminal defense—certainly when the organization has already rendered the operator expendable by euphemism (standby, ground duty, administrative leave) and without legitimate hope of meaningful re-employment. In both cases, industrial relations are destabilized. Intra-organizational battles become even more complex when individual managers get criminally pursued; defensive maneuvering by these managers typically aims to off-load the burden of blame onto other departments or parts of the organization. This easily leads to poisonous relations and a crippling of organizational functioning. Finally, incarceration or alternative punishment of operators or managers has no demonstrable rehabilitative effect (perhaps because there is nothing to rehabilitate). It does not make an operator or manager any safer, nor is there evidence of vicarious learning (learning by example and fear of punishment). Instead, punishment or its threat merely leads to counterproductive responses, to people ducking the debris. The transportation industry itself shows ambiguity with regard to the criminalization of error.
Responding to the 1996 Valujet accident, where mechanics loaded oxygen generators into the cargo hold of a DC-9, which subsequently caught fire, the editor of Aviation Week and Space Technology "strongly believed the failure of SabreTech employees to put caps on oxygen generators constituted willful negligence that led to the killing of 110 passengers and crew. Prosecutors were right to bring charges. There has to be some fear that not doing one's job correctly could lead to prosecution" (North, 2000, p. 66). Rescinding this 2 years later, however, North (2002) opined that learning from accidents and criminal prosecution go together like "oil and water, cats and dogs," that "criminal probes do not mix well with aviation accident inquiries" (p. 70). Most other cases reveal similar instability with regard to prosecuting operators for error. Culpability in aviation
does not appear to be a fixed notion, connected unequivocally to features of some incident or accident. Rather, culpability is a highly flexible category. Culpability is negotiable, subject to national and professional interpretations, influenced by political imperatives and organizational pressures, and part of personal or institutional histories. As psychologists point out, culpability is also about assumptions we make about the amount of control people had when carrying out their (now) controversial acts. The problem here is that hindsight deeply confounds such judgments of control. In hindsight, it may seem obvious that people had all the necessary data available to them (and thus the potential for control and safe outcomes). Yet they may have willingly ignored this data in order to get home faster, or because they were complacent. Retrospect and the knowledge of outcome deeply affect our ability to judge human performance, and a reliance on folk models of phenomena like complacency, situation awareness, and stress does not help. All too quickly we come to the conclusion that people could have better controlled the outcome of a situation, if only they had invested a little more effort.
ACCOUNTABILITY
What is accountability, and what does it actually mean to hold people accountable for their mistakes? Social cognition research shows that accountability or holding people accountable is not that simple. Accountability is fundamental to any social relation. There is always an implicit or explicit expectation that we may be called on to justify our beliefs and actions to others. The social-functionalist argument for accountability is that this expectation is mutual: As social beings we are locked into reciprocating relationships. Accountability, however, is not a unitary concept—even if this is what many stakeholders may think when aiming to improve people's performance under the banner of holding them accountable. There are as many types of accountability as there are distinct relationships among people, and between people and organizations, and only highly specialized subtypes of accountability actually compel people to expend more cognitive effort. Expending greater effort, moreover, does not necessarily mean better task performance, as operators may become concerned more with limiting exposure and liability than with performing well (Lerner & Tetlock, 1999), something that can be observed in the decline of incident reporting with threats of prosecution (North, 2002). What is more, if accounting is perceived as illegitimate (for example, intrusive, insulting, or ignorant of real work), then any beneficial effects of accountability will vanish or backfire. Effects that have been experimentally demonstrated include a decline in motivation, excessive stress, and attitude polarization, and the same effects can be seen in recent cases where pilots and air-traffic controllers were held accountable by courts and other parties ignorant of the real trade-offs and dilemmas that make up actual operational work.
The research base on social cognition, then, tells us that accountability, even if inherent in human relationships, is not unambiguous or unproblematic. The good side of this is that, if accountability can take many forms, then alternative, perhaps more productive avenues of holding people accountable are possible. Giving an account, after all, does not have to mean exposing oneself to liability, but rather, telling one's story so that others can learn vicariously.
Many sources, even within human factors, point to the value of storytelling in preparing operators for complex, dynamic situations in which not everything can be anticipated. Stories are easily remembered, scenario-based plots with actors, intentions, clues, and outcomes that in one way or another can be mapped onto current difficult situations and matched for possible ways out. Incident-reporting systems can capitalize on this possibility, whereas more incriminating forms of accountability actually retard this very quality by robbing people of the incentive to tell stories in the first place.
ANTHROPOLOGICAL UNDERSTANDINGS OF BLAME
The anthropologist is not intrigued by flaws in people's reasoning process that produce, for example, the hindsight bias, but wants to know something about casting blame. Why is blame a meaningful response for those doing the blaming? Why do we turn error into crime? Mary Douglas (1992) described how peoples are organized in part by the way in which they explain misfortune and subsequently pursue retribution or dispense justice. Societies tend to rely on one dominant model of possible cause from which they construct a plausible explanation. In the moralistic model, for example, misfortune is seen as the result of offending ancestors, of sinning, or of breaking some taboo. The inflated, exaggerated role that procedure violations (one type of sinning or taboo breaking) are given in retrospective accounts of failure represents one such use of moralistic models of breakdown and blame. The moralistic explanation (you broke the rule, then you had an accident) is followed by a fixed repertoire of obligatory actions. If taboos have been broken, then rehabilitation can be demanded through expiatory actions. Garnering forgiveness through some purification ritual is one example. Forcing operators to publicly offer their apologies is a purification ritual seen in the wake of some accidents. Moreover, the rest of the community is reminded to not sin, to not break the taboos, lest the same fate befall them. How many reminders are there in the transportation industry imploring operators to "follow the rules," to "follow the procedures"? These are moralistic appeals with little demonstrable effect on practice, but they may make industry participants feel better about their systems; they may make them feel more in control.
In the extrogenous model, external enemies of the system are to blame for misfortune, a response that can be observed even today in the demotion or exile of failed operators: pilots or controllers or technicians. These people are ex post facto relegated to a kind of underclass that no longer represents the professional corps. Firing them is one option, and is used relatively often. But there are more subtle expressions of the extrogenous model too. The ritualistic expropriation of badges, certificates, stripes, licenses, uniforms, or other identity and status markings in the wake of an accident delegitimizes the errant operator as a member of the operational community. A part of such derogation, of course, is psychological defense on the part of (former) colleagues who would need to distance themselves from a realization of equal vulnerability to similar failures. Yet such delegitimization also makes criminalization easier by beginning the incremental process of dehumanizing the operator in question. Wilkinson (1994) presented an excellent example of such demonizing in the consequences that befell a Boeing 747 pilot after allegedly narrowly missing a hotel at London Heathrow airport in thick fog. Demonizing there was incremental in the sense that it made criminal pursuit not only possible in the first place, but subsequently necessary. It fed on itself: Demons such as this pilot would need to be punished, demoted, exorcised. The press had a large share in dramatizing the case, promoting the captain's dehumanization to the point where his suicide was the only way out.
Failure and Fear
Today, almost every misfortune is followed by questions centering on "whose fault?" and "what damages, compensation?"
Every death must be chargeable to somebody's account. Such responses approximate the primitives' resistance to the idea of natural death remarkably well (Douglas, 1992). Death, even today, is not considered natural—it has to arise from some type of identifiable cause. Such resistance to the notion that deaths actually can be accidental is obvious in responses to recent mishaps. For example, Snook (2000) commented on his own disbelief, his struggle, in analyzing the friendly shoot-down of two U.S. Black Hawk helicopters by U.S. fighter jets over northern Iraq in 1994:
This journey played with my emotions. When I first examined the data, I went
in puzzled, angry, and disappointed—puzzled how two highly trained Air Force pilots could make such a deadly mistake; angry at how an entire crew of AWACS controllers could sit by and watch a tragedy develop without taking action; and disappointed at how dysfunctional Task Force OPC must have been to have not better integrated helicopters into its air operations. Each time I went in hot and suspicious. Each time I came out sympathetic and unnerved. ... If no one did anything wrong; if there were no unexplainable surprises at any level of analysis; if nothing was abnormal from a behavioral and organizational perspective; then what have we learned? (p. 203)
Snook (2000) confronted the question of whether learning, or any kind of progress on safety, is possible at all if we can find no wrongdoing, no surprises, if we cannot find some kind of deviance. If everything was normal, then how could the system fail? Indeed, this must be among the greater fears that define Western society today. Investigations that do not turn up a "Eureka part," as the label became in the TWA800 probe, are feared not because they are bad investigations, but because they are scary. Philosophers like Nietzsche pointed out that the need for finding a cause is fundamental to human nature. Not being able to find a cause is profoundly distressing; it creates anxiety because it implies a loss of control. The desire to find a cause is driven by fear. So what do we do if there is no Eureka part, no fault nucleus, no seed of destruction? Is it possible to acknowledge that failure results from normal people doing business as usual in normal organizations? Not even many accident investigations succeed at this. As Galison (2000) noted:
If there is no seed, if the bramble of cause, agency, and procedure does not issue from a fault nucleus, but is rather unstably perched between scales, between human and non-human, and between protocol and judgment, then the world is a more disordered and dangerous place. Accident reports, and much of the history we write, struggle, incompletely and unstably, to hold that nightmare at bay. (p. 32)
Galison's (2000) remarks remind us of this fear (this nightmare) of not being in control over the systems we design, build, and operate. We dread the possibility that failures emerge from the intertwined complexity of normal everyday systems interactions. We would rather see failures emanate from a traceable, controllable single seed or nucleus. In assigning cause, or in identifying our imagined core of failure, accuracy does not seem to matter. Being afraid is worse than being wrong. Selecting a scapegoat to carry the interpretive load of an accident or incident is the easy price we pay for our illusion that we actually have control over our risky technologies. This price is the inevitable side effect of the centuries-old pursuit of Baconian control and technological domination over nature. Sending controllers, or pilots, or maintenance technicians to jail may be morally wrenching (but not unequivocally so—remember North, 2000), but it is preferable to its scary alternative: acknowledging that we do not enjoy control over the risky technologies we build and consume.
The alternative would force us to really admit that failure is an emergent property, that "mistake, mishap and disaster are socially organized and systematically produced by social structures," that these mistakes are normal, to be expected because they are "embedded in the banality of organizational life" (Vaughan, 1996, p. xiv). It would force us to acknowledge the relentless inevitability of mistake in organizations, to see that harmful outcomes can occur in the organizations constructed to prevent them, that harmful consequences can occur even when everybody follows the rules. Preferring to be wrong over being afraid in the identification of cause overlaps with the common reflex toward individual responsibility in the West. Various transportation modes (particularly aviation) have exported this bias to less individually oriented cultures as well. In the Western intellectual tradition since the Scientific Revolution, it has seemed self-evident to evaluate ourselves as individuals, bordered by the limits of our minds and
bodies, and evaluated in terms of our own personal achievements. From the Renaissance onward, the individual became a central focus, fueled in part by Descartes' psychology that created "self-contained individuals" (Heft, 2001). The rugged individualism developed on the back of mass European immigration into North America in the late 19th and early 20th centuries accelerated the image of independent, free heroes accomplishing greatness against all odds, and antiheroes responsible for disproportionate evildoing (e.g., Al Capone). Lone antiheroes still play the lead roles in our stories of failure. The notion that it takes teamwork, or an entire organization, an entire industry (think about Alaska 261) to break a system is just too eccentric relative to this cultural prejudice.
There are earlier bases for the dominance of individualism in Western traditions as well. Saint Augustine, the deeply influential moral thinker for Judeo-Christian societies, saw human suffering as occurring not only because of individual human fault (Pagels, 1988), but because of human choice, the conscious, deliberate, rational choice to err. The idea of a rational choice to err is so pervasive in Western thinking that it goes virtually unnoticed, unquestioned, because it makes such common sense. The idea, for example, is that pilots have a choice to take the correct runway but fail to take it. Instead, they make the wrong choice because of attentional deficiencies or motivational shortcomings, despite the cues that were available and the time they had to evaluate those cues. Air-traffic controllers have a choice to see a looming conflict, but elect to pay no attention to it because they think their priorities should be elsewhere. After the fact, it often seems as if people chose to err, despite all available evidence indicating they had it wrong.
The story of Adam's original sin, and especially what Saint Augustine made of it, reveals the same space for conscious negotiation that we retrospectively invoke on behalf of people carrying out safety-critical work in real conditions. Eve had a deliberative conversation with the snake on whether to sin or not to sin, on whether to err or not to err. The allegory emphasizes the same conscious presence of cues and incentives to not err, of warnings to follow rules and not sin, and yet Adam and Eve elected to err anyway. The prototypical story of error and violation and its consequences in Judeo-Christian tradition tells of people who were equipped with the requisite intellect, who had received the appropriate indoctrination (don't eat that fruit), who displayed capacity for reflective judgment, and who actually had the time to choose between a right and a wrong alternative. They then proceeded to pick the wrong alternative, a choice that would make a big difference for their lives and the lives of others. It is likely that, rather than causing the fall into continued error, as Saint Augustine would have it, Adam's original sin portrays how we think about error, and how we have thought about it for ages. The idea of free will permeates our moral thinking, and most probably influences how we look at human performance to this day.
MISMATCH BETWEEN AUTHORITY AND RESPONSIBILITY
Of course this illusion of free will, though dominant in post hoc analyses of error, is at odds with the real conditions under which people perform work: where resource limitations and uncertainty severely constrain the choices open to them.
Van den Hoven (2001) called this "the pressure condition." Operators such as pilots and air-traffic controllers are "narrowly embedded"; they are "configured in an environment and assigned a place which will provide them with observational or derived knowledge of relevant facts and states of affairs" (p. 3). Such environments are exceedingly hostile to the kind of reflection necessary to meet the regulative ideal of individual moral responsibility. Yet this is exactly the kind of reflective idyll we read in the story of Adam and Eve and the kind we retrospectively presume on behalf of operators in difficult situations that led to a mishap.
Human factors refers to this as an authority-responsibility double bind: A mismatch occurs between the responsibility expected of people to do the right thing, and the authority given or available to them to live up to that responsibility. Society expresses its confidence in operators' responsibility through payments, status, symbols, and the like. Yet operators' authority may fall short of that responsibility in many important ways. Operators typically do not have the degrees of freedom assumed by their professional responsibility, for a variety of reasons: Practice is driven by multiple goals that may be incompatible (simultaneously having to achieve maximum capacity utilization, economic aims, customer service, and safety). As Wilkinson (1994, p. 87) remarked: "A lot of lip service is paid to the myth of command residing in the cockpit, to the fantasy of the captain as ultimate decision-maker. But today the commander must first consult with the accountant." Error, then, must be understood as the result of constraints that the world imposes on people's goal-directed behavior. As the local rationality principle dictates, people want to do the right thing, yet features of their work environment limit their authority to act, limit their ability to live up to the responsibility for doing the right thing. This moved Claus Jensen (1996) to say:
there is no longer any point in appealing to the individual worker's own sense of responsibility, morality or decency, when almost all of us are working within extremely large and complex systems . . . According to this perspective, there is no point in expecting or demanding individual engineers or managers to be moral heroes; far better to put all of one's efforts into reinforcing safety procedures and creating structures and processes conducive to ethical behavior. (p. xiii)
Individual authority, in other words, is constrained to the point where moral appeals to individual responsibility are becoming useless. And authority is not only restricted because of the larger structures that people are only small parts of. Authority to assess, decide, and act can be limited simply because of the nature of the situation. Time and other resources for making sense of a situation are lacking; information may not be at hand or may be ambiguous; there may be all kinds of subtle organizational pressures to prefer certain actions over others; and there may be no neutral or additional expertise to draw on. Even Eve was initially alone with the snake. Where was Adam, the only other human available in paradise during those critical moments of seduction into error? Only recent additions to the human factors literature (e.g., naturalistic decision making, ecological task analyses) explicitly took these and other constraints on people's practice into consideration in the design and understanding of work.
Free will is a logical impossibility in cases where there is a mismatch between responsibility and authority, which is to say that free will is always a logical impossibility in real settings where real safety-critical work is carried out. This should invert the culpability criterion when operators or others are being held accountable for their errors. Today it is typically the defendant who has to explain that he or she was constrained in ways that did not allow adequate control over the situation. But such defenses are often hopeless. Outside observers are influenced by hindsight when they look back on available data and choice moments.
As a consequence, they consistently overestimate both the clarity of the situation and the ability to control the outcome. So rather than the defendant having to show that insufficient data and control made the outcome inevitable, it should be up to the claimants, or prosecution, to prove that adequate control was in fact available. Did people have enough authority to live up to their responsibility? Such a proposal, however, amounts to only a marginal adjustment of what may still be dysfunctional and counterproductive accountability relationships. What different models of responsibility could possibly replace current accountability relationships, and do they have any chance? In the adversarial confrontations and defensive posturing that the criminalization of error generates
today, truth becomes fragmented across multiple versions that advocate particular agendas (staying out of jail, limiting corporate liability). This makes learning from the mishap almost impossible. Even making safety improvements in the wake of an accident can get construed as an admission of liability. This robs systems of their most concrete demonstration that they have learned something from the mishap: an actual implementation of lessons learned. Indeed, lessons are not learned before organizations have actually made the changes that those lessons prescribe.
BLAME-FREE CULTURES?
Ideally, there should be accountability without invoking defense mechanisms. Blame-free cultures, for example, though free from blame and associated protective plotting, are not without member responsibility. But blame-free cultures are extremely rare. Examples have been found among Sherpas in Nepal (Douglas, 1992), who pressure each other to settle quarrels peacefully and reduce rivalries with strong informal procedures for reconciliation. Laying blame accurately is considered much less important than a generous treatment of the victim. Sherpas irrigate their social system with a lavish flow of gifts, taxing themselves collectively to ensure nobody goes neglected, and victims are not left exposed to impoverishment or discrimination (Douglas). This mirrors the propensity of Scandinavian cultures for collective taxation to support dense webs of social security. Prosecution of individuals or especially civil lawsuits in the wake of accidents are rare. U.S. responses stand in stark contrast (although criminal prosecution of operators is rare there). Despite a plenitude of litigation (which inflates and occasionally exceeds the compensatory expectations of a few), victims as a group are typically undercompensated. Blame-free cultures may hinge more on consistently generous treatment of victims than on denying that professional accountability exists. They also hinge on finding other expressions of responsibility, of what it means to be a responsible member of that culture.
Holding people accountable can be consistent with being blame-free if transportation industries think in novel ways about accountability. This would involve innovations in relationships among the various stakeholders. Indeed, in order to continue making progress on safety, transportation industries should reconsider and reconstruct accountability relationships between their stakeholders (organizations, regulators, litigators, operators, passengers). In a new form of accountability relationships, operators or managers involved in mishaps could be held accountable by inviting them to tell their story (their account). Such accounts can then be systematized and distributed, and used to propagate vicarious learning for all. Microversions of such accountability relationships have been implemented in many incident-reporting systems, and perhaps their examples could move industries in the direction of as yet elusive blame-free cultures.
The odds, however, may be stacked against attempts to make such progress. The Judeo-Christian ethic of individual responsibility is not just animated by a basic Nietzschean anxiety of losing control. Macrostructural forces are probably at work too. There is evidence that episodes of renewed enlightenment, such as the Scientific Revolution, are accompanied by violent regressions toward supernaturalism and witch hunting. Prima facie, this would be an inconsistency.
How can an increasingly illuminated society simultaneously regress into superstition and scapegoating? One answer may lie in the uncertainties and anxieties brought on by the technological advances and depersonalization that inevitably seem to come with such progress. New, large, complex, and widely extended technological systems (e.g., global aviation, which took just a few decades to expand into what it is today) create displacement, diffusion, and causal uncertainty. A reliance on individual culpability may be the only sure way of recapturing an illusion of control. In contrast, less technologically or industrially developed societies (take the Sherpas as an example again) appear to rely on more benign models of failure and blame, and more on collective responsibility.
In addition, those who do safety-critical work often tie culpability conventions to aspects of their personal biographies. Physician Atul Gawande (2002, p. 73), for example, commented on a recent surgical incident and observed that terms such as systems problems are part of a "dry language of structures, not people . . . something in me, too, demands an acknowledgement of my autonomy, which is also to say my ultimate culpability ... although the odds were against me, it wasn't as if I had no chance of succeeding. Good doctoring is all about making the most of the hand you're dealt, and I failed to do so." The expectation of being held accountable if things go wrong (and, conversely, being responsible if things go right) appears intricately connected to issues of self-identity, where accountability is the other side of professional autonomy and a desire for control. This expectation can engender considerable pride and can make even routine operational work deeply meaningful. But although good doctoring (or any kind of practice) may be making the most of the hand one is dealt, human factors has always been about providing that hand more and better opportunities to do the right thing. Merely leaving the hand with what it is dealt and banking on personal motivation to do the rest takes us back to prehistoric times, when behaviorism reigned and human factors had yet to make its entry into system safety thinking. Accountability and culpability are deeply complex concepts. Disentangling their prerational influences in order to promote systems thinking, and to create an objectively fairer, blame-free culture, may be an uphill struggle. They are, in any case, topics worthy of more research.
References
Aeronautica Civil. (1996). Aircraft accident report: Controlled flight into terrain, American Airlines flight 965, Boeing 757-223, N651AA near Cali, Colombia, December 20, 1995. Bogota, Colombia: Author.
Airliner World. (2001, November). Excel, pp. 77-80.
Air Transport Association of America. (1989, April). National plan to enhance aviation safety through human factors improvements. Washington, DC: Author.
Albright, C. A., Truitt, T. R., Barile, A. B., Vortac, O. U., & Manning, C. A. (1996). How controllers compensate for the lack of flight progress strips (Final Rep. No. DOT/FAA/AM-96/5). Arlington, VA: National Technical Information Service.
Amalberti, R. (2001). The paradoxes of almost totally safe transportation systems. Safety Science, 37, 109-126.
Angell, I. O., & Straub, B. (1999). Rain-dancing with pseudo-science. Cognition, Technology and Work, 1, 179-196.
Baiada, R. M. (1995). ATC biggest drag on airline productivity. Aviation Week and Space Technology, 31, 51-53.
Bainbridge, L. (1987). Ironies of automation. In J. Rasmussen, K. Duncan, & J. Leplat (Eds.), New technology and human error (pp. 271-283). Chichester, England: Wiley.
Batteau, A. W. (2001). The anthropology of aviation and flight safety. Human Organization, 60(3), 201-210.
Beyer, H., & Holtzblatt, K. (1998). Contextual design: Defining customer-centered systems. San Diego, CA: Academic Press.
Billings, C. E. (1996). Situation awareness measurement and analysis: A commentary. In D. J. Garland & M. R. Endsley (Eds.), Experimental analysis and measurement of situation awareness (pp. 1-5). Daytona Beach, FL: Embry-Riddle Aeronautical University Press.
Billings, C. E. (1997). Aviation automation: The search for a human-centered approach. Mahwah, NJ: Lawrence Erlbaum Associates.
Bjorklund, C., Alfredsson, J., & Dekker, S. W. A. (2003). Shared mode awareness of the FMA in commercial aviation: An eye-point of gaze and communication data analysis in a high-fidelity simulator. In E. Hollnagel (Ed.), Proceedings of EAM 2003, the 22nd European Conference on Human Decision Making and Manual Control (pp. 119-126). Linkoping, Sweden: Cognitive Systems Engineering Laboratory, Linkoping University.
Boeing Commercial Airplane Group. (1996). Boeing submission to the American Airlines Flight 965 Accident Investigation Board. Seattle, WA: Author.
Bruner, J. (1990). Acts of meaning. Cambridge, MA: Harvard University Press.
Campbell, R. D., & Bagshaw, M. (1991). Human performance and limitations in aviation. Oxford, England: Blackwell Science.
Capra, F. (1982). The turning point. New York: Simon & Schuster.
Carley, W. M. (1999, January 21). Swissair pilots differed on how to avoid crash. The Wall Street Journal.
Columbia Accident Investigation Board. (2003). Report Volume 1, August 2003. Washington, DC: U.S. Government Printing Office.
Cordesman, A. H., & Wagner, A. R. (1996). The lessons of modern war: Vol. 4. The Gulf War. Boulder, CO: Westview Press.
Croft, J. (2001, July 16). Researchers perfect new ways to monitor pilot performance. Aviation Week and Space Technology, pp. 76-77.
Dawkins, R. (1986). The blind watchmaker. London: Penguin.
Degani, A., Heymann, M., & Shafto, M. (1999). Formal aspects of procedures: The problem of sequential correctness. In Proceedings of the 43rd Annual Meeting of the Human Factors and Ergonomics Society. Houston, TX: Human Factors Society.
Dekker, S. W. A. (2002). The field guide to human error investigations. Bedford, England: Cranfield University Press.
Dekker, S. W. A., & Woods, D. D. (1999). To intervene or not to intervene: The dilemma of management by exception. Cognition, Technology and Work, 1, 86-96.
Della Rocco, P. S., Manning, C. A., & Wing, H. (1990). Selection of air traffic controllers for automated systems: Applications from current research (DOT/FAA/AM-90/13). Arlington, VA: National Technical Information Service.
Dorner, D. (1989). The logic of failure: Recognizing and avoiding error in complex situations. Cambridge, MA: Perseus Books.
Douglas, M. (1992). Risk and blame: Essays in cultural theory. London: Routledge.
Endsley, M. R., Mogford, M., Allendoerfer, K., & Stein, E. (1997). Effect of free flight conditions on controller performance, workload and situation awareness: A preliminary investigation of changes in locus of control using existing technologies. Lubbock, TX: Texas Technical University.
Feyerabend, P. (1993). Against method (3rd ed.). London: Verso.
Feynman, R. P. (1988). "What do you care what other people think?": Further adventures of a curious character. New York: Norton.
Fischoff, B. (1975). Hindsight is not foresight: The effect of outcome knowledge on judgement under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1(3), 288-299.
Fitts, P. M. (1951). Human engineering for an effective air navigation and traffic control system. Washington, DC: National Research Council.
Fitts, P. M., & Jones, R. E. (1947). Analysis of factors contributing to 460 "pilot error" experiences in operating aircraft controls (Memorandum Rep. No. TSEAA-694-12). Dayton, OH: Aero Medical Laboratory, Air Material Command, Wright-Patterson Air Force Base.
Flores, F., Graves, M., Hartfield, B., & Winograd, T. (1988). Computer systems and the design of organizational interaction. ACM Transactions on Office Information Systems, 6, 153-172.
Galison, P. (2000). An accident of history. In P. Galison & A. Roland (Eds.), Atmospheric flight in the twentieth century (pp. 3-44). Dordrecht, The Netherlands: Kluwer Academic.
Galster, S. M., Duley, J. A., Masolanis, A. J., & Parasuraman, R. (1999). Effects of aircraft self-separation on controller conflict detection and workload in mature Free Flight. In M. W. Scerbo & M. Mouloua (Eds.), Automation technology and human performance: Current research and trends (pp. 96-101). Mahwah, NJ: Lawrence Erlbaum Associates.
Gawande, A. (2002). Complications: A surgeon's notes on an imperfect science. New York: Picador.
Geertz, C. (1973). The interpretation of cultures. New York: Basic Books.
Golden-Biddle, K., & Locke, K. (1993). Appealing work: An investigation of how ethnographic texts convince. Organization Science, 4, 595-616.
Heft, H. (2001). Ecological psychology in context: James Gibson, Roger Barker, and the legacy of William James's radical empiricism. Mahwah, NJ: Lawrence Erlbaum Associates.
Helmreich, R. L. (2000). On error management: Lessons from aviation. British Medical Journal, 320, 745-753.
Helmreich, R. L., Klinect, J. R., & Wilhelm, J. A. (1999). Models of threat, error and response in flight operations. In R. S. Jensen (Ed.), Proceedings of the 10th International Symposium on Aviation Psychology. Columbus: The Ohio State University.
Hollan, J., Hutchins, E., & Kirsh, D. (2000). Distributed cognition: Toward a new foundation for human-computer interaction research. ACM Transactions on Computer-Human Interaction, 7(2), 174-196.
Hollnagel, E. (1999). From function allocation to function congruence. In S. W. A. Dekker & E. Hollnagel (Eds.), Coping with computers in the cockpit (pp. 29-53). Aldershot, England: Ashgate.
Hollnagel, E. (Ed.). (2003). Handbook of cognitive task design. Mahwah, NJ: Lawrence Erlbaum Associates.
Hollnagel, E., & Amalberti, R. (2001). The emperor's new clothes: Or whatever happened to "human error"? In S. W. A. Dekker (Ed.), Proceedings of the 4th International Workshop on Human Error, Safety and Systems Development (pp. 1-18). Linkoping, Sweden: Linkoping University.
Hollnagel, E., & Woods, D. D. (1983). Cognitive systems engineering: New wine in new bottles. International Journal of Man-Machine Studies, 18, 583-600.
Hughes, J. A., Randall, D., & Shapiro, D. (1993). From ethnographic record to system design: Some experiences from the field. Computer Supported Collaborative Work, 1, 123-141.
International Civil Aviation Organization. (1998). Human factors training manual (ICAO Doc. No. 9683-AN/950). Montreal, Quebec: Author.
Jensen, C. (1996). No downlink: A dramatic narrative about the Challenger accident and our time. New York: Farrar, Strauss, Giroux.
Joint Aviation Authorities. (2001). Human factors in maintenance working group report. Hoofddorp, The Netherlands: Author.
Joint Aviation Authorities. (2003). Advisory Circular Joint ACJ 25.1329: Flight guidance system, Attachment 1 to NPA (Notice of Proposed Amendment) 25F-344. Hoofddorp, The Netherlands: Author.
Kern, T. (1998). Flight discipline. New York: McGraw-Hill.
Klein, G. A. (1998). Sources of power: How people make decisions. Cambridge, MA: MIT Press.
Kohn, L. T., Corrigan, J. M., & Donaldson, M. (Eds.). (1999). To err is human: Building a safer health system. Washington, DC: Institute of Medicine.
Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.
Langewiesche, W. (1998). Inside the sky: A meditation on flight. New York: Pantheon Books.
Lanir, Z. (1986). Fundamental surprise. Eugene, OR: Decision Research.
Lautman, L., & Gallimore, P. L. (1987). Control of the crew caused accident: Results of a 12-operator survey. Boeing Airliner, April-June, 1-6.
Lerner, J. S., & Tetlock, P. E. (1999). Accounting for the effects of accountability. Psychological Bulletin, 125, 255-275.
Leveson, N. (2002). A new approach to system safety engineering. Cambridge, MA: Aeronautics and Astronautics, Massachusetts Institute of Technology.
Mackay, W. E. (2000). Is paper safer? The role of paper flight strips in air traffic control. ACM Transactions on Computer-Human Interaction, 6, 311-340.
McDonald, N., Corrigan, S., & Ward, M. (2002, June). Well-intentioned people in dysfunctional systems. Keynote paper presented at the 5th Workshop on Human Error, Safety and Systems Development, Newcastle, Australia.
Meister, D. (2003). The editor's comments. Human Factors Ergonomics Society COTG Digest, 5, 2-6.
Metzger, U., & Parasuraman, R. (1999). Free Flight and the air traffic controller: Active control versus passive monitoring. In Proceedings of the Human Factors and Ergonomics Society 43rd annual meeting. Houston, TX: Human Factors Society.
Mumaw, R. J., Sarter, N. B., & Wickens, C. D. (2001). Analysis of pilots' monitoring and performance on an automated flight deck. In Proceedings of the 11th International Symposium on Aviation Psychology. Columbus: Ohio State University.
National Aeronautics and Space Administration. (2000, March). Report on project management in NASA, by the Mars Climate Orbiter Mishap Investigation Board. Washington, DC: Author.
National Transportation Safety Board. (1974). Delta Air Lines Douglas DC-9-31, Boston, MA, 7/31/73 (NTSB Rep. No. AAR-74/03). Washington, DC: Author.
National Transportation Safety Board. (1995). Aircraft accident report: Flight into terrain during missed approach, USAir flight 1016, DC-9-31, N954VJ, Charlotte Douglas International Airport, Charlotte, North Carolina, July 2, 1994 (NTSB Rep. No. AAR-95/03). Washington, DC: Author.
National Transportation Safety Board. (1997). Grounding of the Panamanian passenger ship Royal Majesty on Rose and Crown shoal near Nantucket, Massachusetts, June 10, 1995 (NTSB Rep. No. MAR-97/01). Washington, DC: Author.
National Transportation Safety Board. (2002). Loss of control and impact with Pacific Ocean, Alaska Airlines Flight 261, McDonnell Douglas MD-83, N963AS, about 2.7 miles north of Anacapa Island, California, January 31, 2000 (NTSB Rep. No. AAR-02/01). Washington, DC: Author.
Neisser, U. (1976). Cognition and reality: Principles and implications of cognitive psychology. San Francisco: Freeman Press.
North, D. M. (2000, May 15). Let judicial system run its course in crash cases. Aviation Week and Space Technology, p. 66.
North, D. M. (2002, February 4). Oil and water, cats and dogs. Aviation Week and Space Technology, p. 70.
O'Hare, D., & Roscoe, S. (1990). Flightdeck performance: The human factor. Ames: Iowa State University Press.
Orasanu, J. M. (2001). The role of risk assessment in flight safety: Strategies for enhancing pilot decision making. In Proceedings of the 4th International Workshop on Human Error, Safety and Systems Development (pp. 83-94). Linkoping, Sweden: Linkoping University.
Orasanu, J. M., & Connolly, T. (1993). The reinvention of decision making. In G. A. Klein, J. Orasanu, R. Calderwood, & C. E. Zsambok (Eds.), Decision making in action: Models and methods (pp. 3-20). Norwood, NJ: Ablex.
Pagels, E. (1988). Adam, Eve and the serpent. London: Weidenfeld & Nicolson.
Parasuraman, R., Molloy, R., & Singh, I. (1993). Performance consequences of automation-induced complacency. The International Journal of Aviation Psychology, 3(1), 1-23.
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 30, 286-297.
Perrow, C. (1984). Normal accidents: Living with high-risk technologies. New York: Basic Books.
Rasmussen, J., & Svedung, I. (2000). Proactive risk management in a dynamic society. Karlstad, Sweden: Swedish Rescue Services Agency.
Reason, J. T. (1990). Human error. Cambridge, England: Cambridge University Press.
Reason, J. T., & Hobbs, A. (2003). Managing maintenance error: A practical guide. Aldershot, England: Ashgate.
Rochlin, G. I. (1999). Safe operation as a social construct. Ergonomics, 42, 1549-1560.
Rochlin, G. I., LaPorte, T. R., & Roberts, K. H. (1987). The self-designing high-reliability organization: Aircraft carrier flight operations at sea. Naval War College Review, Autumn 1987.
Ross, G. (1995). Flight strip survey report. Canberra, Australia: TAAATS TOI.
Sacks, O. (1998). The man who mistook his wife for a hat. New York: Touchstone.
Sanders, M. S., & McCormick, E. J. (1997). Human factors in engineering and design (7th ed.). New York: McGraw-Hill.
Sarter, N. B., & Woods, D. D. (1997). Teamplay with a powerful and independent agent: A corpus of operational experiences and automation surprises on the Airbus A320. Human Factors, 39, 553-569.
Shappell, S. A., & Wiegmann, D. A. (2001). Applying reason: The human factors analysis and classification system (HFACS). Human Factors and Aerospace Safety, 1, 59-86.
Singer, G., & Dekker, S. W. A. (2000). Pilot performance during multiple failures: An empirical study of different warning systems. Journal of Transportation Human Factors, 2, 63-76.
Smith, K. (2001). Incompatible goals, uncertain information and conflicting incentives: The dispatch dilemma. Human Factors and Aerospace Safety, 1, 361-380.
Snook, S. A. (2000). Friendly fire: The accidental shootdown of US Black Hawks over Northern Iraq. Princeton, NJ: Princeton University Press.
Starbuck, W. H., & Milliken, F. J. (1988). Challenger: Fine-tuning the odds until something breaks. Journal of Management Studies, 25, 319-340.
Statens Haverikommission [Swedish Accident Investigation Board]. (2000). Tillbud vid landning med flygplanet LN-RLF den 23/6 pa Vaxjo/Kronoberg flygplats, G län (Rapport RL 2000:38) [Incident during landing with aircraft LN-RLF on June 23 at Vaxjo/Kronoberg airport]. Stockholm, Sweden: Author.
Statens Haverikommission [Swedish Accident Investigation Board]. (2003). Tillbud mellan flygplanet LN-RPL och en bogsertraktor pa Stockholm/Arlanda flygplats, AB län, den 27 oktober 2002 (Rapport RL 2003:47) [Incident between aircraft LN-RPL and a tow-truck at Stockholm/Arlanda airport, October 27, 2002]. Stockholm, Sweden: Author.
Suchman, L. A. (1987). Plans and situated actions: The problem of human-machine communication. Cambridge, England: Cambridge University Press.
Tuchman, B. W. (1981). Practicing history: Selected essays. New York: Norton.
Turner, B. (1978). Man-made disasters. London: Wykeham.
Varela, F. J., Thompson, E., & Rosch, E. (1991). The embodied mind: Cognitive science and human experience. Cambridge, MA: MIT Press.
Vaughan, D. (1996). The Challenger launch decision: Risky technology, culture and deviance at NASA. Chicago: University of Chicago Press.
Vaughan, D. (1999). The dark side of organizations: Mistake, misconduct, and disaster. Annual Review of Sociology, 25, 271-305.
van den Hoven, M. J. (2001). Moral responsibility and information technology. Rotterdam, The Netherlands: Erasmus University Center for Philosophy of ICT.
Vicente, K. (1999). Cognitive work analysis: Toward safe, productive, and healthy computer-based work. Mahwah, NJ: Lawrence Erlbaum Associates.
Weick, K. E. (1993). The collapse of sensemaking in organizations. Administrative Science Quarterly, 38, 628-652.
Weick, K. E. (1995). Sensemaking in organizations. London: Sage.
Weingart, P. (1991). Large technical systems, real life experiments, and the legitimation trap of technology assessment: The contribution of science and technology to constituting risk perception. In T. R. LaPorte (Ed.), Social responses to large technical systems: Control or anticipation (pp. 8-9). Amsterdam: Kluwer.
Wiener, E. L. (1988). Cockpit automation. In E. L. Wiener & D. C. Nagel (Eds.), Human factors in aviation (pp. 433-462). San Diego, CA: Academic Press.
Wilkinson, S. (1994, February-March). The Oscar November incident. Air & Space, 80-87.
Woods, D. D. (1993). Process-tracing methods for the study of cognition outside of the experimental laboratory. In G. A. Klein, J. Orasanu, R. Calderwood, & C. E. Zsambok (Eds.), Decision making in action: Models and methods (pp. 228-251). Norwood, NJ: Ablex.
Woods, D. D. (2003, October 29). Creating foresight: How resilience engineering can transform NASA's approach to risky decision making. Hearing before the U.S. Senate Committee on Commerce, Science and Transportation, John McCain, chair, Washington, DC.
Woods, D. D., & Dekker, S. W. A. (2001). Anticipating the effects of technology change: A new era of dynamics for Human Factors. Theoretical Issues in Ergonomics Science, 1, 272-282.
Woods, D. D., Johannesen, L. J., Cook, R. I., & Sarter, N. B. (1994). Behind human error: Cognitive systems, computers and hindsight. Dayton, OH: CSERIAC.
Woods, D. D., Patterson, E. S., & Roth, E. M. (2002). Can we ever escape from data overload? A cognitive systems diagnosis. Cognition, Technology, and Work, 4, 22-36.
Woods, D. D., & Shattuck, L. G. (2000). Distant supervision: Local action given the potential for surprise. Cognition, Technology and Work, 2(4), 242-245.
Wright, P. C., & McCarthy, J. (2003). Analysis of procedure following as concerned work. In E. Hollnagel (Ed.), Handbook of cognitive task design (pp. 679-700). Mahwah, NJ: Lawrence Erlbaum Associates.
Wynne, B. (1988). Unruly technology: Practical rules, impractical discourses, and public understanding. Social Studies of Sciences, 18, 147-167.
Xiao, Y., & Vicente, K. J. (2000). A framework for epistemological analysis in empirical (laboratory and field) studies. Human Factors, 42, 87-101.
Yerkes, R. M., & Dodson, J. D. (1908). The relation of strength of stimulus to rapidity of habit formation. Journal of Comparative and Neurological Psychology, 18, 459-482.
Los errores humanos están sistemáticamente conectados a características de las herramientas y tareas de las personas. Puede ser difícil predecir cuándo o qué tan a menudo ocurrirán los errores (a pesar de que las técnicas de fiabilidad humana ciertamente lo han intentado). Con un examen crítico del sistema en que las personas trabajan, sin embargo, no es tan difícil anticipar dónde ocurrirán los errores. Factores Humanos ha utilizado esta premisa desde siempre: la noción de diseñar sistemas resistentes y tolerantes al error se basa en ello. Factores Humanos fue precedido por una era de hielo mental, la del comportamientismo, en que cualquier estudio de la mente era visto como ilegítimo y no científico. El comportamientismo fue en sí una psicología de protesta, acuñada en agudo contraste con la introspección experimental de Wundt que lo precedió. Si el comportamientismo fue una psicología de protesta, entonces factores humanos fue una psicología pragmática. La Segunda Guerra Mundial trajo un paso tan furioso de desarrollo tecnológico que el comportamientismo se quedó corto. Surgieron problemas prácticos en la vigilancia y la toma de decisiones del operador que fueron totalmente inmunes al repertorio de exhortaciones motivacionales del comportamientismo de Watson. Hasta ese punto, la psicología había asumido ampliamente que el mundo era fijo, y que los humanos tenían que adaptarse a sus demandas a través de la selección y el entrenamiento. Factores Humanos mostró que el mundo no era fijo: cambios en el ambiente podían fácilmente llevar a incrementos en el desempeño no alcanzables mediante intervenciones comportamientistas. En el comportamientismo, el desempeño tenía que adaptarse a las características del mundo. En factores humanos, las características del mundo se adaptaban a los límites y capacidades del desempeño. Como una psicología de lo pragmático, factores humanos adoptó la visión de ciencia y el método científico Cartesiano-Newtonianos (tal como Wundt y Watson habían hecho). Descartes y Newton fueron jugadores dominantes en la revolución científica del siglo XVII.
Esta transformación total en el pensamiento instaló una creencia en la absoluta certeza del conocimiento científico, especialmente en la cultura occidental. El ánimo de la ciencia fue de alcanzar el
  • 144. control al derivar leyes de la naturaleza generales e idealmente matemáticas (tal como nosotros intentamos hacer para el desempeño de la persona y el sistema). Una herencia de esto puede ser vista todavía en factores humanos, particularmente en la predominación de experimentos, lo nomotético más que la inclinación ideográfica de su investigación y una fuerte fe en el realismo de los hechos observados. También puede ser reconocida en las estrategias reductivas con que se relacionan los factores humanos y la seguridad operacional de sistemas para lidiar con la complejidad. La solución de problemas Cartesiano-Newtoniana es analítica. Consiste en romper con los pensamientos y problemas en piezas y en arreglarlas en algún orden lógico. El fenómeno necesita ser descompuesto en partes más básicas y su totalidad puede ser explicada exhaustivamente haciendo referencia a sus componentes constituyentes y sus interacciones. En factores humanos y seguridad operacional de sistemas, se entiende mente como una construcción tipo caja, con un intercambio mecánico en representaciones internas; trabajo está separado en pasos procedimentales a través de análisis de tareas jerárquicos; las organizaciones no son orgánicas o dinámicas, sino que están constituidas por estratos estáticos y compartimientos y lazos; y seguridad operacional es una propiedad estructural que puede ser entendida en términos de sus mecanismos de orden más bajo (sistemas de reporte, tasas de error y auditorías, la función de la administración de seguridad operacional en el diagrama organizacional, y sistemas de calidad). Estas visiones están con nosotros hoy. Dominan el pensamiento en factores humanos y seguridad operacional de sistemas. El problema es que extensiones lineares de estas mismas nociones no pueden trasladarnos dentro del futuro. Las una vez pragmáticas ideas de factores humanos y seguridad de sistemas están cayendo detrás de los problemas prácticos que han comenzado a surgir en el mundo de hoy. Podríamos estar dentro para una repetición de los cambios que vinieron con los desarrollos tecnológicos de la II Guerra Mundial, donde el comportamientismo mostró quedar corto. Esta vez podría ser el caso de factores humanos y seguridad operacional de sistemas. Los desarrollos contemporáneos, sin embargo, no son solo técnicos. Hay sociotécnicos: La comprensión sobre qué hace a los sistemas seguros o frágiles requiere más que conocimiento sobre la interfase hombre-máquina. Como David Meister señaló recientemente (y el ha estado cerca por un tiempo), factores humanos no ha progresado mucho desde 1950. “Hemos tenido 50 años de investigación”, él se pregunta retóricamente, “¿pero cuanto más de lo que sabíamos en un principio, sabemos?” (Meister 2003, p. 5). No es que las propuestas tomadas por factores humanos y seguridad operacional de sistemas ya no sean útiles, sino que su utilidad sólo puede ser apreciada realmente cuando vemos sus límites. Este libro no es sino un capítulo en una transformación más larga que ha comenzado a identificar las profundamente enraizadas restricciones y los nuevos puntos de influencia en nuestras visiones de factores humanos y seguridad operacional de sistemas. Las 10 preguntas acerca del error humano no son solo preguntas sobre el error humano como un fenómeno, si es que lo son (y si el error humano es algo en y por sí mismo, en primer lugar). En realidad son preguntas acerca de factores humanos y seguridad operacional de sistemas como disciplinas, y en qué lugar se encuentran hoy. 
En formular estas preguntas acerca del error, y en trazar las respuestas a ellas, este libro intenta mostrar dónde nuestro pensamiento
corriente está limitado; dónde nuestro vocabulario, nuestros modelos y nuestras ideas están limitando el progreso. En cada capítulo, el libro intenta entregar indicaciones hacia nuevas ideas y modelos que tal vez puedan arreglárselas mejor con la complejidad de los problemas que nos encaran ahora. Uno de esos problemas es que sistemas aparentemente seguros pueden desviarse y fallar. Desviarse en dirección a los márgenes de seguridad operacional ocurre bajo presiones de escasez y competencia. Está relacionado con la opacidad de los sistemas sociotécnicos grandes y complejos, y con los patrones de información en que los integrantes basan sus decisiones y tratos. Derivar en fallas está asociado con los procesos organizacionales normales de adaptación. Las fallas organizacionales en sistemas seguros no están precedidas por fallas, ni por el quiebre o la carencia de calidad de componentes aislados. De hecho, la falla organizacional en sistemas seguros está precedida por trabajo normal, por personas normales haciendo trabajo normal en organizaciones aparentemente normales. Esto choca seriamente con la definición de un incidente, y puede minar el valor de reportar incidentes como una herramienta para aprender más allá de un cierto nivel de seguridad operacional. El margen entre el trabajo normal y el incidente es claramente elástico y está sujeto a revisión incremental. Con cada pequeño paso fuera de las normas previas, el éxito pasado puede ser tomado como una garantía de seguridad operacional futura. El incrementalismo acerca el sistema completo, muesca a muesca, a la línea de derrumbe, pero sin indicaciones empíricas poderosas de que esté encaminado de esa forma. Los modelos corrientes de factores humanos y seguridad operacional de sistemas no pueden lidiar con la derivación hacia fallas. Ellos requieren fallas como un prerrequisito para las fallas. Aún están orientados hacia el encuentro de fallas (por ejemplo, errores humanos, hoyos en las capas de defensa, problemas latentes, deficiencias organizacionales y patógenos residentes), y se apoyan en niveles de trabajo y estructura dictados externamente, en vez de tomar las cuentas internas (sobre qué es una falla vs. trabajo normal) como cánones. Los procesos de toma de sentido, de creación de racionalidad local por quienes de verdad realizan los miles de pequeños y mayores tratos que transportan un sistema a lo largo de su curso de deriva, yacen fuera del léxico actual de factores humanos. Los modelos corrientes típicamente ven a las organizaciones como máquinas Newtonianas-Cartesianas, con componentes y nexos entre ellos. Los contratiempos son modelados como una secuencia de eventos (acciones y reacciones) entre un disparador y un resultado. Tales modelos no pueden pronunciarse acerca de la construcción de fallas latentes, ni sobre la gradual e incremental soltura o pérdida de control. Los procesos de erosión de las restricciones, de detrimento de la seguridad operacional, de desviación hacia los márgenes, no pueden ser capturados, porque los enfoques estructurales son metáforas estáticas para formas resultantes, no modelos dinámicos orientados hacia procesos de formación. Newton y Descartes, con sus particulares estudios en ciencias naturales, siguen firmemente presentes en los factores humanos, la seguridad operacional de sistemas y también en otras áreas.
El paradigma de procesamiento de información, por ejemplo, tan útil para explicar tempranamente los problemas de transferencia de información entre radar y radio operadores en la II Guerra Mundial, solo ha colonizado la investigación de factores humanos. Aún es una fuerza dominante,
  • 146. reforzado por los experimentos del Spartan laboratory, que parecen confirmar su utilidad y validez. El paradigma tiene mente mecanizada, partida en componentes separados (por ejemplo, memoria de trabajo, memoria de corto plazo y memoria de largo plazo) con nexos entre medio. Newton habría amado su mecánica. A Descartes también le habría gustado: Una separación clara entre mente y mundo solucionaba (o circunvalada, más bien) una serie de problemas asociados con las transacciones entre ambos. Un modelo mecánico tal como procesamiento de información, claro que mantiene apego especial por la ingeniería y otros consumidores de los resultados de la investigación de factores humanos. Dictado pragmático salvando las diferencias entre práctica y ciencia, y el tener un modelo cognitivo similar a un aparato técnico familiar para gente aplicada, es una forma poderosa de hacer sólo eso. Pero no existe razón empírica para restringir nuestra comprensión de actitudes, memorias o heurísticos, como disposiciones codificadas mentalmente, como ciertos contenidos de conciencia con determinadas fechas de vencimiento. De hecho, tal modelo restringe severamente nuestra habilidad para comprender cómo las personas utilizan el habla y la acción para construir un orden perceptual y social; cómo, a través del discurso y la acción, las personas crean los ambientes que, a cambio, determinan la acción posterior y asesorías posibles, y que restringen lo que, en consecuencia, será visto como discurso aceptable o decisiones racionales. No podemos comenzar a entender la deriva en fallas, sin comprender cómo grupos de personas, a través de cálculo y acción, ensamblan versiones del mundo en las que ellos calculan y actúan. El procesamiento de la información cabe dentro de una perspectiva metateórica mayor y dominante, que toma al individuo como su foco central (Heft, 2001). Esta visión, también, es una herencia de la Revolución Científica, la que ha popularizado incrementadamente la idea humanista de un “individuo auto contenido”. Para la mayoría de la psicología, esto ha significado que todos los procesos dignos de estudio toman lugar dentro de los márgenes del cuerpo (o mente), algo epitomizado por el enfoque mentalista del procesamiento de información. En su incapacidad para tratar significativamente la deriva hacia la falla, que interconecta factores individuales, institucionales, sociales y técnicos, los factores humanos y la seguridad operacional de sistemas están actualmente pagando por su exclusión teórica de los procesos sociales y transaccionales, entre los individuos y el mundo. El componencialismo y la fragmentación de la investigación de factores humanos aún es un obstáculo al progreso en este sentido. Un estiramiento de la unidad de análisis (como lo hecho en las ideas de ingeniería de sistemas cognitivos y cognición distribuida), y una llamada actuar centralmente en comprender los cálculos y el pensamiento, han sido formas de lidiar con los nuevos desarrollos prácticos para los que los factores humanos y la seguridad de sistemas, no estaban preparados. El énfasis individualista del protestantismo y la iluminación, también reboza de ideas sobre control y culpa. ¿Debemos culpar a las personas por sus errores? Los sistemas sociotécnicos han crecido en complejidad y tamaño, moviendo a algunos a decir que no tiene sentido esperar o demandar de los integrantes (ingenieros, administradores, operadores), que giren en torno a algún ideal moral reflectivo. 
Presiones de escasez y competencia, han logrado convertirse insidiosamente en mandatos organizacionales e individuales, los que a cambio, restringen severamente la racionalidad y opciones (y por ende autonomía), de
todos los actores en el interior. Ya sólo los antihéroes continúan teniendo roles líderes en nuestras historias de fallas. El individualismo aún es crucial para la propia identidad en la modernidad. La idea de que hace falta un equipo de trabajo, una organización entera, o toda una industria para quebrar un sistema (como se ilustró mediante los casos de deriva en fallas) resulta muy poco convencional respecto de nuestras preconcepciones culturales heredadas. Incluso antes de llegar a episodios complejos de acción y responsabilidad, podemos reconocer la prominencia de la deconstrucción y el componencialismo Newtoniano-Cartesianos en mucha investigación de factores humanos. Por ejemplo: las nociones empiristas de una percepción de elementos que gradualmente se convierten en significado, a través de etapas de procesamiento mental, son nociones teóricas legítimas hoy. El empirismo fue otrora una fuerza en la historia de la psicología. Incluso eclipsado por el paradigma de procesamiento de información, sus principios centrales han reaparecido, por ejemplo, en las teorías de conciencia situacional. Al adoptar un modelo cultural como tal desde una comunidad aplicada y someterlo a escrutinio científico putativo, por supuesto que los factores humanos encuentran su ideal pragmático. Los modelos culturales abarcan los problemas de los factores humanos como disciplina aplicada. Pocas teorías pueden salvar el abismo entre investigadores y practicantes mejor que aquellas que toman y disecan el vernáculo de los practicantes para su estudio científico. Pero los modelos culturales vienen con una etiqueta de precio epistemológica. La investigación que se adjudica indagar un fenómeno (digamos, conciencia situacional dividida, o complacencia), pero que no define ese fenómeno (porque, como modelo cultural, se supone que todos saben lo que significa), no puede ser falsada mediante el contacto con la realidad empírica. Ello deja a tal investigador de factores humanos sin el mayor mecanismo de control científico desde Karl Popper. Conectado al procesamiento de información, y al enfoque experimental de muchos problemas de factores humanos, está un sesgo cuantitativo, defendido por primera vez en la psicología por Wilhelm Wundt, en su laboratorio de Leipzig. A pesar de que Wundt rápidamente tuvo que admitir que una cronometría de la mente era una meta muy audaz de la investigación, los proyectos experimentales de investigación sobre factores humanos aún pueden reflejar versiones pálidas de su ambición. Contar, medir, categorizar y analizar estadísticamente son las herramientas dominantes del oficio, mientras que las investigaciones cualitativas son a menudo desechadas por subjetivas y no científicas. Los factores humanos tienen una orientación realista, creyendo que los hechos empíricos son aspectos estables y objetivos de una realidad que existe independiente del observador o de su teoría. Nada de esto hace menos reales los hechos generados mediante experimentos, para aquellos que observan, publican, o leen acerca de ellos. Sin embargo, siguiendo a Thomas Kuhn (1962), esta realidad debe ser vista por lo que es: un acuerdo negociado implícitamente entre investigadores de pensamiento similar, más que un común denominador accesible a todos. No hay árbitro final aquí. Es posible que un enfoque experimental, componencial, pueda disfrutar de un privilegio epistemológico.
Pero ello tambien significa que no hay imperativo automático para únicamente sostenerse por la investigación legítima, como se ve a veces en la corriente principal de factores humanos. Las formas de obtener acceso a la realidad
empírica son infinitamente negociables, y su aceptación es una función de qué tan bien se conforman a la visión de mundo de aquellos a quienes el investigador apela. La persistente supremacía cuantitativista (particularmente en los factores humanos norteamericanos) se ve apesadumbrada con este tipo de autoridad consensuada (debe ser bueno porque todos lo están haciendo). Tal histéresis metodológica podría tener que ver más con los miedos primarios de ser marcado como "no científico" (los miedos compartidos por Wundt y Watson) que con un retorno estable de incrementos significativos de conocimiento generados por la investigación. El cambio tecnológico dio impulso a los pensamientos de factores humanos y seguridad de sistemas. Las demandas prácticas puestas por los cambios tecnológicos envolvieron a los factores humanos y la seguridad de sistemas con el espíritu pragmático que hasta hoy tienen. Pero lo pragmático deja de ser pragmático si no encaja con las demandas creadas por aquello que está sucediendo ahora a nuestro alrededor. El paso del cambio sociotecnológico no tiende a desacelerar pronto. Si creemos que la II Guerra Mundial generó una gran cantidad de cambios interesantes, dando a luz a los factores humanos como disciplina, entonces podríamos estar viviendo en tiempos incluso más excitantes hoy. Si nos mantenemos haciendo lo que hemos estado realizando en factores humanos y seguridad de sistemas, simplemente porque nos ha funcionado en el pasado, podríamos llegar a ser uno de esos sistemas que derivan hacia la falla. Lo pragmático requiere que nosotros también nos adaptemos, para arreglárnoslas mejor con la complejidad del mundo que nos enfrenta hoy. Nuestros éxitos pasados no son garantía de logros futuros continuados.

Prólogo de la serie.

Barry H. Kantowitz
Battelle Human Factors Transportation Center

El rubro del transporte es importante, por razones tanto prácticas como teóricas. Todos nosotros somos usuarios de sistemas de transporte como operadores, pasajeros y consumidores. Desde un punto de vista científico, el rubro del transporte ofrece una oportunidad de crear y probar modelos sofisticados de comportamiento y cognición humanos. Esta serie cubre los aspectos práctico y teórico de los factores humanos en el transporte, con un énfasis en su interacción. La serie está concebida como un foro para investigadores e ingenieros interesados en cómo funcionan las personas dentro de sistemas de transporte. Todos los modos de transporte son relevantes, y todos los esfuerzos en factores humanos y ergonomía que tienen implicancias explícitas para los sistemas de transporte caen dentro del alcance de la serie. Los esfuerzos analíticos son importantes para relacionar teoría y datos. El nivel de análisis puede ser tan pequeño como una persona, o de espectro internacional. Los datos empíricos pueden provenir de un amplio rango de metodologías, incluyendo investigación de laboratorio, estudios simulados, seguimiento de pruebas, pruebas operacionales, trabajo en el campo, revisiones de diseños, o peritajes. Este amplio espectro está pensado para maximizar la utilidad de la serie para lectores con trasfondos distintos.
Espero que la serie sea útil para profesionales en las disciplinas de factores humanos, ergonomía, ingeniería de transportes, psicología experimental, ciencia cognitiva, sociología e ingeniería de seguridad operacional. Está orientada a especialistas de transporte en la industria, el gobierno o la academia, así como también al investigador en busca de una base de pruebas para nuevas ideas acerca de la interfase entre las personas y los sistemas complejos. Este libro, si bien se enfoca en el error humano, ofrece una visión de sistema particularmente bienvenida en los factores humanos del transporte. Una meta mayor de esta serie de libros es relacionar la teoría y la práctica de factores humanos. El autor merece reconocimiento por formular preguntas que no sólo relacionan teoría y práctica, sino que fuerzan al lector a evaluar las clases de teoría que se aplican a los factores humanos. Los enfoques de información tradicionales, derivados del modelo de canal limitado que formó las bases originales para el trabajo teórico en factores humanos, son escrutados. Enfoques más nuevos, tales como la conciencia situacional, que surgieron de deficiencias en el modelo de teoría de la información, son criticados por tratarse sólo de modelos culturales carentes de rigor científico. Espero que este libro engendre un vigoroso debate sobre qué clases de teoría sirven mejor a la ciencia de factores humanos. Si bien las diez preguntas ofrecidas aquí forman una base para el debate, existen más de diez respuestas posibles. Los libros posteriores en esta serie continuarán buscando estas respuestas mediante la entrega de perspectivas prácticas y teóricas sobre los factores humanos en el transporte.

Nota del Autor.

Sidney Dekker es profesor de Factores Humanos en la Universidad de Lund, Suecia. Recibió un M.A. en psicología organizacional de la University of Nijmegen y un M.A. en psicología experimental de la Leiden University, ambas en los Países Bajos. Obtuvo su Ph.D. en Ingeniería de Sistemas Cognitivos de la Ohio State University. Ha trabajado previamente para la Public Transport Corporation en Melbourne, Australia; la Massey University School of Aviation, Nueva Zelanda; y British Aerospace. Sus especialidades e intereses investigativos son el error humano, la investigación de accidentes, los estudios de campo, el diseño representativo y la automatización. Ha tenido alguna experiencia como piloto, entrenado en material DC-9 y Airbus A340. Sus libros previos incluyen The Field Guide to Human Error Investigations (2002).

Capítulo 1. ¿Fue Falla Mecánica o Error Humano?

Estos son tiempos excitantes y competitivos para los factores humanos y la seguridad operacional de sistemas. Y existen indicaciones de que no estamos completamente bien equipados para ellos. Hay un reconocimiento creciente de que los accidentes (un accidente de avión comercial, un desastre de un transbordador espacial) están intrincadamente ligados al funcionamiento de las organizaciones e instituciones aledañas. La operación de aviones de aerolíneas comerciales o transbordadores espaciales o traslados de pasajeros,
engendra vastas redes de organizaciones de apoyo, de mejoramiento y avance, de control y regulación. Las tecnologías complejas no pueden existir sin estas organizaciones e instituciones – transportadores, reguladores, agencias de gobierno, fabricantes, subcontratistas, instalaciones de mantenimiento, grupos de entrenamiento – que, en principio, están diseñadas para proteger y dar seguridad a su operación. Su mandato real se orienta a no tener accidentes. Desde el accidente nuclear de Three Mile Island, en 1979, sin embargo, las personas se percatan cada vez más de que las mismas organizaciones destinadas a mantener una tecnología segura y estable (operadores humanos, reguladores, la administración, el mantenimiento) están en realidad entre los mayores contribuyentes al quiebre. Las fallas socio-tecnológicas son imposibles sin tales contribuciones.
A pesar de este reconocimiento creciente, los factores humanos y la seguridad operacional de sistemas dependen de un vocabulario basado en una concepción particular de las ciencias naturales, derivada de sus raíces en la ingeniería y en la psicología experimental. Este vocabulario, con su uso sutil de metáforas, imágenes e ideas, está cada vez menos a la altura de las demandas interpretativas puestas por los accidentes organizacionales modernos. El vocabulario expresa una visión mundial tal vez apropiada para las fallas técnicas, pero incapaz de abrazar y penetrar las áreas relevantes de las fallas socio-técnicas: esas fallas que incorporan los efectos interconectados de la tecnología y de la complejidad social organizada que circunda su uso. Es decir, la mayor parte de las fallas de hoy. Cualquier lenguaje, y la visión mundial que lo acompaña, impone limitaciones a nuestro entendimiento de la falla. Sin embargo, estas limitaciones se están volviendo ahora cada vez más evidentes y apremiantes.
Con el crecimiento en el tamaño y la complejidad de los sistemas, la naturaleza de los accidentes está cambiando (accidentes de sistemas, fallas sociotécnicas). La escasez y la competencia por los recursos significan que los sistemas empujan cada vez más sus operaciones hacia los bordes de sus envolturas de seguridad. Tienen que hacerlo para permanecer exitosos en sus ambientes dinámicos. Los retornos comerciales de operar en los límites son mayores, pero las diferencias entre tener y no tener un accidente superan caóticamente los márgenes disponibles. Los sistemas abiertos son arrastrados continuamente hacia los bordes de sus áreas de seguridad operacional, y los procesos que impulsan tal migración no son sencillos de reconocer o controlar, como tampoco lo es la ubicación exacta de los márgenes. Los sistemas grandes y complejos se ven capaces de adquirir una histéresis, una oscura voluntad propia, con la que derivan hacia una mayor resiliencia o hacia los bordes de la falla.
Al mismo tiempo, el veloz avance de los cambios tecnológicos crea nuevos tipos de peligros, especialmente aquellos que vienen con una mayor dependencia de la tecnología computacional. Tanto los sistemas sociales como los de ingeniería (y su interrelación) dependen de un volumen cada vez mayor de tecnología de información. A pesar de que nuestra velocidad computacional y el acceso a la información pudieran parecer, en principio, una ventaja de seguridad operacional, nuestra habilidad para dar sentido a la información no mantiene el paso con nuestra habilidad para recolectarla y generarla. Al conocer más, puede que en realidad conozcamos mucho menos.
Administrar la seguridad operacional en base a números (incidentes, conteos de error, amenazas a la seguridad operacional), como si la
seguridad operacional fuera sólo otro indicador en un modelo de negocios al estilo Harvard, puede crear una falsa impresión de racionalidad y control administrativo. Puede ignorar variables de orden más alto capaces de develar la verdadera naturaleza y dirección de la deriva del sistema. Podría venir, además, al costo de comprensiones más profundas del funcionamiento socio-técnico real.
DECONSTRUCCIÓN, DUALISMO Y ESTRUCTURALISMO.
¿Entonces cuál es este idioma, y cuál la visión mundial técnica obsoleta que representa? Las características que lo definen son la deconstrucción, el dualismo y el estructuralismo. Deconstrucción significa que el funcionamiento de un sistema puede ser comprendido exhaustivamente al estudiar la distribución y la interacción de sus partes constituyentes. Científicos e ingenieros típicamente miran al mundo de esta forma. Las investigaciones de accidentes también deconstruyen. Para determinar la falla mecánica, o para localizar las partes dañadas, los investigadores de accidentes hablan de "ingeniería inversa". Recuperan partes de los restos y las reconstruyen nuevamente en un todo, a menudo literalmente. Pensemos en el Boeing 747 del vuelo TWA800, que explotó en el aire luego del despegue desde el aeropuerto Kennedy de Nueva York, en 1996. Fue recuperado desde el fondo del Océano Atlántico y laboriosamente rearmado en un hangar. Con el rompecabezas tan completo como fuera posible, las partes dañadas deberían eventualmente quedar expuestas, permitiendo a los investigadores identificar la fuente de la explosión. Pero el todo sigue desafiando al sentido, sigue siendo un rompecabezas, cuando el funcionamiento (o no funcionamiento) de sus partes no alcanza a explicarlo. La parte que causó la explosión, que la inició, nunca fue identificada en verdad. Esto es lo que hace escalofriante la investigación del TWA800: a pesar de una de las reconstrucciones más caras de la historia, las partes reensambladas se negaron a dar cuenta del comportamiento del todo. En un caso así, una comprensión atemorizante e incierta estremece a los organismos de investigación y a la industria. Un todo falló sin una parte fallada. Un accidente ocurrió sin una causa; no hay causa, nada que reparar, y podría suceder de nuevo mañana, o hoy.
La segunda característica definitoria es el dualismo. Dualismo significa que existe una separación nítida entre la causa humana y la material, entre el error humano y la falla mecánica. Para ser un buen dualista usted, por supuesto, tiene que deconstruir: tiene que desconectar las contribuciones humanas de las contribuciones mecánicas. Las reglas de la Organización de Aviación Civil Internacional, que gobiernan a los investigadores de accidentes aéreos, lo determinan expresamente. Ellas fuerzan a los investigadores de accidentes a separar las contribuciones humanas de las mecánicas. Apartados específicos de los reportes de accidentes están reservados para rastrear los componentes humanos potencialmente dañados. Los investigadores exploran el historial de las 24 y 72 horas previas de los humanos que más tarde se verían involucrados en un accidente. ¿Hubo alcohol? ¿Hubo estrés? ¿Hubo fatiga? ¿Hubo falta de pericia o experiencia? ¿Hubo problemas previos en los registros de entrenamiento u operacionales de estas personas? ¿Cuántas horas de vuelo tenía verdaderamente el piloto? ¿Hubo otras distracciones o problemas?
Este requisito investigativo refleja una interpretación primitiva de los factores humanos, una tradición aeromédica en que el error humano está
reducido a la noción de "estar en forma para el servicio". Esta noción ha sido sobrepasada hace tiempo por los desarrollos en factores humanos hacia el estudio de personas normales realizando trabajos normales en lugares de trabajo normales (más que de individuos mental o fisiológicamente deficientes), pero el modelo aeromédico sobreextendido se retiene como una suerte de práctica positivista conformista, dualista y deconstructiva. En el paradigma de estar en forma para el servicio, las fuentes del error humano debían buscarse en las horas, días o años previos al accidente, cuando el componente humano estaba torcido, debilitado y listo para el quiebre. Encuentre la parte del humano que estaba perdida o deficiente, la "parte desajustada", y la parte humana acarreará la carga interpretativa del accidente. Indague en la historia reciente, encuentre las piezas deficientes y arme el rompecabezas: deconstrucción, reconstrucción y dualismo.
La tercera característica definitoria de la visión mundial técnica que aún gobierna nuestro entendimiento del éxito y la falla en sistemas complejos es el estructuralismo. El idioma que utilizamos para describir el funcionamiento interno del éxito y la falla de los sistemas es un idioma de estructuras. Hablamos de capas de defensa, de agujeros en esas capas. Identificamos los "bordes suaves" y los "bordes agudos" de las organizaciones e intentamos capturar cómo uno tiene efectos sobre el otro. Incluso la cultura de seguridad es tratada como una estructura edificada a partir de bloques. Cuánta cultura de seguridad tenga una organización depende de las rutinas y componentes que tenga para el reporte de incidentes (esto es mesurable), de hasta qué punto es justa con los operadores que cometen errores (esto es más difícil de medir, pero todavía posible), y de qué relación existe entre sus funciones de seguridad y otras estructuras institucionales. Una realidad social profundamente compleja queda, por ende, reducida a un número limitado de componentes mesurables. Por ejemplo, ¿tiene el departamento de seguridad una ruta directa a la administración superior? ¿Cómo se compara esta tasa de reportes con la de otras compañías?
Nuestro idioma de fallas también es un idioma de mecánica. Describimos trayectorias de accidentes, buscamos causas, efectos e interacciones. Buscamos fallas iniciadoras, o eventos gatilladores, y seguimos el colapso del sistema, estilo dominó, que les sigue. Esta visión mundial ve a los sistemas socio-técnicos como máquinas con partes en una distribución particular (bordes agudos vs. suaves, capas de defensa), con interacciones particulares (trayectorias, efectos dominó, gatillos, iniciadores) y una mezcla de variables independientes o intervinientes (cultura de la culpa vs. cultura de seguridad). Esta es la visión mundial heredada de Descartes y Newton, la visión mundial que ha impulsado exitosamente el desarrollo tecnológico desde la Revolución Científica, hace medio milenio. La visión mundial, y el idioma que produce, está basada en nociones particulares de las ciencias naturales, y ejerce una sutil pero muy poderosa influencia en nuestra comprensión del éxito y la falla socio-tecnológicos hoy. Así como ocurre con mucha de la ciencia y el pensamiento occidentales, perdura y dirige la orientación de los factores humanos y la seguridad de sistemas. Incluso el idioma, si se utiliza irreflexivamente, se vuelve fácilmente aprisionante. El idioma expresa, pero también determina qué podemos ver y cómo lo vemos.
El idioma constriñe cómo construimos la realidad. Si nuestras metáforas nos alientan a modelar cadenas de accidentes, entonces comenzaremos nuestra investigación buscando eventos que encajen en esa cadena. ¿Pero qué eventos deben ir adentro? ¿Dónde debemos comenzar? Como señaló Nancy Leveson (2002), la elección de cuáles eventos incluir es arbitraria, así como la extensión, el punto de partida y el nivel de detalle de la cadena de eventos. ¿Qué, preguntó ella, justifica asumir que los eventos iniciales son mutuamente excluyentes, excepto que ello simplifica las matemáticas del modelo de la falla? Estos aspectos de la tecnología y de su operación plantean preguntas sobre lo apropiado del modelo dualista, deconstructivo y estructuralista que domina los factores humanos y la seguridad de sistemas. En su lugar, podríamos buscar una visión de sistemas real, que no sólo apunte a las deficiencias estructurales detrás de los errores humanos individuales (puede hacerlo si se necesita), sino que aprecie la adaptabilidad orgánica, ecológica, de los sistemas sociotécnicos complejos.
Buscando fallas para explicar fallas.
Nuestras creencias y credos más arraigados a menudo permanecen encerrados en las preguntas más simples. La pregunta acerca de si fue error humano o falla mecánica es una de ellas. ¿Fue el accidente causado por falla mecánica o por error humano? Es una pregunta existencial para las repercusiones posteriores de un accidente. Más aún, parece una pregunta muy simple e inocente. Para muchos es algo natural de preguntar: si ha habido un accidente, tiene sentido averiguar qué falló. La pregunta, sin embargo, encierra una comprensión particular de cómo ocurren los accidentes, y arriesga confinar nuestro análisis causal a esa comprensión. Nos instala en un repertorio interpretativo fijo. Escapar de este repertorio puede ser difícil. Fija las preguntas que hacemos, define las pistas que perseguimos y los indicios que examinamos, y determina las conclusiones que eventualmente esbozaremos. ¿Qué componentes estaban dañados? ¿Fue algo mecánico o algo humano? ¿Por cuánto tiempo había estado torcido o, de otro modo, deficiente el componente? ¿Por qué se quebró eventualmente? ¿Cuáles fueron los factores latentes que conspiraron en su contra? ¿Qué defensas se habían erosionado?
Estos son los tipos de preguntas que dominan hoy en día las investigaciones en factores humanos y seguridad de sistemas. Organizamos los reportes de accidentes, y nuestro discurso sobre los accidentes, alrededor de la lucha por responderlas. Las investigaciones desentierran componentes mecánicos dañados (un tornillo dañado en el sistema de compensación del estabilizador horizontal de un MD-80 de Alaska Airlines, losetas de protección térmica perforadas en el transbordador espacial Columbia), componentes humanos de bajo desempeño (por ejemplo, quiebres de CRM, un piloto con un historial de entrenamiento accidentado) y grietas en las organizaciones responsables de la operación del sistema (por ejemplo, cadenas de decisión organizacional débiles). El buscar fallas – humanas, mecánicas u organizacionales – para explicar fallas es tan de sentido común que la mayoría de los investigadores nunca se detiene a pensar si estas son en realidad las pistas correctas que perseguir. Que la falla está causada por fallas es prerracional; ya no lo consideramos conscientemente
como una pregunta en las decisiones que tomamos acerca de dónde mirar y qué concluir. Aquí hay un ejemplo.
Un bimotor Douglas DC-9-82 aterrizó en un aeropuerto regional de las tierras altas del sur de Suecia en el verano de 1999. Chubascos de lluvia habían pasado por el área más temprano, y la pista estaba aún húmeda. Durante la aproximación a la pista, la aeronave recibió un ligero viento de cola y, después del toque a tierra, la tripulación tuvo problemas para disminuir la velocidad. A pesar de los esfuerzos de la tripulación por frenar, el jet recorrió toda la pista y terminó en un campo a unos pocos cientos de pies más allá de su final. Los 119 pasajeros y la tripulación a bordo resultaron ilesos. Luego de que la aeronave se detuviera, uno de los pilotos salió a chequear los frenos. Estaban fríos. No había habido acción de frenado alguna. ¿Cómo pudo haber ocurrido esto?
Los investigadores no encontraron fallas mecánicas en la aeronave. Los sistemas de freno estaban bien. En vez de ello, a medida que la secuencia de eventos fue rebobinada en el tiempo, los investigadores se percataron de que la tripulación no había armado los ground spoilers de la aeronave antes del aterrizaje. Los ground spoilers ayudan a un jet a frenar durante la carrera de aterrizaje, pero requieren ser armados antes de poder hacer su trabajo. Armarlos es trabajo de los pilotos: es un ítem de la lista de chequeo before-landing y parte de procedimientos en que ambos miembros de la tripulación están involucrados. En este caso, los pilotos olvidaron armar los spoilers. "Error de piloto", concluyó la investigación. O, en realidad, lo llamaron "desmoronamiento de CRM (Crew Resource Management)" (Statens Haverikommission, 2000, p. 12), una forma más moderna, más eufemística, de decir "error de piloto". Los pilotos no coordinaron lo que debían hacer; por alguna razón fallaron en comunicar la configuración requerida de su aeronave. Además, después del aterrizaje, uno de los miembros de la tripulación no había dicho "¡Spoilers!", como lo dicta el procedimiento. Esto pudo o debió haber alertado a la tripulación sobre la situación, pero no ocurrió. Los errores humanos habían sido encontrados. La investigación estaba concluida.
"Error humano" es nuestra elección por defecto cuando no encontramos fallas mecánicas. Es una elección forzada, inevitable, que calza bastante bien en una ecuación en que el error humano es el inverso de la cantidad de falla mecánica. La Ecuación 1 muestra cómo determinamos la proporción de responsabilidad causal:
Error humano = f(1 – falla mecánica)     (1)
Si no existe falla mecánica, entonces sabemos qué empezar a buscar en su lugar. En este caso, no hubo falla mecánica. La Ecuación 1 se reduce entonces a una función de 1 menos 0: la contribución humana fue 1. Fue error humano, un quiebre de CRM.
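A modo de ilustración únicamente (este bosquejo no forma parte del texto original: los nombres son hipotéticos y, por simplicidad, se toma f como la identidad), la lógica de atribución por defecto que captura la Ecuación 1 puede escribirse en unas pocas líneas de Python:

    # Bosquejo hipotético de la lógica de atribución que el texto critica:
    # si no se encuentra falla mecánica, el "error humano" absorbe por
    # defecto toda la responsabilidad causal.
    def atribucion_dualista(falla_mecanica: float) -> float:
        """Proporción atribuida a 'error humano' según la Ecuación 1 (con f = identidad)."""
        if not 0.0 <= falla_mecanica <= 1.0:
            raise ValueError("la proporción debe estar entre 0 y 1")
        return 1.0 - falla_mecanica

    # Caso del MD-80: la investigación no encontró falla mecánica (0.0),
    # de modo que la lógica fuerza la conclusión 'error humano' = 1.0.
    print(atribucion_dualista(0.0))  # 1.0

El punto del texto es, precisamente, que esta aritmética de suma cero simplifica en exceso la realidad sociotécnica.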
Los investigadores encontraron que los dos pilotos a bordo del MD-80 eran ambos capitanes, y no un capitán y un copiloto, como es usual. Fue una simple coincidencia de programación, no del todo inusual, la que los asignó a volar juntos esa aeronave esa mañana. Con dos capitanes en un mismo barco, las responsabilidades corren el riesgo de dividirse de manera inestable e incoherente. La división de responsabilidades fácilmente conduce a su abdicación. Si es función del copiloto verificar que los spoilers estén armados, y no hay copiloto, el riesgo es obvio. La tripulación estaba, en algún sentido, "desajustada" o, al menos, propensa al desmoronamiento. Así fue (hubo un "desmoronamiento de CRM"). ¿Pero qué explica esto? Estos son procesos que requieren a su vez una explicación, y pueden resultar pistas frías de todas formas. Tal vez hay una realidad mucho más profunda acechando tras las acciones particulares que preceden a un accidente como este, una realidad en que las causas humanas y mecánicas están interconectadas de forma mucho más profunda de lo que nuestros enfoques formulaicos de investigación nos permiten comprender. Para vislumbrar mejor esta realidad, primero tenemos que volvernos hacia el dualismo. Es el dualismo el que descansa en el corazón de la elección entre error humano y falla mecánica. Echemos un breve vistazo a su pasado y confrontémoslo con el encuentro empírico inestable e incierto de un caso de spoilers desarmados.
La miseria del dualismo.
La urgencia de separar la causa humana de la causa mecánica es algo que debe haber intrigado incluso a los pioneros de los factores humanos. Pensemos en el enredo con las cabinas de la II Guerra Mundial, que tenían switches de control idénticos para una diversidad de funciones. ¿Pudieron un terminal con forma de flap en el control de flaps y uno con forma de rueda en la palanca del tren evitar la confusión típica entre ambos? En ambos casos, el sentido común y la experiencia dicen "sí". Al cambiar algo en el mundo, los ingenieros en factores humanos (suponiendo que ya existían) cambiaron algo en el humano. Al intervenir el hardware con que las personas trabajaban, cambiaron el potencial de las acciones correctas e incorrectas, pero sólo el potencial. Porque incluso con palancas de control de formas funcionales, algunos pilotos, en algunos casos, todavía las confundían. Al mismo tiempo, los pilotos no siempre confundían los switches idénticos. Similarmente, no todas las tripulaciones compuestas por dos capitanes fallan al armar los spoilers antes del aterrizaje. El error humano, en otras palabras, está suspendido, inestable, en algún lugar entre las interfaces humanas y mecánicas. El error no es completamente humano ni completamente mecánico. Al mismo tiempo, las "fallas" mecánicas (proveer switches idénticos ubicados próximos uno del otro) tienen que expresarse a sí mismas en la acción humana. Así que, si ocurre una confusión entre flaps y tren, ¿cuál es entonces la causa? ¿Error humano o falla mecánica? Se necesita a ambos para tener éxito; se necesita a ambos para fallar. Dónde termina uno y comienza el otro ya no está claro. Una idea del trabajo temprano en factores humanos era que el componente mecánico y la acción humana estaban interconectados en formas que resisten la prolija separación dualista y deconstructiva preferida aún hoy por los investigadores (y sus consumidores).
DUALISMO Y REVOLUCIÓN CIENTÍFICA.
La elección entre causa humana y causa material no es un simple producto de la investigación de accidentes o de la ingeniería de factores humanos recientes. La elección se encuentra firmemente arraigada en la visión mundial Newtoniana-Cartesiana que gobierna mucho de nuestro pensamiento hoy en día, particularmente en profesiones dominadas por la tecnología como la ingeniería de factores humanos y la investigación de accidentes.
Isaac Newton y René Descartes fueron dos de las figuras cúspide de la Revolución Científica, entre 1500 y 1700 d.C., que produjo un cambio dramático en la visión mundial, así como cambios profundos en el conocimiento y en las ideas sobre cómo adquirirlo y probarlo. Descartes propuso una aguda distinción entre lo que llamó res cogitans, el dominio de la mente, y res extensa, el dominio de la materia. Aunque Descartes admitió alguna interacción entre los dos, insistió en que los fenómenos mentales y físicos no pueden entenderse haciendo referencia el uno al otro. Los problemas que ocurren en cualquiera de los dos dominios requieren enfoques completamente separados y conceptos diferentes para resolverlos. La noción de mundos mental y material separados llegó a conocerse como dualismo, y sus implicancias pueden reconocerse en mucho de lo que pensamos y hacemos hoy en día. De acuerdo con Descartes, la mente está fuera del orden físico de la materia y en ninguna forma deriva de ella. La elección entre error humano y falla mecánica es una elección dualista de ese tipo: según la lógica cartesiana, el error humano no puede derivar de las cosas materiales. Como veremos, esta lógica no se sustenta bien; de hecho, en una inspección más cercana, todo el campo de los factores humanos está construido sobre el cuestionamiento de esta afirmación.
Separar el cuerpo del alma, y subordinar el cuerpo al alma, no sólo mantuvo a Descartes fuera de problemas con la Iglesia. Su dualismo, su división entre mente y materia, agregó un importante problema filosófico que tenía el potencial de frenar el progreso científico, tecnológico y social: ¿cuál es el nexo entre mente y materia, entre el alma y el mundo material? ¿Cómo podríamos, como humanos, tomar el control de nuestro mundo físico y rehacerlo si éste estuviera aleado indivisiblemente con un alma irreductible y eterna, o incluso fuera sinónimo de ella?
Una de las mayores aspiraciones durante la Revolución Científica de los siglos XVI y XVII fue ver y comprender (y llegar a tener la capacidad de manipular) el mundo material como una máquina controlable, predecible, programable. Esto requería verlo como nada más que una máquina: sin vida, sin espíritu, sin alma, sin eternidad, sin inmaterialidad, sin impredictibilidad. La res extensa de Descartes, o mundo material, respondía justamente a esa inquietud. La res extensa fue descrita como algo que trabaja como una máquina, sigue reglas mecánicas y admite explicaciones en términos del arreglo y el movimiento de sus partes constituyentes. El progreso científico llegó a ser más fácil a causa de lo que excluyó. Lo que la Revolución Científica requería fue provisto por la escisión de Descartes. La naturaleza se volvió una máquina perfecta, gobernada por leyes matemáticas que caían cada vez más dentro del alcance del entendimiento y el control humanos, y lejos de las cosas que los seres humanos no pueden controlar.
Newton, por supuesto, es el padre de muchas de las leyes que aún gobiernan nuestro entendimiento del universo hoy en día. Su tercera ley del movimiento, por ejemplo, descansa en la base de nuestras presunciones sobre causa y efecto, y sobre las causas de los accidentes: para cada acción existe una reacción igual y opuesta. En otras palabras, para cada causa existe un efecto equivalente o, más bien, para cada efecto tiene que haber una causa equivalente.
Una ley como esa, si bien es aplicable a la liberación y transferencia de energía en sistemas mecánicos, queda mal enfocada al aplicarse a las fallas sociotécnicas, en las que las pequeñas banalidades y sutilezas del trabajo normal, hecho por gente normal en organizaciones normales, pueden degenerar
lentamente en desastres enormes, en liberaciones de energía desproporcionadamente altas. La equivalencia causa-consecuencia dictada por la tercera ley del movimiento de Newton es bastante inapropiada como modelo de los accidentes organizacionales.
Adquirir control sobre el mundo material fue de crítica importancia para las personas de hace quinientos años. La tierra fértil e inspiradora para las ideas de Descartes y Newton puede entenderse en el contraste con su tiempo. Europa estaba emergiendo de la Edad Media, tiempos de temor y fe, en que las vidas eran segadas tempranamente por guerras, enfermedades y epidemias. No deberíamos subestimar la ansiedad y la aprensión acerca de la capacidad humana de enfocar sus esfuerzos contra estas míticas profecías. Luego de la Peste, a la población de la Inglaterra natal de Newton, por ejemplo, le tomó hasta 1650 recuperar el nivel que tenía en 1300. La gente estaba a merced de fuerzas apenas controlables y comprendidas, como las enfermedades. En el milenio precedente, la piedad, la oración y la penitencia estaban entre los principales mecanismos mediante los cuales la gente podía alcanzar alguna clase de dominio sobre el mal y el desastre. El creciente conocimiento producido por la Revolución Científica lentamente comenzó a entregar una alternativa, con éxito empíricamente mensurable.
La Revolución Científica entregó nuevos medios para controlar el mundo natural. Los telescopios y microscopios le dieron a la gente nuevas formas de estudiar componentes que hasta entonces habían sido demasiado pequeños, o habían estado demasiado distantes, para ser vistos a ojo desnudo, abriendo de pronto una visión del universo completamente nueva y revelando, por primera vez, causas de fenómenos hasta entonces malamente comprendidos. La naturaleza ya no era un monolito atemorizante e inexpugnable, y las personas dejaron de estar simplemente a merced de sus caprichos victimizadores. Al estudiarla de nuevas formas, con nuevos instrumentos, la naturaleza podía ser descompuesta, partida en trozos más pequeños, medida y, a través de todo eso, comprendida mejor y eventualmente controlada. Los avances en las matemáticas (geometría, álgebra, cálculo) generaron modelos capaces de dar cuenta de, y predecir, fenómenos recientemente descubiertos en, por ejemplo, la medicina y la astronomía. Al descubrir algunos de los cimientos del universo y de la vida, y al desarrollar matemáticas que imitaban su funcionamiento, la Revolución Científica reintrodujo un sentido de predictibilidad y control que había yacido dormido durante la Edad Media. Los seres humanos podían alcanzar el dominio y la preeminencia sobre las vicisitudes e imprevisibilidades de la naturaleza. La ruta hacia tal progreso vendría de medir, desarmar (conocido hoy, según el caso, como reducir, descomponer o deconstruir) y modelar matemáticamente el mundo a nuestro alrededor, para seguidamente reconstruirlo en nuestros términos.
La mesurabilidad y el control son temas que animaron la Revolución Científica, y resuenan fuertemente hoy en día. Incluso las nociones de dualismo (los mundos material y mental están separados) y de deconstrucción (los "todos" pueden explicarse por el arreglo y la interacción de sus partes constituyentes de más bajo nivel) han sobrevivido largamente a sus iniciadores. La influencia de Descartes se juzga tan grande en parte debido a que escribió en su lengua nativa, más que en latín, lo que presumiblemente amplió el acceso y la exposición popular a sus pensamientos. La
mecanización de la naturaleza propagada por su dualismo, y los enormes avances matemáticos de Newton y otros, lideraron siglos de progreso científico, crecimiento económico y éxito de ingeniería sin precedentes. Como señalara Fritjof Capra (1982), la NASA no habría podido poner un hombre en la Luna sin René Descartes. La herencia, sin embargo, es decididamente una bendición a medias. Los factores humanos y la seguridad de sistemas están atascados con un lenguaje, con metáforas e imágenes, que enfatizan la estructura, los componentes, las mecánicas, las partes y las interacciones, la causa y el efecto. Si bien nos dan una dirección inicial para construir sistemas seguros y para descifrar qué salió mal cuando las cosas no resultan como lo planeamos, hay límites para la utilidad de este vocabulario heredado.
Regresemos a ese día de verano de 1999 y a la salida de pista del MD-80. En buena tradición Newtoniana-Cartesiana, podemos comenzar abriendo un poco más el avión, separando los diversos componentes y procedimientos para ver cómo interactúan, segundo a segundo. Inicialmente nos veremos premiados por un éxito empíricamente resonante, como de hecho lo fueron frecuentemente Descartes y Newton. Pero cuando queremos recrear el todo a partir de las partes que encontramos, una realidad más problemática salta a la vista: ya no todo encaja. La exacta, matemáticamente placentera separación entre causa humana y mecánica, entre episodios sociales y estructurales, se derrumba. El todo ya no se ve como una función lineal de la suma de sus partes. Como explicara Scott Snook (2000), los dos pasos clásicos occidentales de reducción analítica (el todo en partes) y síntesis inductiva (las partes de vuelta en el todo) parecen funcionar, pero el simple hecho de juntar las partes que encontramos no captura la rica complejidad oculta dentro y alrededor del incidente. Lo que se necesita es una integración orgánica, holística. Lo que tal vez se necesita es una nueva forma de análisis y síntesis, sensible a la situación total de la actividad sociotécnica organizada. Pero examinemos primero la historia analítica, componencial.
SPOILERS, PROCEDIMIENTOS Y SISTEMAS HIDRÁULICOS
Los spoilers son esos paneles que se levantan contra el flujo de aire en la parte superior de las alas, luego de que la aeronave ha tocado tierra. No sólo contribuyen a frenar la aeronave al obstruir la corriente de aire, sino que además hacen que el ala pierda la capacidad de crear sustentación, forzando el peso de la aeronave sobre las ruedas. La extensión de los ground spoilers acciona además el sistema de frenado automático de las ruedas. Mientras más peso llevan las ruedas, más efectivo se vuelve su frenado. Antes de aterrizar, los pilotos seleccionan el ajuste que desean en el sistema de frenado automático de ruedas (mínimo, medio o máximo), dependiendo del largo y las condiciones de la pista. Luego del aterrizaje, el sistema automático de frenado de ruedas disminuirá la velocidad de la aeronave sin que el piloto tenga que hacer nada, y sin dejar que las ruedas deslicen o pierdan tracción. Como tercer mecanismo para disminuir la velocidad, la mayoría de los aviones jet tiene reversores de empuje, que direccionan el flujo de salida de los motores contra la corriente de aire, en vez de dejarlo salir hacia atrás. En este caso, no salieron los spoilers y, como consecuencia, no se accionó el sistema de frenado automático de ruedas. Al correr por la pista, los pilotos
verificaron el ajuste del sistema de frenado automático en múltiples oportunidades, para asegurarse de que se encontraba armado, e incluso cambiaron su ajuste a máximo al ver acercarse el final de la pista. Pero nunca iba a engancharse. El único mecanismo remanente para disminuir la velocidad de la aeronave era el empuje reverso. Los reversores, sin embargo, son más efectivos a altas velocidades. Para el momento en que los pilotos se percataron de que no iban a lograr detenerse antes del final de la pista, la velocidad era ya bastante baja (terminaron saliendo al campo a 10-20 nudos) y los reversores ya no tenían un efecto inmediato. A medida que el jet salía por el borde de la pista, el capitán cerró los reversores y desplazó la aeronave algo a la derecha para evitar obstáculos.
¿Cómo se arman los spoilers? En el pedestal central, entre los dos pilotos, hay una cantidad de palancas. Algunas son para los motores y los reversores de empuje, una es para los flaps y una para los spoilers. Para armar los ground spoilers, uno de los pilotos debe elevar la palanca. La palanca sube aproximadamente una pulgada y permanece allí, armada, hasta el toque a tierra. Cuando el sistema sensa que la aeronave está en tierra (lo que hace en parte mediante switches en el tren de aterrizaje), la palanca regresa automáticamente y los spoilers salen.
Asaf Degani, quien ha estudiado extensamente este tipo de problemas procedimentales, ha descrito el episodio de los spoilers no como uno de error humano, sino como uno de sincronización (timing) (por ejemplo, Degani, Heymann & Shafto, 1999). En esta aeronave, como en muchas otras, los spoilers no deben armarse antes de que se haya seleccionado el tren de aterrizaje abajo y éste se encuentre completamente en posición. Esto tiene que ver con los switches que pueden indicar cuándo la aeronave se encuentra en tierra. Estos switches se comprimen a medida que el peso de la aeronave se asienta sobre las ruedas, pero no sólo en esas circunstancias. Existe el riesgo, en este tipo de aeronave, de que el switch del tren de nariz se comprima incluso mientras el tren de aterrizaje todavía está extendiéndose. Ello puede ocurrir debido a que el tren de nariz se despliega contra la corriente de aire de impacto. Mientras el tren de aterrizaje está saliendo y la aeronave se desliza en el aire a 180 nudos, la pura fuerza del viento puede comprimir el tren de nariz, activar el switch y, con ello, arriesgar la extensión de los ground spoilers (si estuviesen armados). No es una buena idea: la aeronave podría tener problemas para volar con los ground spoilers fuera. De ahí el requerimiento: el tren de aterrizaje necesita haber completado todo su recorrido hacia fuera y abajo. Sólo cuando ya no existe riesgo de compresión aerodinámica del switch pueden armarse los spoilers.
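Sólo como bosquejo hipotético (no es el software real de la aeronave; los nombres y la simplificación son supuestos del ejemplo), la lógica de extensión automática recién descrita, y el riesgo que motiva el orden "primero el tren, luego los spoilers", pueden esquematizarse así en Python:

    # Bosquejo hipotético: los ground spoilers salen automáticamente sólo si
    # están armados y un switch del tren "sensa" tierra; a 180 nudos, el viento
    # puede comprimir el switch del tren de nariz mientras el tren aún sale.
    from dataclasses import dataclass

    @dataclass
    class EstadoAeronave:
        spoilers_armados: bool
        switch_nariz_comprimido: bool   # por peso sobre las ruedas... o por la corriente de aire
        tren_abajo_y_asegurado: bool

    def spoilers_se_extienden(e: EstadoAeronave) -> bool:
        """Extensión automática: spoilers armados más indicación de 'en tierra' vía switch."""
        return e.spoilers_armados and e.switch_nariz_comprimido

    # Armar los spoilers mientras el tren todavía está extendiéndose en vuelo:
    en_extension = EstadoAeronave(spoilers_armados=True,
                                  switch_nariz_comprimido=True,  # comprimido por la pura fuerza del viento
                                  tren_abajo_y_asegurado=False)
    print(spoilers_se_extienden(en_extension))  # True: extensión en vuelo, justo lo que el orden procedimental evita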
Este es el orden de los procedimientos before-landing:
Gear down and locked.
Spoilers armed.
Flaps FULL.
En una aproximación típica, los pilotos seleccionan abajo la palanca del tren de aterrizaje cuando el llamado glide slope se encuentra "vivo": cuando la aeronave ha entrado en el rango de la señal electrónica que la guiará en descenso hacia la pista. Una vez que el tren de aterrizaje se encuentra abajo, deben armarse los spoilers. Entonces, una vez que la aeronave captura ese glide slope (es decir, está exactamente sobre la marcación electrónica) y comienza a descender en la aproximación a la pista, los flaps necesitan ajustarse a FULL (típicamente 40º). Los flaps son otros dispositivos que se extienden desde el ala, cambiando su forma y tamaño. Permiten a la aeronave volar más lento para el aterrizaje. Esto condiciona los procedimientos al contexto. Ahora el orden se ve así:
Gear down and locked (cuando el glide slope esté vivo).
Spoilers armed (cuando el tren esté abajo y asegurado).
Flaps FULL (cuando el glide slope esté capturado).
¿Pero cuánto toma pasar desde "glide slope vivo" a "glide slope capturado"? En una aproximación típica (dada la velocidad), alrededor de 15 segundos. En un simulador, donde tiene lugar el entrenamiento, esto no crea problema. El ciclo completo (desde que se baja la palanca del tren hasta la indicación "gear down and locked" en la cabina) toma alrededor de 10 segundos. Eso deja 5 segundos para armar los spoilers antes de que la tripulación necesite seleccionar flaps FULL (el ítem siguiente de los procedimientos). En el simulador, entonces, las cosas se ven así:
En t = 0: Gear down and locked (cuando el glide slope esté vivo).
En t + 10: Spoilers armed (cuando el tren esté abajo y asegurado).
En t + 15: Flaps FULL (cuando el glide slope esté capturado).
Pero en una aeronave real el sistema hidráulico (que, entre otras cosas, extiende el tren de aterrizaje) no es tan efectivo como en un simulador. El simulador, desde luego, sólo simula los sistemas hidráulicos de la aeronave, modelados según cómo se comporta la aeronave con cero horas de vuelo, cuando está reluciente de nueva, recién salida de fábrica. En una aeronave más vieja, puede tomar hasta medio minuto que el tren complete el ciclo y quede asegurado. Ello hace que los procedimientos se vean más o menos así:
En t = 0: Gear down and locked (cuando el glide slope esté vivo).
En t + 30: Spoilers armed (cuando el tren esté abajo y asegurado).
¡Pero! En t + 15: Flaps FULL (cuando el glide slope esté capturado).
En efecto, entonces, el ítem "flaps" de los procedimientos se presenta antes que el ítem "spoilers". Una vez que el ítem "flaps" está completo y la aeronave desciende hacia la pista, es fácil continuar con los procedimientos desde allí, con los ítems siguientes. Los spoilers nunca se arman. Su armado ha caído por las grietas de una distorsión de tiempos.
Una declaración exclusiva de error humano (o de quiebre de CRM) se vuelve más difícil de sostener frente a este trasfondo. ¿Cuánto error humano hubo, en verdad? Permanezcamos dualistas por ahora y revisitemos la Ecuación 1, aplicando ahora una definición más liberal de falla mecánica. El tren de nariz de la aeronave real, equipado con un switch de compresión, está diseñado de forma tal que se despliega contra el viento en vuelo. Esto introduce una vulnerabilidad mecánica sistemática que sólo se tolera procedimentalmente (una defensa contra la falla con agujeros conocidos): primero el tren, luego los spoilers. En otras palabras, "gear down and locked" es un prerrequisito mecánico para el armado de los spoilers, pero el ciclo completo del tren puede tomar más tiempo del contemplado en los procedimientos y en los eventos que pautan su aplicación. El sistema hidráulico de los jets viejos no presuriza tan bien: puede tomar hasta 30 segundos que el tren de aterrizaje complete su ciclo de salida. El simulador de vuelo, en contraste, realiza el mismo trabajo en unos 10 segundos, dejando una sutil pero sustantiva incongruencia: una secuencia de trabajo se introduce y se practica durante el entrenamiento, mientras que una delicadamente diferente es necesaria para las operaciones reales.
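La incongruencia de tiempos puede verse también en un pequeño bosquejo hipotético en Python (los valores de 10, 15 y 30 segundos son los del texto; los nombres son supuestos del ejemplo):

    # Orden en que el contexto realmente "llama" a cada ítem de la lista,
    # según cuánto tarde el tren en quedar abajo y asegurado.
    def orden_real_de_items(t_tren_asegurado: int, t_glide_slope_capturado: int = 15):
        items = [
            ("Gear down and locked", 0),
            ("Spoilers armed", t_tren_asegurado),        # sólo posible con el tren asegurado
            ("Flaps FULL", t_glide_slope_capturado),
        ]
        return [nombre for nombre, t in sorted(items, key=lambda par: par[1])]

    print(orden_real_de_items(t_tren_asegurado=10))  # simulador: coincide con el procedimiento escrito
    print(orden_real_de_items(t_tren_asegurado=30))  # aeronave envejecida: "Flaps FULL" se adelanta a "Spoilers armed"

Con 30 segundos, el ítem de los spoilers queda al final: exactamente la inversión que deja su armado fuera del flujo practicado en el entrenamiento.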
Más aún, esta aeronave tiene un sistema que advierte si los spoilers no están armados para el despegue, pero no tiene un sistema que advierta que los spoilers no están armados en la aproximación. Y luego está la disposición mecánica del cockpit. La palanca de spoilers armados se distingue de la de spoilers no armados sólo por una pulgada de recorrido y un pequeño cuadrado rojo en su base. Desde la posición del piloto en el asiento derecho (quien necesita confirmar su armado), ese parche rojo queda oculto detrás de las palancas de potencia mientras éstas se encuentran en la posición típica de aproximación.
Con tanta contribución mecánica alrededor (diseño del tren de aterrizaje, sistema hidráulico erosionado, diferencias entre el simulador y la aeronave real, distribución de las palancas del cockpit, falta de un sistema de advertencia de spoilers durante la aproximación, los tiempos de los procedimientos) y una contribución de planificación estocástica (dos capitanes en este vuelo), una falla mecánica de mucho mayor magnitud podría añadirse a la ecuación para rebalancear la contribución humana. Pero eso todavía es dualista. Al reensamblar las partes que encontramos entre procedimientos, tiempos, erosión mecánica y trade-offs de diseño, podemos comenzar a preguntarnos dónde terminan realmente las contribuciones mecánicas y dónde comienzan las humanas. La frontera ya no está tan clara. La carga impuesta por un viento de 180 nudos sobre la rueda de nariz se transfiere a un débil procedimiento: primero el tren, luego los spoilers. La rueda de nariz, que se despliega contra el viento y está equipada con un switch de compresión, es incapaz de acarrear esa carga y garantizar que los spoilers no se extenderán, por lo que, en su lugar, un procedimiento tiene que llevar la carga. La palanca de los spoilers está ubicada de una forma que hace difícil su verificación, y no hay instalado un sistema de advertencia de spoilers no armados. Nuevamente, el error está suspendido, inestable, entre la intención humana y el hardware de ingeniería: pertenece a ambos y a ninguno en exclusiva.
Y entonces está esto: el desgaste gradual de un sistema hidráulico no es algo que haya sido tomado en cuenta durante la certificación del jet. Un MD-80 con un sistema hidráulico anémico, que toma más de medio minuto en llevar todo el tren fuera, abajo y asegurado, violando el requerimiento de diseño original por un factor de tres, aún se considera aeronavegable. El sistema hidráulico desgastado no puede ser considerado una falla mecánica. No deja al jet en tierra. Ni tampoco lo hacen la palanca de spoilers de difícil verificación, ni la falta de un sistema de advertencia durante la aproximación. El jet fue certificado como aeronavegable con o sin todo ello. Que no haya falla mecánica, en otras palabras, no es porque no existan asuntos mecánicos. No existe falla mecánica porque los sistemas sociales compuestos por fabricantes, reguladores y operadores prospectivos – indudablemente moldeados por preocupaciones prácticas y expresados a través de un juicio de ingeniería situado en la incertidumbre sobre el desgaste futuro – decidieron que ahí no habría ninguna (al menos no relacionada con los asuntos ahora identificados en la salida de pista de un MD-80). ¿Dónde termina la falla mecánica y comienza el error humano? Si se excava lo suficientemente profundo, la pregunta se vuelve imposible de responder.