"The comparison of widely varying text editors has only recently evolved beyond subjective preference and name-calling."
- Nathaniel S. Borenstein, 1985
The "My editor is better than your editor" argument easily comprises the longest-running continuous argument in computer programming. One can easily dismiss most of the common arguments on the topic, since the argument-makers appear ill-informed, no definitions of terms ever get offered or agreed-upon, hidden assumptions riddle the arguments and subjective preference carries more weight than experiment. Nevertheless, editor users bring up important points on ease-of-use, editing power, and what sort of interface an editor possesses. Despite endless discussion, poorly-formed concepts like "easy", "powerful", "consistent" "intuitive" and their opposites appear in most of the arguments. No two arguers agree on what the terms mean.
In order to form more perfect arguments, I present a first cut at a bibliography of real research that seems directed toward finding the perfect editor. I did not perform an exhaustive literature search, so please inform me of any missing citations. I'm missing electronically-retrievable forms for almost all of these papers.
Three text editors were studied using the editor evaluation methodology developed by Roberts and Moran. The results are presented as an extension of the studies by Roberts and Moran, with comparisons to the editors they studied earlier. In addition, supplementary measurements were taken that suggest minor flaws in the Roberts and Moran methodology. Further problems with the methodology are discussed, with an eye toward improving the methodology in future use. Although three significant problems with the methodology are reported, the problems are interesting primarily as lessons for the design of future evaluation methodologies. The Roberts and Moran methodology remains largely useful for the purposes for which it was designed.
This paper attempts to confirm Roberts, 1979, and Roberts & Moran, 1983.
We describe a large scale, low cost project that has examined the way people develop their skill in using fundamental software tools. The study involved over two thousand users during a three-year period of use of the sam text editor. The work took place while the editor was being employed in normal day to day work - it was not a laboratory experiment.
Our main contributions are first to demonstrate very long-term, low-cost monitoring with collections of simple analysis tools. Second, we have started to develop an understanding of how usability changes in the long term. Third, studies of usability often concentrate on assessment before a system is released for widespread use, whereas ours can help inform the long term design of new tools - a different dimension of usability. In addition we have mixed snap-shot studies with descriptions of long-term, gradual change. We can track the full development of the user, even though the quality of the data is lower than that normally associated with usability studies.
Documentation on the sam editor used in the above study.
See also: Good, 1985; Whiteside, Wixon & Good, 1982; Thomas, 1998
This book apparently grew out of Cook, Kay, Ryan & Thomas; one of that report's authors wrote it. The Amazon.com review sounds very interesting.
Many different human factors techniques are available to the designer of a new computer system. This case study examines how one technique, the use of logging data, was used throughout the design of a new text editor which is measurably easy to learn and easy to use. Logging data was used in four areas: keyboard design, the initial design of the editor's command set, refinements made later in the design cycle, and the construction of a system performance benchmark.
They ended up with the EVE editor, running under the VMS operating system. All about EVE.
Despite the efforts outlined in the above paper, several EVE users found it lacking. Users can rewrite at least some portions of EVE, because its programmers wrote EVE in an extension language, much like Emacs. Arthur E. Ragosta created ADAM to "correct weaknesses in EVE." The Saskatchewan Cancer Foundation modified EVE to produce WEVE. WEVE "is an EVE editor interface that has been enhanced".
See also: Cook, Kay, Ryan & Thomas; Whiteside, Wixon & Good, 1982
Iterative design has been strongly recommended as part of a basic design philosophy for building measurably easy-to-use computer software. Iterative design was a major technique for meeting the specified usability goals for Eve, a new text editor for the VAX/VMS operating system. There was no adverse effect on the project schedule. Users' problems followed similar patterns to those encountered in earlier laboratory experiments on operating systems.
The impression that the phrase "this interface feature is intuitive" leaves is that the interface works the way the user does, that normal human "intuition" suffices to use it, that neither training nor rational thought is necessary, and that it will feel "natural." We are said to "intuit" a concept when we seem to suddenly understand it without any apparent effort or previous exposure to the idea. In common parlance, intuition has the additional flavor of a nearly supernatural ability humans possess in varying degrees. Given these connotations, it is as uncomfortable a term in formal HCI studies as it is a common one in non-technical publications and in informal conversation about interfaces.
Strictly speaking, the above reference doesn't fit in the category of "research" about text editors. It more-or-less constitutes an opinion piece on the definition of "intuitive".
In the first of two studies of "naturalness" in command names, computer-naive typists composed instructions to "someone else" for correcting a sample text. There was great variety in their task-descriptive lexicon and a lack of correspondence between both their vocabulary and their underlying conceptions of the editing operations and those of some computerized text editors. In the second study, computer-naive typists spent two hours learning minimal text editing systems that varied in several ways. Lexical naturalness (frequency of use in Study 1) made little difference in their performance. By contrast, having different, rather than the same names for operations requiring different syntax greatly reduced difficulty. It is concluded that the design of user-compatible commands involves deeper issues than are captured by the slogan "naturalness." However, there are limitations to our observations. Only initial learning of a small set of commands was at issue and generalizations to other situations will require further testing.
This experiment studies the acquisition and retention of text editor command sets. Previous research has focused on consistent, mnemonic sets versus inconsistent, non-mnemonic sets (Lee & Polson, 1988; Walker & Olson, 1988). Although results from these studies indicate that consistent, mnemonic sets are better than inconsistent, non-mnemonic sets, it is unclear whether consistency or mnemonics was the basis for better subject performance. This study sought to separate this confound. To do so, a condition was added to those used in the Walker and Olson study in which the commands were consistent but non-mnemonic.
The results indicated that the consistent, non-mnemonic condition was learned and recalled better than the inconsistent, non-mnemonic condition, but worse than the consistent, mnemonic condition. In addition, retroactive and proactive inhibition were found for those subjects who learned both the consistent, non-mnemonic and the inconsistent, non-mnemonic sets. Thus, consistency in text editor commands may not be enough; mnemonics may also be needed.
An experiment is reported in which subjects previously naive to text editing learned to use a set of editing commands. Some subjects used abbreviations from the beginning. Others began by using full command names, then switched to the (optional) use of abbreviations, either of their own devising or of our selection. We found significant differences in the number and nature of the errors produced by subjects in the different conditions. People who created their own abbreviations did most poorly, and did not appear to learn from this experience. Those who used abbreviations from the start were more likely to fall into error through misrecalling the referent names. The results suggest aspects of the underlying cognitive representations, with implications for the design of software interfaces.
The bibliography of this paper appears very relevant to discussions of command-line interfaces (shells).
We propose a simple and powerful predictive interface technique for text editing tasks. With our technique, called dynamic macro creation, when a user types a special "repeat" key after performing repetitive operations in a text editor, an editing sequence corresponding to one iteration is detected, defined as a macro, and executed at the same time. Although the technique is simple, a wide range of repetitive tasks can be performed just by typing the repeat key. When we use another special "predict" key for conventional prediction techniques in addition to the repeat key, a wider range of prediction schemes can be performed, depending on the order in which the two keys are used.
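A minimal sketch of the core idea (assuming a hypothetical repeat key and a toy encoding of the keystroke history; the published algorithm differs in its details): when the repeat key is pressed, look for the longest operation sequence that occurs twice in a row at the end of the history, and replay one copy of it.

```python
def dynamic_macro(history):
    """Sketch of the idea behind dynamic macro creation: find the longest
    sequence s such that the recent history ends with s immediately
    followed by s again, and return s as the macro to replay.  This is an
    illustration of the concept, not the authors' exact algorithm."""
    for length in range(len(history) // 2, 0, -1):
        if history[-length:] == history[-2 * length:-length]:
            return history[-length:]
    return []

# Hypothetical history: the user has twice inserted "# " at the start of
# a line and moved down; pressing the repeat key replays one iteration.
history = ["Home", "#", " ", "Down", "Home", "#", " ", "Down"]
print(dynamic_macro(history))   # -> ['Home', '#', ' ', 'Down']
```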
The purpose of the study was to identify the customization changes users typically make to their word processor. Ninety-two percent of the participants customized their software in some way. Participants who used the software most heavily also did the most customization (p < .05). Most of the customization was done to facilitate the participants' work practices. The most common changes involved providing easier access to custom or often-used functionality. Button Bars seemed to provide an easy and effective means for participants to customize access to the functionality they wanted. Few participants customized the visual appearance of the interface.
They studied the WordPerfect 6.0a word processor for Windows.
Most of the Open Source editors (Vim, vile, EMACS, Xemacs) allow, and even encourage, extensive customization of the editor's appearance. People cite ease of customization as an advantageous feature of the editors that support it. Why did the authors of this study find so few participants doing that kind of customization?
Apparently also appears in:
Watch What I Do: Programming by Demonstration

Syntax-directed editors were created with the intent of aiding in and improving the programming process. Despite their potential, they have not been successful, as evidenced by limited use. In general, they are perceived as being too difficult to use and the benefits of their use are outweighed by the difficulties.
We believe that the cognitive styles and skills of the users have been ignored in the design process. In this paper we present some of our initial results, which show that cognitive styles vary over a significant spectrum and that their consideration in the design of a syntax-directed editor will result in an intelligent tool that is right for the cognitive skills and expertise of an individual user. In turn, an approach to design that takes cognitive variation into account would support the construction of syntax-directed editors that are used successfully.
The issue we are concerned with in this paper is not that of modes in general, but rather the more specific question of how editors should handle text insertions. In this context, moded editing means that the editor user must enter a special command before text is inserted and another special command to end the text insertion and return to the command mode. Ordinary printing characters typed while in insertion mode are entered as text. The same characters entered while the editor is in the command mode are treated as commands. Modeless editing is different: ordinary characters are entered directly as text, and no special command is required to enter or stop entering text.
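To make the distinction concrete, here is a toy sketch of a moded keystroke dispatcher (an illustration only, not vi's or any real editor's implementation; the key bindings are deliberately simplified):

```python
def run_moded_editor(keys):
    """Toy illustration of moded text insertion: 'i' enters insert mode,
    the escape character returns to command mode, and 'x' in command mode
    deletes the character before the cursor (a simplification).  The point
    is that the same ordinary character is a command in one mode and text
    in the other."""
    text, mode = [], "command"
    for key in keys:
        if mode == "command":
            if key == "i":
                mode = "insert"      # special command begins text insertion
            elif key == "x" and text:
                text.pop()           # 'x' acts as a delete command here ...
        else:                        # insert mode
            if key == "\x1b":
                mode = "command"     # special command ends text insertion
            else:
                text.append(key)     # ... but is entered as text here
    return "".join(text)

# Enter insert mode, type "helloo", leave insert mode, delete the stray 'o'.
print(run_moded_editor("ihelloo\x1bx"))   # -> "hello"
```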
Experienced emacs and vi users, who use their editors to write and edit English text, performed a series of basic editing tasks and wrote a movie or book review. Our findings suggest that moded editing, as exemplified by the vi editor, may be preferable for fixed editing tasks, while modeless editing, as exemplified by the emacs editor, may have some advantages for free composing.
The vi subjects left fewer uncorrected errors in their final files than did the emacs subjects when doing fixed editing tasks from marked up hard copy. The emacs group tended to take longer to complete the editing tasks, but the time differences may have resulted from differences in typing speed.
Moded errors do not seem to be a problem for experienced vi users. The vi group made few moded errors, and those few were rapidly corrected. Furthermore, modeless editing may not totally avoid mode-related errors, since the emacs group made errors that were similar in nature to the vi moded errors.
Modeless editing may allow people to edit more freely while they are composing. The emacs subjects tended to do more of their editing in the first draft stage of composing than did the vi group. However the two groups were similar in total editing, composing time, and most writing style variables.
The ideal editor design would combine moded and modeless editing features. To do this, we suggest adding some basic cursor-movement and delete commands, named with control characters, to the text insertion mode. This design provides the editing advantages of moded editing and, at the same time, allows people who like to edit while composing to do so without having to exit and reenter insertion mode.
This study was not an evaluation of emacs or vi as total editors. Both are sophisticated editors that provide many features not explored in this study. Although we interpret our results as showing that vi was preferable for the fixed editing tasks investigated, we can easily think of cases in which emacs would be a far more desirable editor.
People usually consider emacs and vi only as "programmer's editors". Based on the editing tasks examined, the authors of this paper wanted to investigate "moded vs modeless" for a wider context.
Thanks to Eric Fischer of the University of Chicago for finding, copying and mailing me this paper. I remain deeply indebted to him.
A model of manuscript editing, implemented as a simulation program, is described in this paper. The model provides an excellent, quantitative description of learning, transfer and performance data from two experiments on text editing methods. Implications of the underlying theory for the design process are briefly discussed.
In 1984 it might have seemed appropriate to use "computer-naive" people to understand text editing. In 2002 that same usage might seem naive.
This paper describes a successful test of a quantitative model that accounts for large positive transfer effects between similar screen editors, between different line editors and from line editors to a screen editor, and between text and graphic editors. The model is tested in an experiment using two very similar full-screen text-editors differing only in the structure of their editing commands, verb-noun vs. noun-verb. Quantitative predictions for training time were derived from a production system model based on the Polson and Kieras model of text editing.
Rosson did extensive research on users of a text-editor. She found that programmers made extensive use of customization features (reassigning keys and writing macros) but secretaries did not. She also found that the amount of experience with the editor and text-editors in general was a good predictor of the amount of customization done.
Also said to appear in SIGOA Newsletter 3, 1 (June 1982), pp 29-40.
Abstract:

Keystroke statistics were collected on editing systems while people performed their normal work. Knowledge workers used an experimental editor, and secretaries used a word processor. Results show a consistent picture of free use patterns in both settings. Of the total number of keystrokes, text entry accounted for approximately 1/2, cursor movement for about 1/4, deletion for about 1/8, and all other functions for the remaining 1/8. Analysis of keystroke transitions and editing states is also presented. Implications for past research, editor design, keyboard layout, and benchmark tests are discussed.

Commentary from Good, 1985:
EPT (Editor Prototyping Tool) is a small editor with 29 commands, designed specifically for research purposes. The logging sample of over 500,000 keystrokes was collected from six members of a human factors research group over a two-month period, measuring 212 person-hours of use.
See also: Good, 1985; Cook, Kay, Ryan & Thomas
Performance and subjective reactions of 76 users of varying levels of computer experience were measured with 7 different interfaces representing command, menu and iconic interface styles. The results suggest three general conclusions:
- there are large usability differences between contemporary systems,
- there is no necessary tradeoff between ease of use and ease of learning, and
- interface style is not related to performance or preference (but careful design is).
Difficulties involving system feedback, input forms, help systems and navigation aids occurred in all styles of interface, command, menu and iconic. New interface technology did not solve old human factors problems.
Nine participants used a full screen computer text-editor (XEDIT) with an IBM 3277 terminal to edit marked-up documents at each of three cursor speeds (3.3, 4.7 and 11.0 cm/sec.). Results show that 9% of editing time was spent controlling and moving the cursor, regardless of cursor speed. The variations in cursor speed studied did not seem to act as a pacing device for the entire editing task.
Green and Payne regularized EMACS commands along a single set of organizing principles and produced dramatically better performance in a learning task.
In 1988, a two-day workshop of 15 experts was unable to produce a definition of consistency.
Jakob Nielsen edited a book with this same title; I don't know what relationship the book and the SIGCHI Bulletin possess.

Coordinating User Interfaces for Consistency. Academic Press, Boston, MA, 1989. ISBN 0-12-518400-X (hardcover)
However, ease of learning can conflict with subsequent ease of use. When this happens, priorities must be established carefully. If learning isn't possible, use will not happen. However, people buy systems and applications not to learn them, but to use them. [...] If a consistent interface supports learning but impedes skilled performance, then consistency is working against good design.

Excerpt:
No rule, consistently applied, produces good menu defaults. Enforcing a blanket consistency will damage the interface.
A major goal of the DECwindows program is to provide a consistent, state-of-the-art user interface for workstation software. This interface extends across operating systems and many different types of application programs. Within the DECwindows program we have addressed both the technical and organizational aspects of developing consistent user interfaces across applications. Traditional methods for developing user interface consistency, such as the use of an interface style guide and toolkit, were supplemented with more innovative techniques. An exhibition and catalog of DECwindows application designs helped to develop a DECwindows school of interface design. Electronic conferencing software played an important role in facilitating communication among DECwindows contributors throughout the company. Preliminary user interviews suggest that the DECwindows interface style gives a consistent, usable feel to Digital's workstation applications.

Documents how Digital Equipment Corporation (R.I.P.) developed a user interface guide and user interface style. As I recall, DECwindows constituted an X11 window manager and widget style, but I only used DECwindows briefly many years ago. An early version of this paper appeared in Jakob Nielsen's 1988 Coordinating user interfaces for consistency conference.
New York Times article describing PC keyboard layout and its history.
In the economic literature on standards, the popular real-world example of this type of market failure is the standard Qwerty typewriter keyboard and its competition with the Dvorak keyboard. This example is noted frequently in newspaper and magazine reports, seems to be generally accepted as true [...].
We show that David's version of the history of the market's rejection of Dvorak does not report the true history, and we present evidence that the continued use of Qwerty is efficient given the current understanding of keyboard design. We conclude that the example of the Dvorak keyboard is what beehives and lighthouses were for earlier market-failure fables. It is an example of market-failure that will not withstand rigorous examination of the historical record.
Liebowitz and Margolis attempt to discredit reports of the Dvorak keyboard's efficiency so they can justify a doctrinaire free-market interpretation of why we still use the QWERTY keyboard. In a 1996 Reason magazine article, they really tip their hand about the motivation for this paper.
Said to be why Xerox chose the mouse as the pointing device for the seminal "Alto" computer, the first GUI-based computer.
Abstract:
Skilled programmers, working on natural tasks, navigate large information displays with apparent ease. We present a computational cognitive model suggesting how this navigation may be achieved. We trace the model on two related episodes of behavior. In the first, the user acquires information from the display. In the second, she recalls something about the first display and scrolls back to it. The episodes are separated by time and by intervening displays, suggesting that her navigation is mediated by long-term memory, as well as working memory and the display. In the first episode, the model automatically learns to recognize what it sees on the display. In the second episode, a chain of recollections, cued initially by the new display, leads the model to imagine what it might have seen earlier. The knowledge from the first episode recognizes this image, leading the model to scroll in search of the real thing. This model is a step in developing a psychology of skilled programmers working on their own tasks.
This excerpt explains why I include this paper:
String searching was used only three times. One instance succeeded, with roughly 2 pages between start position and target. The two other instances failed, with the string not found. Both times the user then tried scrolling; both scrolling sequences also failed, one after 3 pages and the other after 6 pages. While the very limited use of methods may seem surprising, it is consistent with a finding that experienced users use small subsets of the commands available to them in an editor, ignoring even important cursor-movement commands.
This paper constitutes an archetypical example of researchers not really testing "expert" or "experienced" users. The designated expert made only limited use of string searching, and actually fell back on scrolling around in the text, apparently aimlessly. I believe that very few real experts get studied. Most of the papers on this page examine raw, completely uneducated users. The conclusions most of the authors draw seem inapplicable to real experts and sophisticated users.
Said to confirm that even expert users use only small subsets of editor commands. Might be worth comparing with findings of Cook, Kay, Ryan & Thomas.
This paper proposes the famed "Fitts' Law" often used to justify a particular menu-bar layout.
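For reference, Fitts' law predicts movement time from a target's distance and width. A minimal sketch of the commonly used Shannon formulation (the coefficients a and b are fit by regression for a particular device and user; the defaults below are made-up placeholders):

```python
import math

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time, in seconds, to acquire a target of the
    given width at the given distance, using the Shannon formulation of
    Fitts' law.  a and b are device- and user-specific constants found by
    regression; the defaults here are placeholder values for illustration."""
    index_of_difficulty = math.log2(distance / width + 1)   # bits
    return a + b * index_of_difficulty

# Wider (or effectively deeper) targets have a lower index of difficulty,
# which is the usual argument for putting menu bars at the screen edge,
# where the cursor cannot overshoot.
print(fitts_movement_time(distance=300, width=20))    # small target
print(fitts_movement_time(distance=300, width=200))   # effectively deep target
```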
Abstract:
This paper reports on an experiment that investigated factors which effect selection time from walking menus and bar or pull-down menus. The primary focus was on the use of impenetrable borders and on expanding target areas on the two menus [sic] types. The results show that both factors can be used to facilitate menu selection, with the use of borders being most beneficial. In addition, the time required to access items from a bar menu is less than that required for the best walking menu.
Lab work justifying a particular menu-bar layout.
Abstract:
Trajectory-based interactions, such as navigating through nested-menus, drawing curves, and moving in 3D worlds, are becoming common tasks in modern computer interfaces. Users' performances in these tasks cannot be successfully modeled with Fitts' law as it has been applied to pointing tasks. Therefore we explore the possible existence of robust regularities in trajectory-based tasks. We used "steering through tunnels" as our experimental paradigm to represent such tasks, and found that a simple "steering law" indeed exists. The paper presents the motivation, analysis, a series of four experiments, and the applications of the steering law.
This is a more modern reference on Fitts' Law.
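For a straight tunnel of constant width, Accot and Zhai's steering law takes a particularly simple form. A small sketch (again, the constants are placeholders; in practice they are fit from experiment, and the general law integrates over the varying width along the path):

```python
def steering_time(length, width, a=0.1, b=0.02):
    """Predicted time to steer along a straight tunnel of the given length
    and constant width.  a and b are empirical constants; the defaults are
    placeholder values for illustration only."""
    return a + b * (length / width)

# Navigating a long, narrow cascading submenu is predicted to be much
# slower than steering through a short, wide one.
print(steering_time(length=400, width=20))   # long and narrow
print(steering_time(length=100, width=40))   # short and wide
```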
Chapter 6, Cognition and Human Information Processing, has Case Study C: Text Editors and Word Processors.
True Tales of the Origin of vi.
Abstract:
This paper reports on the results of an experiment that was run in order to help determine if colour or font size was more useful for displaying code in a programming task, and if so, which was more useful. The null hypothesis of the experiment was that neither colour nor font size were of any benefit to users in programming tasks. The null hypothesis was refuted. It was determined that the colour display mechanism both lessened the time taken to perform a code optimization task, and was preferred by subjects. The use of the font size display mechanism showed no significant benefits.
Excerpt:
Despite the advent of WYSIWYG editors and graphical symbolic debuggers, an easy way to pick a fight in a group of software engineers is to express a preference for one of the two old war horses vi and emacs. Both are horrendously unusable, yet the loyalty remains and the battles rage on.
This study compares the two text editors vi and emacs using first a theoretical approach based on some possible criteria of a good user interface, and then using an experimental approach by designing a World Wide Web page as a discussion forum.
It was discovered that emacs is superior in most respects to vi for novice users who have not learned either editor yet. Although emacs is easier to learn and more powerful than vi, vi can be used by experienced users to produce the same results for editing text as emacs.
For experienced users in one of the two editors, changing editors will not provide any advantages for the user, and will only consume time.
The editors vi and emacs were compared using a simple time-based experimental method involving common text manipulations and a post-test opinion survey by questionnaire. The subjects were twelve students; six were novices and six were regular users. Significant objective performance differences were confined to the novice users; here emacs consistently outperformed vi with respect to time taken to perform the tasks and the amount of help needed. Subjectively, novices preferred emacs because of its more predictable nature. Emacs was therefore the editor of choice for the novice users tested. There appears to be no advantage for a regular user of one editor to switch to the other.
Mr Knottenbelt permitted me to hold a copy of his essay.
Sam is an interactive multi-file text editor intended for bitmap displays. A textual command language supplements the mouse-driven, cut-and-paste interface to make complex or repetitive editing tasks easy to specify. The language is characterized by the composition of regular expressions to describe the structure of the text being modified. The treatment of files as a database, with changes logged as atomic transactions, guides the implementation and makes a general `undo' mechanism straightforward.
Sam is implemented as two processes connected by a low-bandwidth stream, one process handling the display and the other the editing algorithms. Therefore it can run with the display process in a bitmap terminal and the editor on a local host, with both processes on a bitmap-equipped host, or with the display process in the terminal and the editor in a remote host. By suppressing the display process, it can even run without a bitmap terminal.
This paper contains both a tutorial in the use of sam and a description of its implementation.
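As a rough Python analogy to the composition of regular expressions that sam's command language is built on (this is an illustration of the idea only, not sam's syntax or implementation):

```python
import re

def x(text, pattern, command):
    """Loosely analogous in spirit to sam's 'x' command (loop over each
    match of a regular expression) combined with an editing command such
    as 'c' (change the matched text): apply `command` to every match of
    `pattern` and splice the results back into the text."""
    out, last = [], 0
    for m in re.finditer(pattern, text):
        out.append(text[last:m.start()])      # keep the unmatched text
        out.append(command(m.group(0)))       # edit just the matched piece
        last = m.end()
    out.append(text[last:])
    return "".join(out)

# Change every standalone "ed" to "editor".
print(x("ed is the standard ed.", r"\bed\b", lambda match: "editor"))
```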
The data structure used to maintain the sequence of characters is an important part of a text editor. This paper investigates and evaluates the range of possible data structures for text sequences. The ADT interface to the text sequence component of a text editor is examined. Six common sequence data structures (array, gap, list, line pointers, fixed size buffers and piece tables) are examined and then a general model of sequence data structures that encompasses all six structures is presented. The piece table method is explained in detail and its advantages are presented. The design space of sequence data structures is examined and several variations on the ones listed above are presented. These sequence data structures are compared experimentally and evaluated based on a number of criteria. The experimental comparison is done by implementing each data structure in an editing simulator and testing it using a synthetic load of many thousands of edits. We also report on experiments on the sensitivity of the results to variations in the parameters used to generate the synthetic editing load.
Apparently out of print; currently available on-line only.
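As a flavour of the piece table method the paper favours, here is a minimal Python sketch (insertion only; a real implementation also needs deletion, a smarter piece index, and undo support):

```python
class PieceTable:
    """Minimal sketch of a piece table: the text is described by a list of
    "pieces", each pointing into one of two buffers that are never edited
    in place (the original text and an append-only "add" buffer)."""

    def __init__(self, original=""):
        self.original = original
        self.add = ""
        # Each piece is (buffer, start, length).
        self.pieces = [("orig", 0, len(original))] if original else []

    def text(self):
        bufs = {"orig": self.original, "add": self.add}
        return "".join(bufs[b][s:s + n] for b, s, n in self.pieces)

    def insert(self, pos, s):
        new_piece = ("add", len(self.add), len(s))
        self.add += s
        offset = 0
        for i, (buf, start, length) in enumerate(self.pieces):
            if pos <= offset + length:
                split = pos - offset
                replacement = []
                if split > 0:
                    replacement.append((buf, start, split))
                replacement.append(new_piece)
                if split < length:
                    replacement.append((buf, start + split, length - split))
                self.pieces[i:i + 1] = replacement
                return
            offset += length
        self.pieces.append(new_piece)        # insertion at the very end

pt = PieceTable("hello world")
pt.insert(5, ",")
print(pt.text())                             # -> "hello, world"
```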
We describe an integrated collection of algorithms and data structures to serve as the basis for a practical incremental software development environment. A self-versioning representation provides a uniform model that embraces both natural and programming language documents in a single, consistent framework. Software artifacts in this representation provide fine-grained change reports to all tools in the environment.

We then present algorithms for the initial construction and subsequent maintenance of persistent, structured documents that support unrestricted user editing. These algorithms possess several novel aspects: they are more general than previous approaches, address issues of practical importance, including scalability and information preservation, and are optimal in both space and time.

Since deterministic parsing is too restrictive a model to describe some common programming languages, we also investigate support for multiple structural interpretations: incremental non-deterministic parsing is used to construct a compact form that efficiently encodes syntactic ambiguity. Later analyses may resolve ambiguous phrases through syntactic or semantic disambiguation. This result provides the first known method for handling C, C++, Fortran, and Cobol in an incremental framework derived from formal specifications.

Our transformation and analysis algorithms are designed to avoid spurious changes, which result in lost information and unnecessary recomputation by later stages. We provide the first non-operational definition of optimal node reuse in the context of incremental parsing, and present optimal algorithms for retaining tokens and nodes during incremental lexing and parsing. We also exploit the tight integration between versioning and incremental analysis to provide a novel *history-sensitive* approach to error handling. Our error recovery mechanism reports problems in terms of the user's own changes in a language-independent, non-correcting, automated, and fully incremental manner.

This work can be read at several levels: as a refinement and extension of previous results to address issues of scalability, end-to-end performance, generality, and description reuse; as a `cookbook' for constructing the framework of a practical incremental environment into which semantic analysis, code generation, presentation, and other services can be plugged; and as a set of separate (but interoperable) solutions to open problems in the analysis and representation of software artifacts.

Our results are promising: in addition to embracing a realistic language model, both asymptotic and empirical measurements demonstrate that we have overcome barriers in performance and scalability. Incremental methods can now be applied to commercially important languages, and may finally become the standard approach to constructing language-based tools and services.
In this report the author describes Lilac, a working interactive system for typesetting complex documents. The novelty of the system lies in that it allows both "changes in the small" and "changes in the large" to be performed efficiently. More specifically, the user works with two views of the document. One view is a WYSIWYG view, emulating what will appear on the printed page. The other view displays the structure of the document as a tree of nested procedure calls - with the actual text appearing as arguments to those procedures. An "interpretation" of that tree yields the WYSIWYG view. The user is allowed to modify either the structure view or the WYSIWYG view. The former is preferable for large structure changes to the document; the latter for ordinary small edits.
It is the job of Lilac to update one view when the other changes, and to do so quickly. Accomplishing this has required careful design and analysis on many fronts: the language in which the document is specified, the data-structures for efficient incremental reevaluation of the document tree, the algorithms for performing selection in the document hierarchy, and finally the caching scheme used to save still-useful parts of the WYSIWYG view. All these pieces come together in Lilac, in a nicely integrated design.
This FTP site carries several older articles on various text editor internals. Look for the files editech.1.Z, editech.2.Z, editech.3.Z, editech.4.Z and editech.5.Z. The files comprise a "personal view" of text editor implementation. The author used the buffer gap method, which Charles Crowley also examines in his paper.
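For comparison with the piece table sketch above, here is a minimal sketch of the buffer gap method (again an illustration only, with the usual refinements such as undo and line indexing omitted):

```python
class GapBuffer:
    """Minimal sketch of the buffer gap method: the text lives in a single
    array with an unused "gap" kept at the editing point, so insertion and
    deletion at the cursor are cheap; only moving the cursor copies
    characters across the gap."""

    def __init__(self, text="", gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)
        self.gap_end = len(text) + gap_size

    def move_gap(self, pos):
        while self.gap_start > pos:              # slide the gap toward the front
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:              # slide the gap toward the end
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, pos, s):
        self.move_gap(pos)
        for ch in s:
            if self.gap_start == self.gap_end:   # gap exhausted: grow it
                grow = max(16, len(self.buf))
                self.buf[self.gap_end:self.gap_end] = [None] * grow
                self.gap_end += grow
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count=1):
        self.move_gap(pos)
        self.gap_end = min(self.gap_end + count, len(self.buf))  # widen the gap

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])

gb = GapBuffer("hello world")
gb.insert(5, ",")
print(gb.text())                                 # -> "hello, world"
```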
$Id: perfect.editor.html,v 2.9 2000/02/29 15:52:44 bediger Exp bediger $