The Text Editor Sim


sim is an interactive multi-file text editor designed for screens with a monospaced font. It is heavily inspired by the 'vim' and 'sam' text editors, hence the name sim. The main goal of this software is portability. It depends only on the C89 standard and tries to keep system calls to a minimum, making it easy (or not even necessary) to port to another environment.

The concept of sim was born around a year ago, and since then three versions have been created to test and mature its ideas.

The implementation

In order to understand the concept of sim, we first need to understand the design choices behind it, since sim's implementation is closely related to its behavior.

Strings

Strings are the core of sim. They are dynamically sized arrays of characters that grow on demand. Every transaction made with a String checks whether its currently allocated size is enough to fit a new character or another String, which makes it a very safe data structure to use. Strings are light, fast and easy to modify, which makes them ideal for File manipulation.
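A minimal sketch of how such a String might look in C89. The struct layout and the names str_reserve(), str_insert() and str_delete() are hypothetical, chosen only for illustration; sim's actual code may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout, not sim's actual definition. */
typedef struct {
    char  *data; /* heap buffer, not NUL-terminated */
    size_t len;  /* characters in use */
    size_t cap;  /* characters allocated */
} String;

/* Make sure `extra` more characters fit. Returns 0 on success, -1 on error. */
int str_reserve(String *s, size_t extra)
{
    size_t need, cap;
    char *p;

    need = s->len + extra;
    if (need < s->len)
        return -1; /* size_t overflow */
    if (need <= s->cap)
        return 0;
    cap = s->cap ? s->cap : 16;
    while (cap < need) {
        if (cap > ((size_t)-1) / 2) {
            cap = need; /* doubling would overflow */
            break;
        }
        cap *= 2;
    }
    p = realloc(s->data, cap); /* realloc(NULL, n) behaves like malloc(n) */
    if (p == NULL)
        return -1;
    s->data = p;
    s->cap = cap;
    return 0;
}

/* Insert n characters from src at position pos (0 <= pos <= len). */
int str_insert(String *s, size_t pos, const char *src, size_t n)
{
    if (pos > s->len)
        return -1;
    if (n == 0)
        return 0;
    if (str_reserve(s, n) != 0)
        return -1;
    memmove(s->data + pos + n, s->data + pos, s->len - pos);
    memcpy(s->data + pos, src, n);
    s->len += n;
    return 0;
}

/* Delete n characters starting at position pos. */
int str_delete(String *s, size_t pos, size_t n)
{
    if (pos > s->len || n > s->len - pos)
        return -1;
    if (n == 0)
        return 0;
    memmove(s->data + pos, s->data + pos + n, s->len - pos - n);
    s->len -= n;
    return 0;
}
```

Every operation checks its bounds and the allocated size before touching memory, which is the safety property described above. A String starts out as {NULL, 0, 0} and is released with free(s.data).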

Files

sim is a multi-file text editor. You can easily swap between the 8 concurrent files using Alt+Num or Esc+Num sequences (they are, in fact, the same thing for a computer). There is a global array containing all 8 Files, and a global File pointer to the current File, which is used for most operations in sim. Each File has its own Buffer. The Buffers are defined the same way as the Files: a global array, with a global pointer to the current Buffer. Each Buffer contains a list of operations like insert(), delete() and change(), treated as transactions, much like in a database, which makes it very straightforward to implement undo/redo functionality on the File.
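The arrangement above can be sketched roughly as follows. This is an illustration under loud assumptions, not sim's code: the text lives in a fixed-size array to keep the example short (sim uses Strings), change() is left out since it can be seen as a delete followed by an insert, and the names curfile, curbuf, select_file(), record(), undo() and redo() are all hypothetical.

```c
#include <string.h>

#define NFILES  8
#define TEXTMAX 256 /* fixed size only to keep the sketch short */
#define OPMAX   32
#define OPTEXT  16

typedef enum { OP_INSERT, OP_DELETE } OpKind;

typedef struct {
    OpKind kind;
    size_t pos;          /* where the change happened */
    size_t len;          /* how many characters */
    char   text[OPTEXT]; /* the characters inserted or deleted */
} Op;

typedef struct {
    Op     ops[OPMAX];
    size_t nops; /* operations recorded */
    size_t cur;  /* ops[0..cur) are applied, ops[cur..nops) can be redone */
} Buffer;

typedef struct {
    char   text[TEXTMAX];
    size_t len;
} File;

File    files[NFILES];
Buffer  buffers[NFILES];
File   *curfile = &files[0];
Buffer *curbuf  = &buffers[0];

void select_file(int n) { curfile = &files[n]; curbuf = &buffers[n]; }

/* Apply op forwards (dir > 0) or backwards (dir < 0) to f. */
static void apply(File *f, const Op *op, int dir)
{
    int ins = (op->kind == OP_INSERT);

    if (dir < 0)
        ins = !ins; /* undoing an insert is a delete, and vice versa */
    if (ins) {
        memmove(f->text + op->pos + op->len, f->text + op->pos, f->len - op->pos);
        memcpy(f->text + op->pos, op->text, op->len);
        f->len += op->len;
    } else {
        memmove(f->text + op->pos, f->text + op->pos + op->len,
                f->len - op->pos - op->len);
        f->len -= op->len;
    }
}

/* Record a transaction on the current File and apply it. s is ignored for
 * OP_DELETE. Returns 0 on success, -1 if the request is out of range. */
int record(OpKind kind, size_t pos, const char *s, size_t n)
{
    Op *op;

    if (n == 0 || n > OPTEXT || curbuf->cur == OPMAX || pos > curfile->len)
        return -1;
    if (kind == OP_INSERT && n > TEXTMAX - curfile->len)
        return -1;
    if (kind == OP_DELETE && n > curfile->len - pos)
        return -1;
    op = &curbuf->ops[curbuf->cur];
    op->kind = kind;
    op->pos = pos;
    op->len = n;
    memcpy(op->text, kind == OP_INSERT ? s : curfile->text + pos, n);
    apply(curfile, op, 1);
    curbuf->nops = ++curbuf->cur; /* a new edit discards the redo tail */
    return 0;
}

int undo(void)
{
    if (curbuf->cur == 0)
        return -1;
    apply(curfile, &curbuf->ops[--curbuf->cur], -1);
    return 0;
}

int redo(void)
{
    if (curbuf->cur == curbuf->nops)
        return -1;
    apply(curfile, &curbuf->ops[curbuf->cur++], 1);
    return 0;
}
```

Because every change is stored with its position and its text, undo is just the same operation applied in reverse, and each File keeps its own independent history.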

Files have an Address called dot. Although it resembles the sam dot, it is used slightly differently. While moving around, both dot positions are always the same. They differ only while selecting or inserting text, when the first position stores the initial cursor location and the second stores the current cursor location.
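As a small illustration of the two dot positions, here is a sketch in C89. The field names p0 and p1 and the three helper functions are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical Address: two positions in the file. */
typedef struct {
    size_t p0; /* where the selection or insertion started */
    size_t p1; /* where the cursor is now */
} Address;

/* Plain movement: both positions follow the cursor. */
void dot_move(Address *dot, size_t pos)
{
    dot->p0 = pos;
    dot->p1 = pos;
}

/* Selecting or inserting: p0 stays put, only p1 moves. */
void dot_extend(Address *dot, size_t pos)
{
    dot->p1 = pos;
}

/* Report the covered range in ascending order, since the cursor may have
 * moved backwards from where the selection started. */
void dot_range(const Address *dot, size_t *start, size_t *end)
{
    if (dot->p0 <= dot->p1) {
        *start = dot->p0;
        *end = dot->p1;
    } else {
        *start = dot->p1;
        *end = dot->p0;
    }
}
```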

Each File also has a Frame attached to it. Frames are responsible for displaying and calculating which characters will be shown on screen. They will be covered next.

Addresses and Frames

Unlike traditional text editors, and much like sam, sim does not keep track of any lines on your file. Instead, it relies on the current position of your cursor relative to the file. This has several advantages over the common method of storing strings in an array. First, it uses much less memory than the traditional method, and also it does not impose any limits on how to perform operations on your file.

Say, for example, you wanted to issue a command that changes an expression that spans multiple lines (a structure declaration, for example). That would simply not be possible in a traditional text editor like vim or ed. Instead, you would have to make your change manually, line by line, according to your needs.

With text editors like sim and sam, that is not an issue, since newline characters are treated just like normal characters. With this method, you can perform any operation, anywhere on the file. In fact, sim does not distinguish between whitespace and normal characters at all, which makes it possible to hover your cursor above a newline character. I believe this is a very important feature, since it makes users more aware of how lines work on text files, and opens up possibilities for easier line manipulation.

But how exactly does sim achieve that? How does it keep track of the file and cursor position on the screen? This is where the concept of a Frame comes in. A Frame is, in essence, just a list of Addresses. It stores the current File dot position in one of its Address ranges, and then decides whether it is better to flush the entire Address cache or to just change the position of the Frame dot, which is not to be confused with the File dot. The Frame dot is just the index in the Address list where the File dot is contained.
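One way to picture the Frame dot lookup is the sketch below. It assumes each Address is a half-open range [p0, p1) covering one visual row, and that "flush or move" boils down to whether the File dot still falls inside a cached row. Both are assumptions for illustration; the names Frame, rows and frame_seek() are hypothetical.

```c
#include <stddef.h>

#define FRAME_ROWS 64

typedef struct {
    size_t p0, p1; /* half-open range [p0, p1) of file offsets */
} Address;

typedef struct {
    Address rows[FRAME_ROWS]; /* one range per visual row, in file order */
    size_t  n;                /* rows currently cached */
    size_t  dot;              /* index of the row that holds the File dot */
} Frame;

/* Try to place the File dot inside the cached rows. Returns 1 if only the
 * Frame dot had to move, 0 if the caller must flush and rebuild the cache. */
int frame_seek(Frame *fr, size_t pos)
{
    size_t i;

    for (i = 0; i < fr->n; i++) {
        if (pos >= fr->rows[i].p0 && pos < fr->rows[i].p1) {
            fr->dot = i;
            return 1;
        }
    }
    return 0;
}
```

Note that the Frame holds only offsets. The text itself stays in the File.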

Frames are very smart and efficient, and they are intended to call fwrite() only once per update, which is why all the relevant text has to be in the same String. They are able to do this thanks to the blind_reader() and blind_writer() functions. These functions are responsible for blindly "encoding" or "decoding" a single physical line into one or more visual lines. I say "blindly" because they don't really encode or decode anything. They just parse each character, calculate how much space each one will take, and store an Address in an Address list, which can be either a temporary Frame or the File Frame itself.
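To make the idea of "blindly" splitting a physical line into visual lines concrete, here is a sketch in C89. It is not blind_reader() itself: wrap_line() is a hypothetical name, and the width rules (tabs advance to the next multiple of 8, every other character takes one column) are assumptions for the example.

```c
#include <stddef.h>

#define TABSTOP 8

/* Scan one physical line of text[0..len) beginning at offset start, for a
 * screen that is cols columns wide. The file offset where each visual row
 * begins is stored in rows[] (at most max entries). *next receives the
 * offset of the following physical line. Returns the number of rows stored. */
size_t wrap_line(const char *text, size_t len, size_t start, size_t cols,
                 size_t *rows, size_t max, size_t *next)
{
    size_t i, col, w, n;

    n = 0;
    col = 0;
    if (n < max)
        rows[n++] = start;
    for (i = start; i < len && text[i] != '\n'; i++) {
        w = (text[i] == '\t') ? TABSTOP - col % TABSTOP : 1;
        if (col > 0 && col + w > cols) {
            /* the character does not fit, so it starts a new visual row */
            if (n < max)
                rows[n++] = i;
            col = 0;
            w = (text[i] == '\t') ? TABSTOP : 1;
        }
        col += w;
    }
    if (i < len)
        i++; /* step over the newline, which belongs to the last row */
    *next = i;
    return n;
}
```

The function never copies or interprets the text. It only measures it and records offsets, which is why the result is so cheap to keep around.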

Frames are so efficient because they store no data apart from File positions, so they have a very low memory footprint, and they are extremely fast since they don't care what exactly they are displaying, only how much space it takes on screen.

The Frame implementation on sim is around 150 lines of code, including the blind_reader() and blind_writer() functions.

The cursor

Now that you have learned about Frames, you can understand how the cursor in sim works. Contrary to most text editors, including vim and sam, sim always keeps its cursor in a fixed position. By default, it stays in the middle of the screen, but it can easily be moved to any line you wish. The reason for that is simple and straightforward. While moving around, searching for terms, or writing text, it is ideal to be able to predict where your cursor is going to be, lest you waste your time trying to find it. If your cursor is always in the same position, you will never have to worry about finding the next match or adjusting to insert text somewhere else.

While writing this document, I realized how natural this cursor placement feels, and I have not had a single issue with cursor positioning. With this, you do not need any fancy search highlighting, because it is simply not useful at this point. This was the main reason I created sim, after all. All those amazing ideas and solutions came after that. It was thanks to these "limitations" that I figured out clever ways to solve text editing problems elegantly.

In order to display text properly, sim first calculates the positions of the lines above the cursor, and then those of the cursor line and the lines below it. The lines above come first because sim has to scan the file in reverse, until it either reaches the start of the file or the cache holds enough lines for sim to feel comfortable doing a full update. While moving around, sim will usually calculate a few more lines than necessary so that the program has to update less frequently.
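The reverse scan can be sketched as below. This simplified version works on physical lines only and ignores wrapping. The names line_begin() and scan_back() are hypothetical, not taken from sim.

```c
#include <stddef.h>

/* Return the offset where the physical line containing pos begins. */
size_t line_begin(const char *text, size_t pos)
{
    while (pos > 0 && text[pos - 1] != '\n')
        pos--;
    return pos;
}

/* Walk backwards from pos, collecting line starts: starts[0] is the cursor
 * line, starts[1] the line above it, and so on. Stops after `want` entries
 * or at the start of the file. Returns the number of entries stored. */
size_t scan_back(const char *text, size_t pos, size_t *starts, size_t want)
{
    size_t n = 0;

    pos = line_begin(text, pos);
    while (n < want) {
        starts[n++] = pos;
        if (pos == 0)
            break; /* reached the start of the file */
        pos = line_begin(text, pos - 1);
    }
    return n;
}
```

With the cursor fixed in the middle of a screen of H rows, asking for roughly H/2 lines plus a few spare ones gives the top of the screen while leaving room to move before the next full update.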

Keys

The way sim implements modal editing like vim is by using the concept of Keys. Keys are easily configured by the user in the config.h file. In a Key, each character corresponds to a function, with an optional argument. In the main loop, the Key list defined in config.h is scanned for the specified key, and once it is found, the function associated with it is called, along with the optional argument. By calling a function, you can basically change the entire path of the program. You can create an entirely different new loop inside it to accommodate a new feature or new logic, or you can make a simple command that changes its behavior according to the next keypress.
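A Key table in this spirit might look like the following sketch. The Key struct, the int argument, and the functions move_cursor(), quit() and dispatch() are all assumptions made for this example; sim's config.h may be laid out differently.

```c
#include <stddef.h>

/* Hypothetical Key entry: a character, a function, an optional argument. */
typedef struct {
    int  key;
    void (*func)(int arg);
    int  arg;
} Key;

long cursor  = 0; /* toy editor state for the example */
int  running = 1;

void move_cursor(int n)
{
    cursor += n;
    if (cursor < 0)
        cursor = 0;
}

void quit(int unused)
{
    (void)unused;
    running = 0;
}

/* What a user would edit in config.h. */
const Key keys[] = {
    { 'h', move_cursor, -1 },
    { 'l', move_cursor, +1 },
    { 'q', quit,         0 }
};

/* Scan the table for c and call the matching function.
 * Returns 1 if a Key was found, 0 otherwise. */
int dispatch(int c)
{
    size_t i;

    for (i = 0; i < sizeof keys / sizeof keys[0]; i++) {
        if (keys[i].key == c) {
            keys[i].func(keys[i].arg);
            return 1;
        }
    }
    return 0;
}
```

A main loop built on this would read one character, call dispatch() on it, and repeat while running is nonzero. A function such as an insert mode can run its own inner loop and return when it is done.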

The Key table is a very powerful feature of sim, which allows its users to create a vastly different feature or program, just by creating a new function and attaching it to a character.

Regardless of its usage, the main loop will always update the frame before calling the next function, which guarantees that there will not be any graphical artifacts or misalignments when the Key function is called. If something weird happens, that's on the function, not the Frame.

This is how all the functionality of sim is implemented: searching, inserting, deleting, undoing/redoing, moving, quitting, and anything else. If you want to add something new, the Key table is probably the place to start.

Final considerations

Porting sim

sim is very system independent. However, it still uses a few environment-specific functions, and it also requires VT100 compatibility. If your operating system complies with the POSIX standard, then you're in luck: you can just compile it and give it a go. If, however, that is not the case (e.g. inferior OSes), then you'll have to create your own window/terminal querying and manipulation functions. I will not port it to any system that does not support the VT100 escape sequences or does not have POSIX compliance. You're on your own.
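For readers wondering what the VT100 requirement amounts to, here is a small sketch that builds one screen update in a single buffer. The escape sequences are the standard ones (ESC [ 2 J clears the screen, ESC [ H homes the cursor, ESC [ row ; col H positions it). The function names vt_goto() and vt_frame() are hypothetical, and this is not how sim itself composes its output.

```c
#include <stdio.h>
#include <string.h>

/* Write the "cursor position" sequence ESC [ row ; col H into out.
 * Rows and columns are 1-based. Returns the number of bytes written. */
size_t vt_goto(char *out, int row, int col)
{
    return (size_t)sprintf(out, "\033[%d;%dH", row, col);
}

/* Compose a whole update: clear the screen, home the cursor, emit the
 * visible text, then park the cursor at (row, col). The caller must make
 * out large enough: len plus about 32 bytes is plenty for the sequences.
 * Returns the total number of bytes written to out. */
size_t vt_frame(char *out, const char *text, size_t len, int row, int col)
{
    size_t n = 0;

    memcpy(out + n, "\033[2J\033[H", 7);
    n += 7;
    memcpy(out + n, text, len);
    n += len;
    n += vt_goto(out + n, row, col);
    return n;
}
```

The result can then be sent to the terminal with a single fwrite(out, 1, n, stdout) followed by fflush(stdout).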

Implementing new features

Some features that most text editors possess (e.g. syntax highlighting) have been intentionally left out of the program, since they are deemed unnecessary, contrary to sim's philosophy, or both. For these much-needed-by-some-people features, patches can be made to cover their needs. sim will have its own separate page, alongside all its patches, for easier download and patching. Anyone can submit patches, and they will be reviewed much more lightly than commits in order to encourage new users to use sim.