Ctrl+Z

Implementing undo/redo in a digital audio workstation

Tags: technical, jackdaw, C

Context / Contents

Jackdaw is my free digital audio workstation, currently in progress.

In this post, I describe my implementation of undo/redo, and how it encouraged me to think more rigorously about the nature of the program state. At the end, I introduce a problem that arose in the context of this design, which motivated the creation of program “endpoints”, a refactor I intend to write about in another post.

Ctrl+Z is mandatory

Ever since Prometheus stole Ctrl+Z from the Gods in the 1970s somewhere over the U.S. state of California, the ability to undo and redo changes has been a crucial component of almost any interactive software application with a complex, mutable state. In starting work on my DAW, I knew that until it was endowed with an undo/redo feature, it would remain an infant, cute but basically unuseful and in need of constant supervision.

There are some such staples of program design that arguably have not earned their keep – blighted trees planted innocently decades ago whose tainted fruits we’ve learned to tolerate. In those cases, I might eschew convention and attempt to blaze my own trail, allowing myself at least the opportunity to arrive at something I think is better. Ctrl+Z is not such a case; it is already perfect, and its fruits are delicious.

Linear undo

A consequence of the elementality of undo/redo functionality is that its workings are rarely given intellectual articulation for a casual user, much like the grammar of one’s native tongue. Therefore, in implementing the feature, I was saddled with the delightful task of discovering consciously the nature of my own unconscious expectations for how it ought to work.

I found that my expectations aligned with the simplest version of “linear undo.” Linear undo is defined by a few principles:

Undoable events are logged in a list.
Movement can occur in the backward (undo) or forward (redo) direction.

The central tradeoff of simple linear undo becomes clear when you consider the ramifications of making a new change after some events have been undone, but never redone.

Here, seven events were done, and three have been undone. What happens when a new event is logged in the current state?

It’s tempting to simply insert the new event between the previously-undone (red) events and the older events, but this causes a nightmare of state management. Let’s say that one of those red events was a change to an audio track, and the new (green) event is the deletion of that track. Redoing the change, then, would attempt a modification of a deleted track. The situation can quickly spiral out of control if many new events are done – moving the program into an unrecognizable state – before one of the now-stale red events is redone. When we undo or redo a change, we generally want the program to be in the same state it was in when the change was initially done. We will examine this more closely later.

Another option would be to create a branch at the point of insertion. An undo tree has the advantage of comprehensively and unambiguously recording the program history, but the cost is ambiguity in the meaning of “redo”. At a branch point, the user would need to select which redo path they want to follow. The program would therefore need to present the user with some easily-understood representation of the branch before the user can make a decision. This would all be more trouble than it’s worth for a simple, keyboard-bound undo/redo feature, in my opinion.

I wrote above that Ctrl+Z is perfect, but what I really mean is that simple linear undo makes the right tradeoffs. The simple solution to this particular problem is to dispose of previously-undone events when a new event is logged. Those events are then lost forever, and can no longer be redone. I’m ok with that tradeoff, at least for now.

Aside: Emacs

The classic text editor Emacs has a clever solution to the problem that avoids the central tradeoff of simple linear undo as described above, without resorting to an undo tree. From their docs:

Consecutive repetitions of C-/ (or its aliases) undo earlier and earlier changes in the current buffer. If all the recorded changes have already been undone, the undo command signals an error.

Any command other than an undo command breaks the sequence of undo commands. Starting from that moment, the entire sequence of undo commands that you have just performed are themselves placed into the undo record. Therefore, to re-apply changes you have undone, type C-f or any other command that harmlessly breaks the sequence of undoing; then type C-/ one or more times to undo some of the undo commands.

In terms of the diagrams above, we could think about taking the red events, inverting each of them such that what we would consider the redo action associated with the event becomes the undo action (though there really is no concept of “redo” here), reversing their order, and then reinserting them into the linear history before tacking the new event on at the very end.

This is an elegant solution, but for better or worse, it has never felt as intuitive to me (I am an emacs user) as the simpler one, so I don’t feel compelled to replicate it in Jackdaw.

High-level implementation

You’ve already guessed from the diagrams above that I implemented the event history as a doubly-linked list. It could have just as easily been an array, but I was feeling listy (not listless) on implementation day.

Data structure

typedef struct user_event_history {
    UserEvent *oldest;
    UserEvent *next_undo;
    int len;
} UserEventHistory;

There’s only one of these UserEventHistory structs in any given Jackdaw project. Each UserEvent has a next and previous pointer, so those function as the nodes in the linked list.

The length is stored so that we can constrain it, since we don’t want our memory usage to explode as the user’s session stretches on. (We could also recalculate it each time we modify the list; it makes little difference at the lengths we’re working with).

Most of our operations on this list will occur around the next_undo event, which is next in line to be undone. An “undo” moves that pointer backward through the list, and “redo” moves it forward. The reason to store the oldest pointer (the head of the list) is to allow next_undo to be NULL when all of the available events have been undone, while still retaining access to the list. An undo is available as long as next_undo is not NULL, and a redo is available as long as next_undo->next is not NULL.

UserEvent (nodes)

Each node in the list contains all of the information needed to execute an undo or redo operation:

typedef struct user_event {
    EventFn undo;
    EventFn redo;
    EventFn dispose;
    EventFn dispose_forward;
    
    void *obj1;
    void *obj2;
    Value undo_val1;
    Value undo_val2;
    Value redo_val1;
    Value redo_val2;
    ValType type1;
    ValType type2;
    bool free_obj1;
    bool free_obj2;

    int index;
    UserEventHistory *history;
    UserEvent *next;
    UserEvent *previous;
} UserEvent;

Value is a generic type, defined here. An EventFn is a function pointer, and so could be called a “method” of the UserEvent object.

These user events are created and pushed into history anytime a new undoable event has been completed. Most of the members of the struct are passed directly to the function that does this:

UserEvent *user_event_push(

    /* The project-level history in which to insert the event */
    UserEventHistory *history,
    
    /* These "methods" must be defined before an event is pushed */
    EventFn undo_fn, 
    EventFn redo_fn,
    EventFn dispose_fn,
    EventFn dispose_forward_fn,
    
    /* Optional data to be passed to the EventFns */
    void *obj1
    void *obj2,
    Value undo_val1,
    Value undo_val2,
    Value redo_val1,
    Value redo_val2,
    ValType type1,
    ValType type2,
    
    /* If obj1 or obj2 are dynamically allocated for undo/redo 
    purposes only, set these to 'true' to ensure 'free' is
    called on the pointers before the UserEvent is freed. */
    bool free_obj1,
    bool free_obj2
);

We can pretty quickly understand at a high level what happens when the user triggers an undo:

Get the next_undo member of our project-level UserEventHistory.
If it exists, call its undo “method”, passing its stored data to that function so that the function has everything it needs to do its job.
Reset the UserEventHistory’s next_undo pointer to next_undo->previous.

Redo varies only in the node selection, the fact that the redo method is called instead of undo, and the resetting of next_undo at the end.

So then what are the dispose and dispose_forward methods for?

In the weeds: memory management

We already know that we want to constrain the length of our event history to avoid memory usage that keeps growing as new events are logged. This means that old events can “fall off” the end of the list:

The UserEvent nodes we’re dealing with are themselves dynamically allocated, so we will need to free them when they are no longer needed or accessible. But some types of events require additional cleanup.

Let’s say we record a long audio clip, and then delete it. We want to be able to undo that deletion, so the clip data must not actually be freed. The clip is merely removed from its parent data structures, so that it no longer appears on the track, for example.

Audio clips being deleted, and the deletions being undone

The clip data remains accessible on the user event. If the deletion event falls off the end of the event history, such that the clip deletion can no longer be undone, then we really do want to free the clip data. (That’s a lot of memory!) This is what the dispose method is for. In this example, the dispose method we assign will free the clip before the UserEvent itself is ultimately freed.

Events also need to be “disposed” when they become inaccessible at the front of the list, as in the scenario described above (some events have just been undone, and then a new event is logged). But this is not the same as dispose. If our audio clip was deleted, and that action was later undone, then the audio clip currently appears on our timeline, and should not be freed. So we don’t call the dispose method when the event is lopped off the front of the list; only when it falls off the back.

However, if an event that allocates new memory becomes inaccessible at the front of the list, that memory needs to be freed, because the user effectively deleted the allocated object with an undo that is no longer redoable. This is what dispose_forward is for. If we add a track, undo that action (thereby deleting the track), and then register some new action, rendering a redo of the previous action impossible, we need to free the track.

So, broadly, events that delete objects require a dispose method, and events that create objects require a dispose_forward method. Events that do neither require neither.

Do we really need all these stored pointers and values?

Jackdaw does not exclusively use keyboard commands – you can mouse if you want to – but the vast majority of its functionality is accessible through keyboard commands. It’s tempting to think that, in order to implement undo, we need only look at the user inputs that are considered reversible, write forward and backward versions of those functions, and then push an event each time an undoable input is registered. But it isn’t quite that simple.

When the user does an action, their input alone (one byte representing the key pressed, and two bytes representing the state of the modifier keys and mouse buttons) is insufficient to complete an action. Almost all of the user-interface functions listed in userfn.h, which are bound to keyboard inputs, must read from the program state before modifying it. The user merely selects which function to call.

The concept of a “cursor” in Jackdaw (one of its unique strengths!) is a clear example of this. The cursor is defined as the position on the current timeline in the currently-selected track at the current playhead position. The track selector (and therefore, the cursor) is moved up and down with p and n (or d and f), and the playhead can be moved forward and backward by various means, including l (play) and j (rewind).

For example, when the user triggers a “cut” command, the cut occurs at the cursor location:

Cuts being done, undone, and redone

When that command is undone, and then redone, the cursor may be in a completely different location; so we cannot simply re-run the same code we used before. We need to store the location of the cursor at the time the cut first occurred.

The Deep State

I wrote earlier that when we undo or redo a change, we generally want the program to be in the same state it was in when the change was initially done. This is important, because an undo or redo executed on a modified state might be unsafe. But the cursor location is part of the program state. So is, for example, how zoomed in our view of the timeline is, or the audio data currently being queued for playback. Even the mouse position and the size of the program window are stored in the program state. We don’t want undo and redo to depend on those things, and we don’t want changes to those things to be undoable.

So what we really have is a deep state, for which the above statement is true, and a superficial state, for which the requirement doesn’t hold. Undo/redo operate on the deep state, independent of the superficial state. The user operates on the deep state with the aid of the superficial state.

I only thought about the program state this way after implementing undo/redo, and momentarily feared that I would have to fundamentally redesign the program to more rigorously delineate the deep from the superficial state; but I was pleased to find the distinction had more or less arisen organically already, and while making it explicit might be wise, it is not strictly necessary.

Problem: GUI sliders

The undo/redo interface I designed worked well for almost everything, and I was quite content using it (unlike my dreaded Textbox interface, which I must propitiate each time I require its aid). But a problem did arise.

There are two main ways a user can modify the volume of a track:

Select the track, and hold down Shift + “-” (S--) to lower the volume or Shift + “=” (S-=) to raise it.
Click and drag the volume slider on the track.

Track volume being modified first with the keyboard, then the mouse

In my original implementation of GUI sliders, each slider stored a pointer to its target value – in this case, to the float that represents the track volume. When the user clicks on or drags the slider, the new value is calculated based on the mouse position relative to the slider’s width, and the change propagates down to the target value as the slider is reset.

On the other hand, when the user adjusts the volume with the keyboard, the track volume is changed directly by that command, and then the slider’s appearance is updated based on the new value of its target.

So where does the undoable UserEvent get pushed? In the slider code, or the keyboard code? Or both?

I decided it had to be the slider code, since the keyboard code is untouched when the change is done by mouse, but some slider code is run either way. It made me immediately uneasy to consider putting undo/redo logic in GUI component code, but it seemed the best way to achieve something generic that would work with my existing implementation.

So I did that, and it worked fine, and it also meant that any change to any slider would be undoable automatically, without any new code. Kind of nice!

What I did not consider was that some sliders are transient:

This slider, which controls the cutoff frequency of a FIR filter effect, is destroyed (freed) when the tabbed window on which it appears is dismissed. Under the new model, if I made a change to the slider, dismissed the window, and then attempted to undo the change, the program would crash.

I encountered the crash in the course of working on another feature (click tracks), and considered it a distraction. So I came up with a quick fix that involved keeping these sliders around for just a little bit longer, invisibly, until all of the related user events were gone too. I did this basically by implementing reference counts on the sliders.

Any attempt to tweak a memory object’s “destroy” or “free” method such that the memory isn’t actually freed just yet is a canary in the coal mine of bad program design, at least for me. That was certainly the case here, though I no longer remember what finally prompted me to abandon ship. There’s a dead git branch memorializing the ill-fated outing for the morbidly curious.

So how do I actually solve this problem, and effectively uncouple the GUI component from undo/redo logic? And how does the solution result in Jackdaw getting its very own UDP API? That’s the subject of another blog post.

published by charlie on 2025-02-24