Medium Sized Project Ideas for 2023

The following ideas have been tailored for the 175-hour projects. However, keep in mind we will be happy to accept project ideas that students develop themselves! If you have one, contact us on the mailing list and we will give you feedback.

Profile Purr Data CPU Usage in Realtime

Terminal REPL

Speedy Keyboard Entry Holy Grail

Compatibility layer for Canvas Local Abstractions

Large-Sized Project Ideas for 2023

These are the 350-hour projects. As above, we will be happy to accept project ideas that students develop themselves! If you have one, contact us on the mailing list and we will give you feedback.

Save Us From Our Monstrously Complex Build System

Vintage Platform Audio Emulation Library

Make an Ancient Wish from the 90s Come True

Worst of All Possible Worlds Interpreter

Streamlining Purr Data GUI-Pd communication

JIT-compiled signal graph for the audio engine

Completed Projects From Previous Years

  • Library for Data-Over-Audio Communication. Ishan Kumar Kaler.
  • Autocomplete. Gabriela Bittencourt.
  • General Midi Mad Dash. Ishan Kumar Kaler.
  • Canvas-private abstractions, automated encapsulation and abstraction saving. Guillem Bartrina.
  • Web UI Cleanup. Prakhar Agarwal.
  • Purr Data Web App Backend. Zack Lee.
  • Purr Data Web App Frontend. Hugo Carvalho.
  • ASCII Art interpreter. Aayush Surana.
  • Add Double Precision Floating Point Format. Pranay Gupta.

Older Project Ideas

Core Accessibility

Purr Data Message and DSP Profiler

Interaction with Audio Plugins

Use ref-counting to handle object lifetimes

Visual Diff

Encapsulation Ergonomics

Fake News Audio Library

Profile Purr Data CPU Usage in Realtime

Goal

Give users the ability to visually and/or programmatically measure which DSP objects are using the most CPU.

Details

Purr Data has a set of built-in objects for measuring the performance of one or more objects that do not perform any DSP computation.

However, for the DSP objects, the user has no easy way to gain insight into how much CPU each DSP object is using. There are a few clever hacks that users can employ to achieve this, but they are either time-consuming or obscure.

It would be much better if there were a system-wide feature to give visual feedback to the user so they can find out which DSP objects are the most CPU-hungry.

Challenges

The initial profiling will take some time but isn't particularly challenging. Making changes to the core audio engine, however, will require some knowledge of Linux system interfaces and some of Purr Data's internals.

Languages

C for the backend, and HTML5 for the simple visualizations to show the user the results of performance measurements.

Mentor

Jonathan Wilkes

Terminal REPL

Goal

Make a little REPL interface with which the user can interact with Purr Data programs and program state.

Details

Purr Data is being used in situations where the hardware is an embedded device. While the current GUI runs on most common hardware, including the Raspberry Pi, there are situations where it would be more convenient to simply interact using a text interface (either locally or over SSH).

The user can already communicate with Purr Data's audio engine over a socket connection. So the "read" and "evaluate" part already exists. However, Purr Data does not print a response nor loop in this situation.

Additionally, the UX of sending raw messages to Purr Data's interpreter is quite lacking. The syntax for creating new environments and objects was not meant to be used directly. Objects are referenced by index number, and the diagrams themselves must be referenced using hex identifiers.

It would be very beneficial to create a REPL UI that is more user-friendly and well-documented/specified. This way Purr Data users can always interact with and create programs easily on any embedded device, even if there is no direct display. (This would also be very handy for debugging purposes.)
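
As a rough illustration of the missing "print" and "loop" steps, here is a minimal sketch. It assumes messages are semicolon-terminated, as described elsewhere on this page. The `send` and `recv` callbacks are hypothetical stand-ins for a socket connection to the audio engine. They are not part of any existing Purr Data API.

```python
from typing import Callable, Iterable


def to_message(line: str) -> str:
    """Turn one line of user input into a semicolon-terminated message."""
    text = line.strip()
    if not text:
        return ""
    return text if text.endswith(";") else text + ";"


def repl(lines: Iterable[str],
         send: Callable[[str], None],
         recv: Callable[[], str],
         show: Callable[[str], None] = print) -> int:
    """Read each line, send it, print the reply, and loop until input ends.

    `send` and `recv` are hypothetical transport callbacks (for example,
    thin wrappers around a connected socket). Returns the number of
    messages sent.
    """
    sent = 0
    for line in lines:
        message = to_message(line)
        if not message:
            continue          # ignore blank input
        send(message)
        show(recv())          # the "print" step that is missing today
        sent += 1
    return sent
```

Passing `sys.stdin` as `lines` would give an interactive session. A friendlier REPL would additionally need readable names in place of index numbers and hex identifiers, which this sketch does not attempt.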

Challenges

An initial REPL can be created with the current Purr Data API, but it won't be particularly user-friendly. To achieve that requires more work and an understanding of Purr Data's message dispatching system.

Languages

C, some shell scripting.

Mentor

Jonathan Wilkes

Speedy Keyboard Entry Holy Grail

Goal

Design and implement a system of keyboard shortcuts to create Purr Data diagrams that is as ergonomic and fast as Finale's "Speedy Entry" is for drawing music notation.

Details

While Purr Data has some handy features to automate the creation of connections between objects in a diagram, efficiently writing a Purr Data program currently requires substantial use of the mouse. It would be much faster if the user could leverage the keyboard to draw most of the diagram and relegate mouse usage to edge cases only.

Challenges

Visual programming is still really in its infancy. You'll want to study existing systems like Blender's Node Editor and other prior art.

More challenging is the fact that Purr Data users have essentially resigned themselves to the idea that writing visual programs is repetitive, finicky, and quite tiresome. To create a truly ergonomic approach requires collecting use-cases from them while at the same time probably adding features which solve problems they didn't realize they had.

Languages

There is a choice here: implement most of this in the frontend, which means HTML5 and vanilla JavaScript, or implement it in the backend, which means C.

Vintage Platform Audio Emulation Library

Goal

Create a library of objects that emulate the audio hardware of vintage platforms like the Atari 2600, NES, and others.

Details

There are a lot of resources online for emulating old hardware. Purr Data would benefit by having a library that provides a consistent interface for objects that take input into an emulation of a piece of hardware and output one or more audio signals.
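
To make the idea of a "consistent interface" concrete, here is one possible shape, sketched in Python for brevity. Everything in it is an assumption made for illustration: the `SoundChip` name, the register-write model, and the `ToySquare` channel, which does not model any real chip.

```python
from abc import ABC, abstractmethod
from typing import List


class SoundChip(ABC):
    """Hypothetical common interface: register writes in, audio samples out."""

    @abstractmethod
    def write_register(self, address: int, value: int) -> None:
        ...

    @abstractmethod
    def render(self, frames: int) -> List[float]:
        ...


class ToySquare(SoundChip):
    """Illustrative one-register square-wave channel (not a real chip model)."""

    def __init__(self) -> None:
        self.half_period = 1   # samples per half cycle
        self._phase = 0

    def write_register(self, address: int, value: int) -> None:
        if address == 0:       # register 0 sets the half period
            self.half_period = max(1, value)

    def render(self, frames: int) -> List[float]:
        out = []
        for _ in range(frames):
            high = (self._phase // self.half_period) % 2 == 0
            out.append(1.0 if high else -1.0)
            self._phase += 1
        return out
```

An object for each emulated platform could then expose the same two operations, so users would only need to learn which registers each chip provides.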

Challenges

Finding a common interface that makes it easy for users to leverage these classes while at the same time being expressive enough to allow decent control of the chip being emulated.

Languages

Pd (Purr Data is a fork of the software Pure Data-- the visual language itself is usually referred to as Pd.) Also, C, or C++ if desired.

Mentor

Matt Barber, Jonathan Wilkes

Data Over Audio Messaging

Goal

Create a library that allows two instances of Purr Data to pass data messages to each other using sound as the transmission medium.

Details

Pd messages consist mainly of space-separated numbers and symbols. Semicolons mark the end of a message.

Sometimes it would be helpful to be able to pass messages from one instance of Purr Data to another-- especially if each instance is on a different machine in the same room. This is currently done either by setting up socket listener/receiver between the two instances or by leveraging a separate message-passing system outside of Purr Data.

Since Purr Data is concerned mainly with analyzing and synthesizing sound, machines running Purr Data typically have a mic and speakers connected to a running instance. If it were possible for the user to simply create objects which send/receive messages by sending audio signals to/from each other, it would greatly simplify sending at least small amounts of data between machines.
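
One simple way to picture the encoding step is binary frequency-shift keying, where each bit of the message selects one of two tones. The sketch below is only an illustration of that idea. The two frequencies are arbitrary assumptions, and a real receiver would have to detect the tones in a noisy microphone signal rather than compare exact values.

```python
from typing import List

MARK_HZ = 2200.0    # assumed tone for a 1 bit
SPACE_HZ = 1200.0   # assumed tone for a 0 bit


def message_to_bits(message: str) -> List[int]:
    """Encode a message as bits, 8 per ASCII character, most significant first."""
    bits = []
    for byte in message.encode("ascii"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def bits_to_tones(bits: List[int]) -> List[float]:
    """Map each bit to the frequency that would be played for one symbol."""
    return [MARK_HZ if bit else SPACE_HZ for bit in bits]


def tones_to_message(tones: List[float]) -> str:
    """Inverse mapping, assuming the tones were detected without error."""
    bits = [1 if tone == MARK_HZ else 0 for tone in tones]
    chars = []
    for i in range(0, len(bits) - 7, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        chars.append(chr(value))
    return "".join(chars)
```

For example, `tones_to_message(bits_to_tones(message_to_bits("freq 440;")))` returns the original message, with the trailing semicolon marking its end.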

Challenges

Finding a decent interface for users without relying on a big set of dependencies.

Bonus Challenge

Send the data in the range of human hearing in a form that is pleasing to the ear. (There is actually some prior art on techniques to do this.)

Languages

Pd (Purr Data is a fork of the software Pure Data-- the visual language itself is usually referred to as Pd.) However, the library may also be written in C, or C++.

Mentor

Matt Barber, Jonathan Wilkes

Improve Our Monstrously Complex Build System

Goal

  • simplify the build system so that it is intelligible to humans, especially new developers. Make it possible to build both the core of Purr Data and an installer binary in less time than it currently takes.
  • improve the CI runners so they are easier to set up, maintain, and run
  • find a way for us to automate our release process

Challenges

The build system uses many recursively-called Makefiles and even downloads a binary of the GUI toolkit using a wrapper script. While Purr Data does have regression testing, we don't currently check for things like making sure all the help documentation got installed correctly, or even that every single external library ships. It's quite dangerous even to make small changes to such complex makefiles, so some testing will need to be implemented to ensure that this project is a success.
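
A post-install check of the kind mentioned above could start as small as the following sketch. It assumes a list of expected files is available from somewhere (a manifest, for instance), and it assumes help patches follow the `<object>-help.pd` naming convention. Both function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def expected_help_files(object_names: Iterable[str]) -> List[str]:
    """Assumed convention: the help patch for `foo` is named `foo-help.pd`."""
    return [f"{name}-help.pd" for name in object_names]


def missing_files(install_root: Path, expected: Iterable[str]) -> List[str]:
    """Return the expected relative paths that are absent under install_root."""
    return sorted(rel for rel in expected
                  if not (install_root / rel).is_file())
```

A CI job could fail whenever `missing_files` returns a non-empty list, which would catch a help file or external library that silently stopped shipping after a makefile change.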

Languages

make, C, familiarity with bash as well as the peculiarities of the OSX and Windows command lines

Difficulty

Moderate. The current makefile and autotools spaghetti is a complete mess, but it's possible to make improvements in parallel and to use the current build system as a reference.

Potential Mentor

Albert Graef, Jonathan Wilkes

Ab Compatibility Layer

Goal

  • make a compatibility feature to export a patch that uses [ab] to a patch that uses abstractions.

Details

A previous year's project to add [ab] for canvas-private abstractions was a success. This project aims to make it possible to export such patches for compatibility with Pd Vanilla, which does not have this feature.

Bonus Goal

Use an abstraction wrapper named [ab] that works in Pd Vanilla to provide a shim. This way the main patch could remain the same in both versions of Pd.

Difficulty

Moderate to Hard.

Languages

C, possibly a little bit of HTML5/Javascript to add the menu option and event handler.

Mentor

Matt Barber

Ancient Wish

Goal

Fulfill an ancient prophecy as written in a comment from the codebase that probably dates back to the 90s:

   /* once the DSP graph is built, we call this routine to sort it.
    This routine also deletes the graph; later we might want to leave the
    graph around, in case the user is editing the DSP network, to save having
    to recreate it all the time.  But not today.  */

Details

Hidden deep in the source code. Armed only with your trusty grep you must spelunk deep into the source code to retrieve the ancient comment. From there you must climb out, reading the crags and crevasses of the surrounding code to understand the hows and whys of granting the ancient wish.

Challenges

Not for the faint of heart. Most users have (rightly) assumed that the DSP graph cannot be rebuilt in realtime. Additionally, very few users have tried to rebuild the graph on the fly during performance-- live coding is a very small niche of the Pd community. Consequently, there has been very little improvement or thought given to this part of the code over the years.
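
The general shape of the wish can be sketched independently of the real code: sort the graph once, keep the result, and sort again only when the graph actually changes. The sketch below uses a textbook topological sort (Kahn's algorithm) and a hypothetical `CachedSorter` class. It is not a description of how the existing routine works.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (source object, destination object)


def sort_graph(nodes: Iterable[str], edges: Iterable[Edge]) -> List[str]:
    """Kahn's algorithm: every source comes before its destinations."""
    nodes = list(nodes)
    indegree = {n: 0 for n in nodes}
    downstream: Dict[str, List[str]] = {n: [] for n in nodes}
    for src, dst in edges:
        downstream[src].append(dst)
        indegree[dst] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in downstream[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order


class CachedSorter:
    """Keep the last sorted order and reuse it while the graph is unchanged."""

    def __init__(self) -> None:
        self._key = None
        self._order: List[str] = []
        self.sort_count = 0   # how many times a real sort was needed

    def order(self, nodes: Iterable[str], edges: Iterable[Edge]) -> List[str]:
        nodes, edges = list(nodes), list(edges)
        key = (tuple(nodes), frozenset(edges))
        if key != self._key:
            self._order = sort_graph(nodes, edges)
            self._key = key
            self.sort_count += 1
        return self._order
```

A more ambitious version would update the cached order incrementally when one object or connection changes, instead of sorting the whole graph again.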

Languages

ANSI C.

Potential Mentors

Matt Barber, Jonathan Wilkes, Albert Graef

Core Accessibility

Goal

Ensure that Purr Data is accessible by coupling accessibility with the core UX.

Details

Especially because Purr Data is a graphical environment, it's important to make sure the core functionality is accessible. Rather than tack on accessibility as an afterthought, Purr Data should have a UX that makes accessibility features a generally useful part of the programming environment.

For example: how does one navigate the nodes of a Purr Data diagram? There should be a way to navigate among the nodes and their connections without using the mouse. If we make sure that each element in the diagram is annotated we can tackle accessibility and keyboard navigation at the same time. Thus, a robust keyboard navigation implementation will help make it possible for screen readers to give meaningful information about each node in the graph.

Note: there may be some overlap with the REPL idea above, as the REPL could provide a sensible way for a user to traverse the diagram as an alternative to using the GUI.

Challenges

It will be necessary to study the current GUI implementation to figure out how to extend it to add keyboard navigation. It will also be necessary to study pre-existing approaches to making SVG diagrams accessible and study the current state of HTML5 tools that facilitate this.

Languages

Javascript, HTML, CSS. Some basic C knowledge may be required to send a richer set of data about each object from the core to the GUI. However, there is already an interface that can do this-- it just needs to be hooked into the GUI.

Mentors

Jonathan Wilkes

Purr Data Message and DSP Profiler

Goal

Measure the time it takes for each object in a Purr Data diagram to process its data and display the results in the diagram.

Details

Purr Data users would benefit greatly from the ability to profile their programs while they are running. This is easy to do for the program as a whole, but challenging to do per-object.

A successful implementation of this feature will give an accurate measure of the time it takes each object to process its incoming data. This needs to support all of Purr Data's platforms: Windows, OSX, and GNU/Linux.

A successful implementation will also be performant enough that the measurements themselves don't impact the realtime operation of Purr Data itself.
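
The bookkeeping side of per-object measurement can be sketched as a wrapper that records a call count and elapsed time for each object. This is only an illustration in Python. The `Profiler` class and its method names are hypothetical, and a real implementation would live in the C message dispatcher and would need a cheap, portable clock on each platform.

```python
import time
from collections import defaultdict
from typing import Callable, Dict


class Profiler:
    """Accumulate call counts and elapsed nanoseconds per object name."""

    def __init__(self, clock: Callable[[], int] = time.perf_counter_ns) -> None:
        self._clock = clock
        self.calls: Dict[str, int] = defaultdict(int)
        self.total_ns: Dict[str, int] = defaultdict(int)

    def wrap(self, name: str, method: Callable) -> Callable:
        """Return a version of `method` that records how long each call takes."""
        def timed(*args, **kwargs):
            start = self._clock()
            try:
                return method(*args, **kwargs)
            finally:
                self.total_ns[name] += self._clock() - start
                self.calls[name] += 1
        return timed

    def hottest(self) -> str:
        """Name of the object with the largest accumulated time."""
        return max(self.total_ns, key=self.total_ns.get)
```

Making the clock injectable keeps the bookkeeping testable without depending on real timings.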

Note: There may be overlap with the other profiling idea listed above, as developers on both ideas will probably be using the same tools and can therefore benefit by periodically sharing their work with each other.

Bonus goal

Figure out a way to meaningfully profile DSP objects. DSP objects typically process data at a high sample rate (44,100 Hz is common), so displaying the data in a user-friendly and meaningful way is tricky.
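
One way to make such numbers meaningful is to express each measurement as a share of the time available per block. The sketch below assumes audio is computed in fixed-size blocks and uses a block size of 64 samples purely as an example value. The function names are hypothetical.

```python
def block_budget_us(sample_rate: float, block_size: int) -> float:
    """Microseconds available to compute one block before audio drops out."""
    return block_size / sample_rate * 1_000_000


def load_percent(measured_us: float, sample_rate: float, block_size: int) -> float:
    """Share of the per-block budget consumed by a measured duration."""
    return 100.0 * measured_us / block_budget_us(sample_rate, block_size)
```

At 44,100 Hz with 64-sample blocks the budget is about 1,451 microseconds per block, so an object that takes 145 microseconds per block uses roughly 10% of it.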

Challenges

This feature touches the main artery of the message dispatching system, and the bonus goal would touch the main DSP routine. In both cases realtime scheduling deadlines must be taken into account by careful profiling.

Difficulty

Moderate to hard.

Languages

C for the profiling business logic, some basic HTML5 for displaying the results in the GUI.

Mentors

Jonathan Wilkes

Streamlining Purr Data GUI-Pd communication

Goal

Create a simplified API between the GUI and the realtime audio engine so that GUI callbacks don't end up blocking audio.

Details

The audio engine spends too much time doing work that could be handled entirely by the GUI. For example, the GUI sends raw mouseclick coordinates to the backend, which then walks a linked list of objects to see if an object was hit. Because of this, simply moving the mouse over a window with many objects displayed on it can cause audio dropouts. Instead, the GUI could just send the index and address of the actual object which was clicked.
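
The difference in work can be sketched as follows. The first function stands in for the coordinate-based search, which may test every object. The second stands in for a lookup by an index the GUI already knows. The `Box` type, the function names, and the topmost-first search order are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_by_coordinates(boxes: List[Box], px: int, py: int) -> Optional[int]:
    """Coordinate search: test boxes one by one, last drawn first."""
    for index in range(len(boxes) - 1, -1, -1):
        if boxes[index].contains(px, py):
            return index
    return None


def hit_by_index(boxes: List[Box], index: int) -> Optional[Box]:
    """GUI-supplied index: a single bounds check, no geometry tests."""
    return boxes[index] if 0 <= index < len(boxes) else None
```

With the second approach the cost per mouse event no longer grows with the number of objects in the window.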

Challenges

Every developer who has ever tried to create a sane, realtime-safe API for GUI <-> Backend interaction has either burned out or failed. This is partly due to the fact that GUI interaction and behavior isn't specified, only implemented.

More details on a previous attempt at addressing the problem can be found here.

Languages

Javascript and C.

Difficulty

Hard.

Potential Mentor

Jonathan Wilkes

Interaction with Audio Plugins

Goal

Make sure that Purr Data has a well-documented way to accept input from standard audio plugin APIs like VST, LADSPA, and LV2. Also make sure Purr Data can be used as a plugin in other environments.

Details

There are multiple audio plugin APIs that aim at seamlessly mixing and matching audio filters, synthesizers, and analysis tools in different languages and applications.

Purr Data has some libraries to interface with at least two of these standards (VST and LADSPA). There is also a library for LV2 that Purr Data can leverage. However, not all of these libraries run on all the supported platforms (OSX, Windows, Linux).

Purr Data also has all the APIs necessary to act as a plugin itself in other applications. But work must be done to ensure this works properly and that it is properly documented how to do it.

Challenges

There is a lot of pre-existing technology here, but it needs to be tested rigorously on all platforms.

Languages

C. Also, familiarity with shell scripting as well as GNU Make.

JIT-compiled Signal Graph for the Audio Engine

Goal

Leverage LLVM to make a JIT compiler for Purr Data's DSP graph.

Details

Alex Norman has shown that it is possible to build a JIT compiler for one of Pd's "workhorse" libraries-- the "expr" library. This library essentially lets users specify a mathematical expression that can operate on both vectors and on the sample level. The current library parses the tokens of the expression and requires a separate function call for each unit. This is similar to the way Pd's DSP graph itself gets executed.

The JIT compiler produces assembly that outperforms the current "expr" library. Extending that process to the entire signal graph (or at least core classes within it) would improve performance more broadly.
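
The contrast between "one function call per unit" and "compile the whole expression once" can be shown with a toy example. The sketch below is only an analogy: it compiles a small expression tree into a single Python function, where an LLVM-based approach would emit native code. The tuple format and function names are invented for illustration and are unrelated to the real "expr" syntax.

```python
import operator
from typing import Callable, Tuple, Union

Expr = Union[float, str, Tuple]  # number, the variable "x", or (op, left, right)

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def interpret(expr: Expr, x: float) -> float:
    """Tree-walking evaluation: one call per node, repeated for every input."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](interpret(left, x), interpret(right, x))
    return x if expr == "x" else float(expr)


def to_source(expr: Expr) -> str:
    """Translate the tree into the text of a single expression."""
    if isinstance(expr, tuple):
        op, left, right = expr
        if op not in OPS:
            raise ValueError(f"unsupported operator: {op}")
        return f"({to_source(left)} {op} {to_source(right)})"
    return "x" if expr == "x" else repr(float(expr))


def compile_expr(expr: Expr) -> Callable[[float], float]:
    """Build one function for the whole tree, paid for once up front."""
    return eval(f"lambda x: {to_source(expr)}", {"__builtins__": {}})
```

For `("+", ("*", "x", 2), 1)`, `interpret` walks three nodes on every call, while the function returned by `compile_expr` evaluates `((x * 2.0) + 1.0)` directly.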

Challenges

Finding a way to incrementally build up the functionality.

Since there is pre-existing work done with the JIT "expr" library, a good start would be to help test that library and improve it. Once the student gains a mental model of the process, they can begin to extend it to Purr Data's DSP graph itself.

Languages

C++. Knowledge of LLVM will also come in handy here.

Use ref-counting to handle object lifetimes

Goal

Use ref-counting to track object lifetimes in a more robust manner.

Details

Currently there are many places in the message dispatching system where an object may disappear before a message bound for it gets dispatched. This, along with dynamic patching and arbitrary patch loading, can easily cause Purr Data to crash.

Additionally, this limits the possibility for reporting errors from within a given call stack. Currently we have no way to know if any of the objects participating in a given call still exist at the time of an error. This means we're limited to checking after the fact for any signs of mutation at error time, and then simply refusing to trace in that case. That deprives us of valuable diagnostic data in the most complex error cases.
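
The core idea can be sketched in a few lines: whoever dispatches a message holds a reference to the target for the duration of the call, so a deletion requested mid-call only takes effect once the call has finished. The sketch is in Python for brevity and all names are hypothetical. The real work would be in C, where the final release would free the object's memory.

```python
from typing import Callable, Optional


class RefCounted:
    """An object that is destroyed only when its last reference is released."""

    def __init__(self, name: str,
                 on_free: Optional[Callable[[str], None]] = None) -> None:
        self.name = name
        self.refs = 1          # the owner (e.g. the canvas) holds one reference
        self.freed = False
        self._on_free = on_free

    def retain(self) -> "RefCounted":
        if self.freed:
            raise RuntimeError(f"{self.name} was already freed")
        self.refs += 1
        return self

    def release(self) -> None:
        if self.freed:
            raise RuntimeError(f"{self.name} was already freed")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True
            if self._on_free:
                self._on_free(self.name)


def dispatch(target: RefCounted, handler: Callable[[RefCounted], None]) -> None:
    """Hold a reference during the call so the target outlives it."""
    target.retain()
    try:
        handler(target)
    finally:
        target.release()
```

If the handler causes the owner to release the object, the object stays valid until `dispatch` returns, which is also the point at which an error trace could still safely inspect it.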

Challenges

There aren't currently any tests for the cases where attempting to dereference a pointer to a freed object causes a crash. Also, there are two "stacks"-- one is the messaging system proper. The other is the loading mechanism used to open a file or abstraction.

Additionally, Purr Data has a binding system which can dynamically bind or unbind symbols even within a single call stack.

Languages

C.

Visual Diff

Goal

Make a tool to visually show the changes in a Purr Data patch, possibly with animation.

Details

Pd has a source file format that is difficult to read. Furthermore, changes in z-order in the GUI can make changes in the source nearly impossible to convey. This makes it difficult to glean anything of substance by viewing a diff of a Pd file in git.
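
As a starting point, a structural comparison can ignore the parts of the file that a line-based diff trips over. The sketch below assumes the usual record syntax, in which each record ends with a semicolon and an object record looks like `#X obj <x> <y> <class> <args...>`. It compares only class names and arguments, so moving an object does not register as a change. It ignores connections and escaped semicolons, and the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def object_records(patch_text: str) -> List[str]:
    """Class name and arguments of each `#X obj` record, without coordinates."""
    records = []
    for raw in patch_text.split(";"):
        atoms = raw.split()
        if len(atoms) >= 5 and atoms[0] == "#X" and atoms[1] == "obj":
            records.append(" ".join(atoms[4:]))
    return records


def diff_objects(old: str, new: str) -> Tuple[List[str], List[str]]:
    """Return (removed, added) objects, treating each patch as a multiset."""
    before = Counter(object_records(old))
    after = Counter(object_records(new))
    removed = sorted((before - after).elements())
    added = sorted((after - before).elements())
    return removed, added
```

A visual tool could use such a result to highlight added and removed boxes on the canvas instead of showing changed text lines.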

Challenges

This is a general problem with visual programming languages. There may be some prior art with some languages that attempt to solve the problem. But unlike Pd, those languages were probably built from the ground up to solve that problem, whereas Pd's file format doesn't lend itself to solving that problem.

Languages

C, possibly HTML5.

Encapsulation Ergonomics

Goal

Improve Purr Data's techniques for encapsulation in the language.

Details

Pd currently has two basic techniques for organizing diagrams. One is simply to copy part of the diagram and paste it into a subpatch. That part of the diagram becomes hidden inside a new object that appears on the parent window.

The other is to use so-called "abstractions", which are other Pd files that are instantiated as an object on the parent.

Both cases present problems for developers. For example, the program should be able to take a selection of objects and immediately put it into a subpatch with the relevant inlets and outlets created inside the subpatch. Instead, the user must manually cut, paste, and redraw the connections in the editor.
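
The bookkeeping needed for such a feature is mostly a matter of classifying connections relative to the selection. The sketch below assumes each connection is a tuple of source object, outlet, destination object, and inlet. The function name is hypothetical.

```python
from typing import List, Set, Tuple

Connection = Tuple[int, int, int, int]  # (source, outlet, destination, inlet)


def boundary(connections: List[Connection], selected: Set[int]):
    """Split connections into those entering, leaving, and inside the selection."""
    incoming, outgoing, internal = [], [], []
    for conn in connections:
        src, _, dst, _ = conn
        if src in selected and dst in selected:
            internal.append(conn)    # moves into the subpatch unchanged
        elif dst in selected:
            incoming.append(conn)    # needs an inlet on the new subpatch
        elif src in selected:
            outgoing.append(conn)    # needs an outlet on the new subpatch
    return incoming, outgoing, internal
```

Each incoming connection would be rerouted through a new inlet and each outgoing one through a new outlet, which is the redrawing the user currently does by hand.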

Second, users who want to encapsulate and reuse code must first save a patch to the filesystem in the current Pd path, and then create a new object with the same name as that file (minus the ".pd" extension). This makes proper encapsulation more difficult than simply cutting/pasting subpatches, leading to quick-and-dirty copy/pasta where more robust code reuse through abstractions would have been preferred.

Challenges

Much of Pd's current code assumes that abstractions must be loaded from the file system instead of from some template that already exists in the current patch file.

Additionally, a new class must be added to define a patch template that fills the role that ".pd" files currently do for abstractions.

Fake News Audio Library

Goal

Create a set of DSP filters which take common misconceptions about digital audio and one or more inputs to generate an output that makes those misconceptions true.

Details

There is a plethora of misconceptions about digital audio running back decades. They range from the wildly subjective (e.g., compact discs don't sound as "warm" as vinyl records) to the technical (e.g., as the frequency in a digital audio signal approaches Nyquist the accuracy of that signal degrades).

Once a workable list of such misconceptions is enumerated and cross-referenced, a library should be created for each to take an arbitrary signal input and generate an output in line with that particular misconception. For example, if the misconception is "digital audio degrades higher frequencies", the higher frequencies in the input should be degraded in the generated output signal.
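
For the "degrades higher frequencies" example, the simplest possible stand-in is a one-pole low-pass filter, sketched below in Python. The function name and the choice of filter are assumptions. A convincing version of this misconception would probably need something subtler than a plain low-pass.

```python
import math
from typing import List


def degrade_highs(signal: List[float], cutoff_hz: float,
                  sample_rate: float) -> List[float]:
    """One-pole low-pass filter: attenuates content above cutoff_hz."""
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for sample in signal:
        state += coeff * (sample - state)   # move part of the way to the input
        out.append(state)
    return out
```

With a 1,000 Hz cutoff at 44,100 Hz, a constant input passes through almost unchanged, while a signal alternating at the Nyquist frequency is reduced to under a tenth of its amplitude.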

Challenges

The "misconceived" output for each library must be predictable so that a potential user of the library is able to reason about the features of that library. It must also be sophisticated enough that one who believes the misconception doesn't reject the filter out-of-hand as an unfair exaggeration of their erroneous belief.

Worst of All Possible Worlds Interpreter

Goal

Write a small scripting language for Purr Data where the time it takes to compute the output is always the worst case time.

Challenges

eBPF rules apply here-- the user can write loops but cannot loop indefinitely. The user may use conditionals. No unsafe wizardry allowed.

Bonus Challenge

Allow the user to specify asynchronous behavior with a single float or int value.

E.g., default of 1 means that input will trigger the entire script to run and trigger the output.

0.5 means input will trigger the script to run half its operations and return control. A subsequent input will run the second half of the operations and trigger the output.

0.25 means input will trigger a quarter of the operations to run.

Etc.
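
The fractional behavior described above can be sketched as a script object that runs a fixed share of its operations per trigger and emits output only when the last operation has run. The sketch treats a script as a flat list of operations, which sidesteps loops and conditionals entirely. The `SlicedScript` name and its interface are hypothetical.

```python
import math
from typing import Callable, List, Optional


class SlicedScript:
    """Run a fixed list of operations, a fixed fraction of them per trigger."""

    def __init__(self, operations: List[Callable[[float], float]],
                 fraction: float = 1.0) -> None:
        if not 0.0 < fraction <= 1.0:
            raise ValueError("fraction must be in (0, 1]")
        self._ops = operations
        self._per_trigger = max(1, math.ceil(len(operations) * fraction))
        self._position = 0
        self._value = 0.0

    def trigger(self, value: float) -> Optional[float]:
        """Advance by one slice; return the output only when the script completes."""
        if self._position == 0:
            self._value = value       # a new run starts with this input
        end = min(self._position + self._per_trigger, len(self._ops))
        for op in self._ops[self._position:end]:
            self._value = op(self._value)
        self._position = end
        if self._position < len(self._ops):
            return None               # control returns with no output yet
        self._position = 0
        return self._value
```

With four operations, a fraction of 1 produces output on every trigger, 0.5 on every second trigger, and 0.25 on every fourth. In this sketch, inputs that arrive in the middle of a run only advance the script and their values are otherwise ignored, which is one of several design choices a real implementation would need to settle.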