* [Purr Data as a Web App](#inject-purr-data-directly-into-the-web)
* [Profile and Optimize Purr Data for Realtime Safety](#profile-and-optimize-purr-data-for-realtime-safety)
* [Terminal REPL](#terminal-repl)
* [Core Accessibility](#core-accessibility)
* [Purr Data Message and DSP Profiler](#purr-data-message-and-dsp-profiler)
* [Streamlining Purr Data GUI-Pd communication](#streamlining-purr-data-gui-pd-communication)
* [Vintage Platform Audio Emulation Library](#vintage-platform-audio-emulation-library)
* [Library for Data-Over-Audio Communication](#data-over-audio-messaging)
* [Interaction with Audio Plugins](#interaction-with-audio-plugins)
* [JIT-compiled signal graph for the audio engine](#jit-compiled-signal-graph-for-the-audio-engine)
* [Use ref-counting to handle object lifetimes](#use-ref-counting-to-handle-object-lifetimes)
* [Visual Diff](#visual-diff)
* [Encapsulation Ergonomics](#encapsulation-ergonomics)
* [Fake News Audio Library](#fake-news-audio-library)
* [Worst of All Possible Worlds Interpreter](#worst-of-all-possible-worlds-interpreter)
* [Improve Our Monstrously Complex Build System](#improve-our-monstrously-complex-build-system)

Inject Purr Data Directly into the Web
--------------------------------------

### Goal

Add a WebAssembly target and pure HTML5 GUI framework for running Purr Data as a web app.

### Benefits

Installing Purr Data is often the biggest hurdle for newcomers. Running it in a web browser simplifies this process and allows Purr Data to be run on any device, with any architecture, as long as it has a web browser installed.

### Details

For the core, we want to produce a WebAssembly binary that can load a patch and produce audio using the Web Audio API. If possible, we want to also leverage Chromium's newer callback-based interface to compute realtime audio as efficiently as possible.

For the GUI, we need a framework that can display and edit a Pd patch within a container in a web page.
It should be possible to run the core WebAssembly binary with or without the GUI displayed. The challenge with the GUI is that Purr Data currently uses multiple toplevel windows to display a patch and its visible subcanvases, and properties windows for certain widgets. The framework needs to allow viewing all of these in a container within a *single* browser window.

### Challenges

There has been some prior art for building a WebAssembly binary for the core audio engine. However, the difficulty will lie in building the GUI framework that has a workable UX within a single browser window.

### Languages

C, HTML5, and Emscripten

Profile and Optimize Purr Data for Realtime Safety
--------------------------------------------------

### Goal

Find the "pain points" in Purr Data's core message dispatcher and audio engine, and optimize the code where possible to improve the realtime scheduling of DSP.

### Details

The core DSP algorithms of Purr Data tend to avoid system calls, unnecessary branching, and other calls which would make the performance of the audio process unreliable. However, there are many areas of Purr Data which are not optimized for realtime scheduling. The first order of business would be to profile Purr Data in several areas to see where problems may be.
Some likely culprits are the following:

* parse time for incoming messages from GUI to the audio/messaging engine
* overhead of sending messages to the GUI, especially for visual arrays and drawing instructions for data structures
* overhead of sending streams of "motion" messages from GUI to Pd to track mouse position
* overhead of walking the linked list of objects in order to find the object under the mouse/mouseclick
* overhead of tracking mouse position over a visual array
* overhead of calculating bboxes in the audio process as opposed to GUI
* overhead of calculating text positioning in audio process as opposed to GUI
* overhead of counting UTF-8 code points in audio process to calculate line breaks
* overhead of handling incoming/outgoing socket traffic in the same thread as the audio scheduler
* probably other areas

Once we get a sense of the pain points, we can tackle the problem of how to optimize in a maintainable manner. Some references for this:

* [Giulio Moro's work](https://github.com/giuliomoro/purrdata/commit/9dc3223ece79be5f60a6a629450b52a79b9e050c) on using a separate thread for the socket connections from GUI to the core audio engine (plus all the other socket connections like netsend, netreceive, etc.)
* Giulio Moro's work on a threaded microsleep for the event loop.
* [Giulio Moro's work](https://github.com/giuliomoro/purrdata/commits/simpler-motion) to simplify GUI communication by handling more of the mouse motion/click logic in the GUI. This results in fewer messages from GUI to audio engine, but still requires a linked-list walk in the audio engine to find the relevant object.
* Matt Barber's link to a new cosine wave generator algorithm that may be more performant than the current implementation. (Not so important for current performance, but this may become more relevant once we switch to double-precision for block samples.)
* Possibility to vectorize DSP algos using SIMD.
Also, cruder experiments such as hand-unrolling one or two classes for N=64 (i.e., the most common block size) and measuring the performance impact (if any).

Note: There may be some overlap with the other profiling idea listed below. Developers for both ideas may therefore benefit by periodically sharing their work with each other.

### Challenges

The initial profiling will take some time but isn't particularly challenging. Making changes to the core audio engine, however, will require some knowledge of Linux system interfaces and some of Purr Data's internals. Properly assessing and testing any threading techniques in C is also fraught with peril and will require extreme care in order to keep the code maintainable and avoid insidious bugs.

### Languages

Basic to advanced shell scripting, C, plus familiarity with profiling tools like gprof and others.

Terminal REPL
-------------

### Goal

Make a little REPL interface with which the user can interact with Purr Data programs and program state.

### Details

Purr Data is being used in situations where the hardware is an embedded device. While the current GUI runs on most common hardware including the Raspberry Pi, there are situations where it would be more convenient to simply interact using a text interface (locally or over ssh).

The user can already communicate with Purr Data's audio engine over a socket connection. So the "read" and "evaluate" parts already exist. However, Purr Data neither prints a response nor loops in this situation.

Additionally, the UX of sending raw messages to Purr Data's interpreter is quite lacking. The syntax for creating new environments and objects was not meant to be used directly. Objects are referenced by index number, and the diagrams themselves must be referenced using hex identifiers.

It would be very beneficial to create a REPL UI that is more user-friendly and well-documented/specified.
This way Purr Data users can always interact with and create programs easily on any embedded device, even if there is no direct display. (This would also be very handy for debugging purposes.)

### Challenges

An initial REPL can be created with the current Purr Data API, but it won't be particularly user-friendly. Making it user-friendly requires more work and an understanding of Purr Data's message dispatching system.

### Languages

C, some shell scripting.

Core Accessibility
------------------

### Goal

Ensure that Purr Data is accessible by coupling accessibility with the core UX.

### Details

Especially because Purr Data is a graphical environment, it's important to make sure the core functionality is accessible. Rather than tack on accessibility as an afterthought, Purr Data should have a UX that makes accessibility features a generally useful part of the programming environment.

For example: how does one navigate the nodes of a Purr Data diagram? There should be a way to navigate among the nodes and their connections without using the mouse. If we make sure that each element in the diagram is annotated, we can tackle accessibility and keyboard navigation at the same time. Thus, a robust keyboard navigation implementation will help make it possible for screen readers to give meaningful information about each node in the graph.

Note: there may be some overlap with the REPL idea above, as the REPL could provide a sensible way for a user to traverse the diagram as an alternative to using the GUI.

### Challenges

It will be necessary to study the current GUI implementation to figure out how to extend it to add keyboard navigation. It will also be necessary to study pre-existing approaches to making SVG diagrams accessible and study the current state of HTML5 tools that facilitate this.

### Languages

JavaScript, HTML, CSS. Some basic C knowledge may be required to send a richer set of data about each object from the core to the GUI.
However, there is already an interface that can do this-- it just needs to be hooked into the GUI.

Purr Data Message and DSP Profiler
----------------------------------

### Goal

Measure the time it takes for each object in a Purr Data diagram to process its data, and display the results in the diagram.

### Details

Purr Data users would benefit greatly from the ability to profile their programs while they are running. This is easy to do for the program as a whole, but challenging to do per-object.

A successful implementation of this feature will give an accurate measure of the time it takes each object to process its incoming data. This needs to support all of Purr Data's platforms: Windows, OSX, and GNU/Linux.

A successful implementation will also be performant enough that the measurements themselves don't impact the realtime operation of Purr Data itself.

A now defunct fork of Pure Data called ["DesireData"](http://artengine.ca/desiredata/) did an initial implementation of this idea using the x86 RDTSC instruction. (Though it's unlikely this feature was actually stable at the time DesireData was in active development.) Though this instruction is no longer considered reliable on modern machines, the overall approach taken by DesireData of adding a field to the t_gobj struct for storing this timing data is probably a sound starting point.

Note: There may be overlap with the other profiling idea listed above, as developers on both ideas will probably be using the same tools and can therefore benefit by periodically sharing their work with each other.

### Bonus goal

Figure out a way to meaningfully profile DSP objects. DSP objects typically process data at a high sample rate (44,100 Hz is common), so displaying the data in a user-friendly and meaningful way is tricky.

### Challenges

This feature touches the main artery of the message dispatching system, and the bonus goal would touch the main DSP routine.
In both cases realtime scheduling deadlines must be taken into account by careful profiling.

### Languages

C for the profiling business logic, HTML5 for displaying the results in the GUI.

Streamlining Purr Data GUI-Pd communication
------------------------------------------

### Goal

Move some of the GUI callbacks out of Purr Data's audio engine so that GUI interaction is less likely to cause dropouts.

### Details

The Pd GUI is heavily entangled with the Pd audio backend. In fact, most of the "gestures" performed on the GUI are passed straight to the Pd engine for processing. The GUI gestures are then "analyzed" by the audio thread, which may respond by triggering a GUI action, changing the state of an object, or doing nothing.

For instance, each mouse move triggers a `motion` message to the Pd backend, handled by `canvas_motion()` in `g_editor.c`. This calls `canvas_doclick(... doit = false)`, which in turn iterates through all the objects on the patch and asks each of them "does the cursor happen to be on top of you?" (`canvas_findhitbox()`/`canvas_hitbox()`), calling a callback function (`w_getrectfn()`) for each of those objects. Now, most of the time the cursor is not on an object (or patch cable) and the calls to `w_getrectfn()` have no effect, except for wasting CPU power. There are two notable exceptions:

a) when the mouse pointer is on top of an object, one of its inlets or outlets, a patch cord, or a GUI object, the mouse pointer may change, along with other visual effects (e.g., flickering inlets/outlets).

b) some objects use the calls to `w_getrectfn()` to track mouse position (e.g., [mousestate] from cyclone).

The above results in a plethora of CPU cycles being wasted, which may cause dropouts when using small blocksizes and/or embedded platforms. Besides - and perhaps most importantly - it seems the wrong approach that some GUI-specific actions (like the ones in a) above) have to be processed and validated by the audio engine, within the audio thread.
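To make the cost described above concrete, here is a minimal sketch of a per-`motion` hit test over a linked list. It is not Purr Data's actual code: `sketch_obj`, `sketch_getrect()`, `sketch_findhit()`, and the `sketch_rect_queries` counter are hypothetical stand-ins for the object list, `w_getrectfn()`, and `canvas_findhitbox()`. It also assumes that later objects in the list are drawn on top, so the last match wins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an object in a patch.
   This is NOT the real t_gobj; it only carries a bounding box. */
typedef struct sketch_obj {
    int x1, y1, x2, y2;        /* bounding box corners */
    struct sketch_obj *next;   /* next object in the patch */
} sketch_obj;

/* Counts bounding-box queries; stands in for calls to w_getrectfn(). */
unsigned long sketch_rect_queries = 0;

/* Hypothetical per-object rectangle callback. */
void sketch_getrect(const sketch_obj *o, int *x1, int *y1, int *x2, int *y2)
{
    sketch_rect_queries++;
    *x1 = o->x1; *y1 = o->y1; *x2 = o->x2; *y2 = o->y2;
}

/* Walk every object and return the last one whose box contains (x, y),
   or NULL if the pointer is over empty canvas. The whole list is visited
   on every call, even when nothing is hit. */
sketch_obj *sketch_findhit(sketch_obj *head, int x, int y)
{
    sketch_obj *hit = NULL;
    assert(x >= 0 && y >= 0);  /* sketch assumes non-negative canvas coords */
    for (sketch_obj *o = head; o != NULL; o = o->next) {
        int x1, y1, x2, y2;
        sketch_getrect(o, &x1, &y1, &x2, &y2);
        if (x >= x1 && x <= x2 && y >= y1 && y <= y2)
            hit = o;
    }
    return hit;
}
```

With n objects on a patch, every mouse move costs n rectangle queries in this sketch, whether or not the pointer is over anything.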
We could therefore think of an improvement to the Purr Data architecture, where the GUI stuff (e.g., point a) above) is delegated exclusively to the GUI, which makes for lower CPU usage and potentially a more responsive GUI. For instance, the GUI could be designed to only send `motion` messages when the mouse is on top of an object, and it could send along with it the Pd "tag" of the object, so that `w_getrectfn()` can be called only for the relevant object. The optimal approach would involve handling all the graphics effects (in/outlet animation, mouse pointers) directly within the GUI, and only sending `motion` messages when something relevant to the Pd engine is _actually_ happening (e.g., when connecting objects).

Additionally, and looking forward, in order to address point b), objects that need to track mouse position should declare this at initialization and should be kept in a dedicated list, so that the `motion` messages from the GUI can be delivered only to them with minimal CPU waste.

An alternative - and probably worse - approach to the problem, which could reduce peak CPU usage, would be for the Pd audio engine to maintain a "rasterized" cached map of the patch (e.g., by calling `w_getrectfn()` for each object at each pixel). This way, it could simply look up the cached map in response to each `motion` message. The cache could be recomputed in a separate thread every time a new object or patch cord is created. Threading issues may arise here if one of the objects is deleted while the cached map is being built.

### Challenges

This project comes with a number of challenges, including: potential threading issues between the engine and the GUI, the necessity to rewrite the C code of some objects, providing complete documentation for creators of externals, and maintaining - where possible (e.g., excluding objects that track mouse position) - backwards compatibility with Pd.
More details on a previous attempt at addressing the problem can be found [here](http://disis.music.vt.edu/pipermail/l2ork-dev/2017-June/001383.html).

### Languages

JavaScript and C.

### Potential Mentor

Giulio Moro

Vintage Platform Audio Emulation Library
----------------------------------------

### Goal

Create a library with objects that emulate the audio hardware of old platforms like the Atari 2600, NES, and others.

### Details

There are a lot of resources online for emulating old hardware. Purr Data would benefit by having a library that provides a consistent interface for objects that take input into an emulation of a piece of hardware and output one or more audio signals.

If possible, it would be beneficial if most of the interface could be built as a set of abstractions. That way more developers would be able to understand and improve the library.

There is a TIA chip emulator written in C in externals/mmonoplayer that can be used as a starting point.

### Challenges

Finding a common interface that makes it easy for users to leverage these classes while at the same time being expressive enough to allow decent control of the chip being emulated.

### Languages

Pd (Purr Data is a fork of the software Pure Data-- the visual language itself is usually referred to as Pd.) Also, C.

Data Over Audio Messaging
-------------------------

### Goal

Create a library that allows two instances of Purr Data to pass data messages to each other using sound as the transmission medium.

### Details

Pd messages consist mainly of space-separated numbers and symbols. Semicolons mark the end of a message.

Sometimes it would be helpful to be able to pass messages from one instance of Purr Data to another-- especially if each instance is on a different machine in the same room. This is currently done either by setting up a socket connection between the two instances or by leveraging a separate message-passing system outside of Purr Data.
Since Purr Data is concerned mainly with analyzing and synthesizing sound, machines running Purr Data typically have a mic and speakers connected to a running instance. If it were possible for the user to simply create objects which send/receive messages by sending audio signals to/from each other, it would greatly simplify sending at least small amounts of data between machines.

### Challenges

Finding a decent interface for users without relying on a big set of dependencies.

### Languages

Pd (Purr Data is a fork of the software Pure Data-- the visual language itself is usually referred to as Pd.) However, the library may also be written in C.

Interaction with Audio Plugins
------------------------------

### Goal

Make sure that Purr Data has a well-documented way to accept input from standard audio plugin APIs like VST, LADSPA, and LV2. Also make sure Purr Data can be used as a plugin in other environments.

### Details

There are multiple audio plugin APIs that aim at seamlessly mixing and matching audio filters, synthesizers, and analysis tools in different languages and applications. Purr Data has some libraries to interface with at least two of these standards (VST and LADSPA). There is also a library for LV2 that Purr Data can leverage. However, not all of these libraries run on all the supported platforms (OSX, Windows, Linux).

Purr Data also has all the APIs necessary to act as a plugin itself in other applications. But work must be done to ensure this works properly and that the procedure is properly documented.

### Challenges

There is a lot of pre-existing technology here, but it needs to be tested rigorously on all platforms.

### Languages

C. Also, familiarity with shell scripting as well as GNU make.

JIT-compiled Signal Graph for the Audio Engine
----------------------------------------------

### Goal

Leverage LLVM to make a JIT compiler for Purr Data's DSP graph.
### Details

Alex Norman has shown that it is possible to build a JIT compiler for one of Pd's "workhorse" libraries-- the "expr" library. This library essentially lets users specify a mathematical expression that can operate on both vectors and on the sample level.

The current library parses the tokens of the expression and requires a separate function call for each unit. This is similar to the way Pd's DSP graph itself gets executed. The JIT compiler produces assembly which has superior performance to the current "expr" library. Extending that process to the entire signal graph (or at least core classes within it) would improve performance more broadly.

### Challenges

Finding a way to incrementally build up the functionality. Since there is pre-existing work done with the JIT "expr" library, a good start would be to help test that library and improve it. Once the student has a mental model of the process, they can begin to extend it to Purr Data's DSP graph itself.

### Languages

C++. Knowledge of LLVM will also come in handy here.

Use ref-counting to handle object lifetimes
-------------------------------------------

### Goal

Use ref-counting to track object lifetimes in a more robust manner.

### Details

Currently there are many places in the message dispatching system where an object may disappear before a message bound for it gets dispatched. This, along with dynamic patching and arbitrary patch loading, can easily cause Purr Data to crash.

Additionally, this limits the possibility for reporting errors from within a given call stack. Currently we have no way to know if any of the objects participating in a given call still exist at the time of an error. This means we're limited to checking after the fact for any signs of mutation at error time, and then simply refusing to trace in that case. That deprives us of valuable analytical data in the most complex error cases.
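As a rough illustration of the idea, the sketch below shows one way a dispatcher could hold a reference for the duration of a call so that an object deleted mid-dispatch is not freed until the call returns. None of this is existing Purr Data API: `rc_obj`, `rc_new()`, `rc_retain()`, `rc_release()`, `rc_delete()`, `rc_dispatch()`, the two sample methods, and the `rc_freed_total` counter are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ref-counted object header; not an existing Purr Data struct. */
typedef struct rc_obj {
    int refcount;   /* number of live references */
    int deleted;    /* set once the patch no longer owns the object */
} rc_obj;

/* Counts objects actually freed; only here so the sketch can be checked. */
int rc_freed_total = 0;

/* Create an object with one reference, owned by the patch. */
rc_obj *rc_new(void)
{
    rc_obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcount = 1;
    o->deleted = 0;
    return o;
}

void rc_retain(rc_obj *o)
{
    assert(o->refcount > 0);
    o->refcount++;
}

/* Drop one reference; returns 1 if this call freed the object. */
int rc_release(rc_obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount > 0)
        return 0;
    free(o);
    rc_freed_total++;
    return 1;
}

/* "Delete" from the patch: mark the object and drop the patch's reference.
   Memory is only reclaimed once nobody else holds a reference. */
int rc_delete(rc_obj *o)
{
    o->deleted = 1;
    return rc_release(o);
}

/* Dispatch a method while holding a reference, so the object survives
   even if the method deletes it. Returns 1 if it was deleted mid-call. */
int rc_dispatch(rc_obj *o, void (*method)(rc_obj *))
{
    rc_retain(o);
    method(o);
    int was_deleted = o->deleted;  /* safe: our reference keeps o alive */
    rc_release(o);
    return was_deleted;
}

/* Sample methods: one deletes its own object (as dynamic patching can),
   the other does nothing. */
void rc_method_self_delete(rc_obj *o) { rc_delete(o); }
void rc_method_noop(rc_obj *o) { (void)o; }
```

In this sketch the dispatcher can still inspect the `deleted` flag after the method returns, which is the kind of information an error tracer would need in order to report on objects that vanished during a call.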
### Challenges

There aren't currently any tests for the cases where attempting to dereference a pointer to a freed object causes a crash.

Also, there are two "stacks"-- one is the messaging system proper. The other is the loading mechanism used to open a file or abstraction.

Additionally, Purr Data has a binding system which can dynamically bind or unbind symbols even within a single call stack.

### Languages

C.

Visual Diff
-----------

### Goal

Make a tool to visually show the changes in a Purr Data patch, possibly with animation.

### Details

Pd has a source file format that is difficult to read. Furthermore, changes in z-order in the GUI can make changes in the source nearly impossible to convey. This makes it difficult to glean anything of substance by viewing a diff of a Pd file in git.

### Challenges

This is a general problem with visual programming languages. There may be some prior art with some languages that attempt to solve the problem. But unlike Pd, those languages were probably built from the ground up to solve that problem, whereas Pd's file format doesn't lend itself to solving that problem.

### Languages

C, possibly HTML5.

Encapsulation Ergonomics
------------------------

### Goal

Improve Purr Data's techniques for encapsulation in the language.

### Details

Pd currently has two basic techniques for organizing diagrams. One is simply to copy part of the diagram and paste it into a subpatch. That part of the diagram becomes hidden inside a new object that appears on the parent window. The other is to use so-called "abstractions", which are other Pd files that are instantiated as an object on the parent.

Both cases present problems for developers. First, the program should be able to take a selection of objects and immediately put it into a subpatch with the relevant inlets and outlets created inside the subpatch. Instead, the user must manually cut, paste, and redraw the connections in the editor.
Second, users who want to encapsulate and reuse code must first save a patch to the filesystem in the current Pd path, and then create a new object with the same name as that file (minus the ".pd" extension). This makes proper encapsulation more difficult than simply cutting/pasting subpatches, leading to quick-and-dirty copy/pasta where more robust code reuse through abstractions would have been preferred.

### Challenges

Much of Pd's current code assumes that abstractions must be loaded from the file system instead of from some template that already exists in the current patch file. Additionally, a new class must be added to define a patch template that fills the role that ".pd" files currently do for abstractions.

Fake News Audio Library
-----------------------

### Goal

Create a set of DSP filters which take common misconceptions about digital audio and one or more inputs to generate an output that makes those misconceptions true.

### Details

There is a plethora of misconceptions about digital audio running back decades. They range from the wildly subjective (e.g., compact discs don't sound as "warm" as vinyl records) to the technical (e.g., as the frequency in a digital audio signal approaches Nyquist the accuracy of that signal degrades).

Once a workable list of such misconceptions is enumerated and cross-referenced, a library should be created for each to take an arbitrary signal input and generate an output in line with that particular misconception. For example, if the misconception is "digital audio degrades higher frequencies", the higher frequencies in the input should be degraded in the generated output signal.

### Challenges

The "misconceived" output for each library must be predictable so that a potential user of the library is able to reason about the features of that library. It must also be sophisticated enough that one who believes the misconception doesn't reject the filter out-of-hand as an unfair exaggeration of their erroneous belief.
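As one possible starting point for the "digital audio degrades higher frequencies" example in the Details above, here is a minimal sketch of a deterministic one-pole lowpass in C. The names `hf_degrade`, `hf_degrade_init()`, and `hf_degrade_perform()` are made up for illustration, and a real entry in the library would presumably need something subtler than a plain lowpass to meet the "not an unfair exaggeration" requirement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for one "misconception" filter; names are illustrative. */
typedef struct {
    float coeff;  /* smoothing coefficient in (0, 1]; smaller loses more treble */
    float last;   /* previous output sample */
} hf_degrade;

void hf_degrade_init(hf_degrade *f, float coeff)
{
    assert(coeff > 0.0f && coeff <= 1.0f);
    f->coeff = coeff;
    f->last = 0.0f;
}

/* One-pole lowpass: y[n] = y[n-1] + coeff * (x[n] - y[n-1]).
   DC passes at unity gain; content near Nyquist is attenuated. */
void hf_degrade_perform(hf_degrade *f, const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        f->last += f->coeff * (in[i] - f->last);
        out[i] = f->last;
    }
}
```

Because the output depends only on the input and a single coefficient, the result is predictable in the sense the Challenges section asks for: with `coeff` set to 0.5, a constant input settles at its own value, while a signal alternating between +1 and -1 (i.e., at Nyquist) settles at roughly a third of its amplitude.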
Worst of All Possible Worlds Interpreter
----------------------------------------

### Goal

Write a small scripting language for Purr Data where the time it takes to compute the output is always the worst case time.

### Challenges

eBPF rules here-- user can write loops but not loop indefinitely. User may have conditionals. No unsafe wizardry allowed.

### Bonus Challenge

Allow user to specify asynchronous behavior with a single float or int value. E.g., the default of 1 means that input will trigger the entire script to run and trigger the output. 0.5 means input will trigger the script to run *half* its operations and return control. A subsequent input will run the second half of the operations and trigger the output. 0.25 means input will trigger a quarter of the operations to run. Etc.

Improve Our Monstrously Complex Build System
--------------------------------------------

### Goal

* simplify the build system so that it is intelligible to humans, especially new developers. Make it possible to build both the core of Purr Data and an installer binary in less time than it currently takes.
* improve the CI runners so they are easier to set up, maintain, and run
* find a way for us to automate our release process

### Challenges

The build system uses many recursively-called Makefiles and even downloads a binary of the GUI toolkit using a wrapper script. While Purr Data does have regression testing, we don't currently check for things like making sure all the help documentation got installed correctly, or even that every single external library ships.

It's quite dangerous to make even small changes to such complex makefiles, so some testing will need to be implemented to ensure that this project is a success.