Gigaflow
Gigaflow TeleImmersion Project
Introduction
Networks such as Internet2 are scaling up to astonishing capacities. Real-time, interactive, uncompressed flows demonstrated so far have been in the "centi-flow" range for audio and have recently come to support 4K video. The Gigaflow project envisions a near future with several orders of magnitude more interactive channels, combining these and other interaction modalities. Collaborative applications explored in our Expedition in Computing will move beyond the present state of the art, from "almost like being there" to "better than being there." The team is prepared to couple upgrades in raw network power and media fidelity with research in perception, synthesis and prediction.
Aims
Subjects
Gigaflow proposes to examine and implement three work programs (WP) which will be interconnected to form the final Gigaflow high-quality high-definition framework.
WP1 - Emergence
A growing number of hosts on the network receive high-definition acoustical streams and scatter them to other points. Together these hosts constitute nodes on an irregularly spaced, non-stationary mesh. The expected proliferation of HDIS nodes leads to an acoustical network with interesting emergent properties as the number of hosts scales up dramatically. A "jam cell," in which remote musicians hear each other, already exists as part of current practice. An example application is the grouping of seven peers in a many-to-many, directly interconnected lattice. In the near future, branching between cells will become common: any node can scatter a cell's sound out to a neighboring cell, so that all parties become interconnected at one level of remove. The physical and perceptual properties of a multitude of cells propagating sound at various levels of remove are a subject of this expedition in computing. [synchronization]
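The seven-peer example gives a feel for how quickly stream counts grow inside a single cell. The following is a minimal Python sketch, assuming that "directly interconnected" means every peer sends one stream to every other peer. The function names are illustrative and are not part of any Gigaflow software.

```python
def full_mesh_streams(peers: int) -> int:
    """One-way audio streams in a cell where each peer sends to every other peer."""
    if peers < 0:
        raise ValueError("peers must be non-negative")
    # Each of the `peers` nodes sends to the (peers - 1) others.
    return peers * (peers - 1)


def full_mesh_pairs(peers: int) -> int:
    """Distinct peer pairs (bidirectional links) in the same cell."""
    if peers < 0:
        raise ValueError("peers must be non-negative")
    # Every pair shares two one-way streams, so halve the stream count.
    return peers * (peers - 1) // 2


if __name__ == "__main__":
    for n in (2, 4, 7):
        print(n, full_mesh_streams(n), full_mesh_pairs(n))
```

Under that assumption, a seven-peer cell carries 42 one-way streams across 21 peer pairs, and the count grows quadratically with cell size.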
Several strategies have been implemented to address the well-known problem of delay (latency) in network performance. These include the use of high-speed networks, fast compression algorithms, and artificially increasing the latency by "one-phase delay" (Ninjam, among others).
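The idea of deliberately increasing latency can be illustrated with a small calculation. The sketch below is in Python and assumes that the artificial delay is padded up to a whole number of musical intervals, so that delayed audio lands on a phrase boundary. The tempo, the beats-per-interval parameter and the function names are illustrative assumptions and do not describe Ninjam's actual implementation.

```python
import math


def interval_ms(tempo_bpm: float, beats_per_interval: int) -> float:
    """Duration of one musical interval in milliseconds."""
    if tempo_bpm <= 0 or beats_per_interval <= 0:
        raise ValueError("tempo and beats per interval must be positive")
    return beats_per_interval * 60_000.0 / tempo_bpm


def padded_delay_ms(network_latency_ms: float,
                    tempo_bpm: float,
                    beats_per_interval: int) -> float:
    """Smallest whole number of intervals (at least one) covering the latency."""
    if network_latency_ms < 0:
        raise ValueError("latency must be non-negative")
    interval = interval_ms(tempo_bpm, beats_per_interval)
    # Always delay by at least one interval (assumed here), even for tiny latencies.
    intervals = max(1, math.ceil(network_latency_ms / interval))
    return intervals * interval


if __name__ == "__main__":
    # 80 ms of network latency, 120 BPM, four beats per interval.
    print(padded_delay_ms(80.0, 120.0, 4))
```

With these illustrative numbers, one interval lasts 2000 ms, so an 80 ms network latency is hidden inside a single-interval delay. The trade-off is that performers respond to what the others played one interval earlier.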
Differing amounts of audio delay are acceptable depending on the type of music and the number of performers. Experience with free improvisation tells us that delays on the order of 100 to 200 ms are still acceptable for a good performance, and musicians working in certain genres do not find them a great inconvenience. On the other hand, delays on the order of 25 ms already cause problems for a professional string quartet playing in classical style.
Visual conducting, which synchronizes musicians in real spaces, does not serve the same purpose over the network. In real halls, sound travels much more slowly than light, which is why visual conducting works. In the network scenario, however, audio and video travel, in the best case, at the same speed (though with present technology audio actually wins the race). This means that one has to rethink conducting strategies.
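To relate network delays to the acoustic delays musicians already cope with in real halls, a delay can be converted into the distance sound would travel in the same time. The sketch below is in Python and assumes a speed of sound of 343 m/s (air at roughly 20 °C). The constant and function names are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def acoustic_delay_ms(distance_m: float) -> float:
    """Time in milliseconds for sound to cover `distance_m` metres of air."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def equivalent_distance_m(delay_ms: float) -> float:
    """Distance in metres that sound covers in `delay_ms` milliseconds."""
    return delay_ms / 1000.0 * SPEED_OF_SOUND_M_PER_S


if __name__ == "__main__":
    for delay in (25.0, 100.0, 200.0):
        print(f"{delay:.0f} ms is about {equivalent_distance_m(delay):.1f} m of air")
```

Under this assumption, a 25 ms delay corresponds to players sitting roughly 8.6 m apart, while 100 to 200 ms corresponds to roughly 34 to 69 m, distances at which an ensemble in a hall would normally depend on a conductor's visual cues.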
We envision two technical fronts that will work together to create a better network performance experience: supervisory control and prediction. A supervising conductor will be able to maintain synchronization across a multi-located space. By coupling pattern recognition and prediction with supervisory control techniques, this conductor (which can be the musicians themselves, the machine, or both) will be able to fully explore the musical potential of a given network configuration. In particular, delay configurations will determine which performance outputs a conductor can influence, dictating, for example, maximum tempo (understood as the speed of musical events), pattern variability and sound types.