Hi everyone,
I'm Wim Taymans and I'm working on a new project called PipeWire that you might have heard about [1]. I have given some general presentations about it during its various stages of development, some of which are online [2].

PipeWire started as a way to share arbitrary multimedia, which has vastly different requirements regarding format support, device and memory management than JACK. It wasn't until I started experimenting with audio processing that the design started to gravitate to JACK, and then some of JACK's features became a requirement for PipeWire.

The end goal of PipeWire is to interconnect applications and devices through a shared graph in a secure and efficient way. Some of the first applications will be wayland screen sharing and camera sharing with access control for sandboxed applications. It would be great if we could also use this to connect audio apps and devices, possibly unifying the pulseaudio/JACK audio stack.

Because the general design is, I think, now very similar to JACK, many people have been asking me if I'm collaborating with the linux pro-audio community on this in any way at all. I have not, but I really want to change that. In this mail I hope to start a conversation about what I'm doing, and I hope to get some help and experience from the broader professional audio developer community on how we can make this into something useful for everybody.

I've been looking hard at all the things that are out there, including Wayland, JACK, LV2, CRAS, GStreamer, MFT, OMX, ... and have been trying to combine the best ideas of these projects into PipeWire. A new plugin API was designed for hard realtime processing of any media type. PipeWire is LGPL licensed and depends only on a standard C library. It's currently targeting Linux.

At the core of the PipeWire design is a graph of processing nodes with arbitrary input/output ports. Before processing begins, ports need to be configured with a format and a set of buffers for the data. Buffer data and metadata generally live in memfd shared memory but can also be dmabuf or anything that can be passed as an fd between processes. There is a lot of flexibility in doing this setup, reusing much of the experience gained in GStreamer. This all happens on the main thread, infrequently, and is not very important for the actual execution of the graph.

In the realtime thread (PipeWire currently has 1 main thread and 1 realtime data thread), events from various sources can start push/pull operations in the graph. For the purpose of this mail, the audio sink uses a timerfd to wake up when the alsa buffer fill level is below a threshold. This causes the sink to fetch a buffer from its input port queue and copy it to the alsa ringbuffer. It then issues a pull to fetch more data from all linked peer nodes for which there is nothing queued. These peers will then eventually push another buffer into the sink queue to be picked up in the next pull cycle of the sink. This is somewhat similar to the JACK async scheduling model. In the generic case, PipeWire has to walk upstream in the graph until it finds a node that can produce something (see below how this can be optimized).

Scheduling of nodes is, contrary to JACK's (and LADSPA's and LV2's) single 'process' method, done with 2 methods: process_input and process_output. This is done to support more complex plugins that need to decouple input from output, and to also support a pull model for plugins.

For internal clients, we directly call the methods; for external clients we use an eventfd and a shared ringbuffer to send the right process command to the client. When the external client has finished processing or needs to pull, it signals PipeWire, which then wakes up the next clients if needed.
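To make the signalling part concrete, here is a rough sketch of that eventfd mechanism; the helper names and the shared command ringbuffer are made up for illustration, only eventfd itself is the standard Linux API, so treat this as a sketch and not the actual PipeWire code:

/* Illustrative only: helper names and the ringbuffer are hypothetical,
 * eventfd is the real kernel primitive used for the wakeups. */
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Created once and passed to the peer process over the protocol socket
 * (SCM_RIGHTS), so both sides hold an fd to the same object. */
static int make_wakeup_fd(void)
{
        return eventfd(0, EFD_CLOEXEC);
}

/* Server side: after queueing a process command in the shared ringbuffer,
 * bump the eventfd to wake the client's realtime thread. */
static void wake_peer(int fd)
{
        uint64_t one = 1;
        ssize_t res = write(fd, &one, sizeof(one));
        (void)res;
}

/* Client side (realtime thread): block on the eventfd, drain the queued
 * commands, run process_input/process_output, then signal back the same
 * way when done or when more input is needed. */
static void client_loop(int fd)
{
        uint64_t count;
        while (read(fd, &count, sizeof(count)) == sizeof(count)) {
                /* ... pop commands from the shared ringbuffer and process ... */
        }
}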
This is different from JACK, where a client directly wakes up the peers to avoid a server context switch. JACK can do this because the graph and all client semaphores are shared. PipeWire can't in general, for a couple of reasons: 1) you would need to bring mixing of arbitrary formats to the clients, and 2) sandboxed clients should not be trusted with this information and responsibility. In some cases it would probably be possible to improve that in the future (see below).

This kind of scheduling works well for generic desktop style audio and video. Apps can send buffers of the size of their liking. Bigger buffers mean higher latency but less frequent wakeups. The sink wakeup frequency is determined by the smallest buffer size that needs to be mixed. There is an upper limit for the largest amount of data that is mixed in one go, to avoid having to do rewinds in alsa and still have reasonable latency when doing volume changes or adding new streams, etc.

The idea is to make a separate part of the graph dedicated to pro-audio. This part of the graph runs with mono 32-bit float sample buffers of a fixed size and samplerate. The nodes running in this part of the graph also need to have a fixed input-output pattern. In this part of the graph, negotiating the format becomes trivial. We can preallocate a fixed size buffer for each port that is used to send/mix data between nodes, exactly like how JACK works. In this scenario it would be possible to bring some of the graph state to trusted clients so that they can wake up their peers directly. As it turns out, the generic scheduling mechanism then simplifies to the JACK way of scheduling, with the option to do some optimisations (directly start the push from the sources, bundle process_input/output calls, mixing on ports is simplified by equal buffer sizes, ...).

There is a lot more stuff that I can talk about and a lot of things that need to be fleshed out, like latency calculations, an equivalent of JACK transport, session management, ... But this mail is already getting long :)

I would very much like to hear your ideas, comments, flames, thoughts on this idea. I think I'm at a stage where I can present this to a bigger audience and have enough experience with the matter to have meaningful discussions. PipeWire is currently still in heavy development; many things can and do still change. I'm currently writing a replacement libjack.so [3] that runs JACK clients directly on PipeWire (mixing and complicated scheduling don't work yet).

Hope to hear your comments,
Wim Taymans

[1] pipewire.org
[2] https://www.youtube.com/watch?v=6Xgx7cRoS0M
[3] https://github.com/PipeWire/pipewire-jack
On Mon, 19 Feb 2018, Wim Taymans wrote:
> PipeWire started as a way to share arbitrary multimedia, which has vastly
> different requirements regarding format support, device and memory management
> than JACK. It wasn't until I started experimenting with audio processing that
> the design started to gravitate to JACK, and then some of JACK's features
> became a requirement for PipeWire.
>
> The end goal of PipeWire is to interconnect applications and devices through
> a shared graph in a secure and efficient way. Some of the first applications
> will be wayland screen sharing and camera sharing with access control for
> sandboxed applications. It would be great if we could also use this to connect
> audio apps and devices, possibly unifying the pulseaudio/JACK audio stack.

By unifying I think you mean both things in one server, rather than making jack work like pulse or pulse work like jack.

I have been using jackdbus as my audio server/backend, with pulse as a desktop compatibility layer, for 3 to 4 years now with reasonable success. Jackdbus takes care of all physical audio devices (I have no bluetooth audio devices), with my multitrack audio device (an older ice1712-based delta66) as jack's master and any other devices as clients via zita-ajbridge (with SRC). In general I don't use devices connected through SRC for recording, but many beginner users have bought "pro USB mics" to start recording, and so SRC is "a thing".

I run pulse without the alsa, udev and jackdbus-detect modules, but do load jack-sink/source via script as needed. I use my own script because it allows me to name my pulse ports so that pulse sees a device name rather than just jackd.

I do not know the internals of pulseaudio, but I have found that pulse will sync to any real device it happens to have access to, even though no stream is using that device. This ends up meaning that data is transferred to jackd on that device's time schedule rather than jack's, with the result of xruns in jackd and even crashes when jackd is put in freewheel mode. By running pulse with no alsa modules and no udev module (which auto-loads alsa modules when a new device is detected), both of these problems are solved.

The one problem I have left is that pulse then has to follow jackd's latency model. This is probably because jack-sink/source are still more sample code than well thought out and finished modules. As jack's latency goes down (it can be changed while jackd is running), jack's cpu load goes up as expected, but it stays within reasonable limits. However, pulse is forced to follow along, and pulse uses more than double the cpu that jack does. Along with this, some desktop applications start to fail noticeably.

Skype is a good example of this, because it does actually see some use in the professional audio world: in broadcast applications skype is sometimes used for live remote contribution (think phone-in talk show or even news). In such a case, the local studio may be running totally in jack using something like idjc, with skype linked in using pulse bridging. (Thankfully asterisk can deal with jack directly and already expects low latency operation, so normal phone calls just work.) Low latency jack operation is important in an announcer application, as monitoring is often done with headphones, where a delay of one's own voice may be annoying. So jack needs to run at 5ms or so, while skype seems to think 30ms is quite normal (and uses echo cancellation so the talker can't hear their own delayed voice).
What this points out is that there are two different requirements that sometimes need to be met at the same time. Pipewire has the advantage of knowing about both uses and being able to deal with them somewhat more gracefully if it chooses to. Desktop needs its own buffering, it seems. Certainly most people who use jack much would have liked to see jack become standard with a pulse-like wrapper for desktop. The development energy just wasn't there.

> Because the general design is, I think, now very similar to JACK, many
> people have been asking me if I'm collaborating with the linux pro-audio
> community on this in any way at all. I have not, but I really want to change

It does not really matter if pipewire is similar to jack in operation. Jack allows things that some applications require, and there are users who do not have pulse on their system at all. So even if pipewire did not allow jack clients to directly connect, jack is still around, still in use, and will be for some time. (Do not be disappointed when some people choose to remove pipewire in their new installs and replace it with jackd1; they may be vocal, but they are a small number of people.)

> that. In this mail I hope to start a conversation about what I'm doing, and I
> hope to get some help and experience from the broader professional audio
> developer community on how we can make this into something useful for
> everybody.

While I have done some development using the jack API, you will have noticed that most of my points above are from a user POV.

> sink queue to be picked up in the next pull cycle of the sink. This is somewhat
> similar to the JACK async scheduling model. In the generic case, PipeWire has to

There will be some people who will say jack async is not good enough, but they will likely also be those commented on above who will use jackd1 (and only LADSPA plugins). This is not in any way a put-down of these people; I think there are uses where a jack-only system will remain the best approach, just as there are still many headless servers with no X or wayland.

> The idea is to make a separate part of the graph dedicated to pro-audio. This

Thank you, that is absolutely a requirement if you wish to avoid the situation we have now of so many people either hacking pulse to work with jackd, removing pulse, complaining desktop audio is blocked when an application uses alsa directly, etc. What it comes down to is that professional audio users will continue to use jackd unless pipewire properly takes care of their use case. Because of where pulse has gone, do expect a "wait and see" from the pro community.

There are still a number of people who very vocally tell new pro-audio users that the first thing they should do is remove pulse, when in most systems this is not needed. These poor new users are then left with a broken system because they are not able to do all the workarounds needed to get desktop audio to work again. Having people who use pro-audio working with you from the start should help keep this from happening. There will still be people against it, but also people for it, who are also vocal.

A request: it is hard to know exactly how pipewire will work, but one of the requests I hear quite often is being able to deal with pulse clients separately, that is, being able to take the output of one pulse client and feed it to a second one. This could be expanded to the jack world. Right now, jack sees pulse as one input and one output by default. This is both good and bad.
It is good because most pulse clients only open a pulse port when they need it, which makes routing connections difficult to make manually; the pulse-jack bridge provides a constant connection a jack client can connect to. It is bad because it is only one connection that combines all pulse audio, including desktop alerts etc.

Some way of allowing an application on the desktop to request a jack client as if it was an audio device would be a wonderful addition. Also, a way of choosing which port(s) on the jack end of things should be the default would be nice. Right now, when pulse auto-connects to jack it selects system_1 and system_2 for stereo out. On a multi-track card, system_9 and system_10 (or any other pair) may be the main audio out for studio monitoring. Ports 9 and 10 just so happen to be s/pdif on my audio interface.

I have also been overly long, but a replacement audio server affects a lot of things. It is worthwhile taking the time to get it right.

--
Len Ovens
www.ovenwerks.net
In reply to this post by Wim Taymans
Greetings,
Wim. Amazing project you have there. I hope you succeed. Len has covered lots of excellent thoughts. Here are a few more, clearly intersecting.

First of all, it's a great idea. I'd love to see one layer which could do all of JACK and pulse. But the pitfalls are many :-) It's worthwhile to remember that the ALSA people tried a lot of it; the code bits and configuration settings are still there waiting to be used, it's just that Pulse and JACK are doing it, and more, so much more reliably.

Second, the newer JACK+Pulse setup with Cadence controlling it is amazing, a joy and a simplicity. Kudos extremus (sorry, I am linguistically challenged). It does cost a bit in JACK DSP (5% on the big BNR hard server when I tried it), but it works very reliably.

And third, I could certainly imagine one layer with three different kinds of ports: MIDI (using the JACK MIDI API), Pro Audio (using the JACK audio API), and Desktop Audio (using the Pulse API). All desktop audio ports behave like Pulse, and are controlled using the Pulse control APIs, and by default their data is mixed into a default Desktop Audio hardware output. At the control system level (using JACK for all), Pulse ports look like JACK ports and can be rerouted, but the underlying layer treats them differently and decouples them from the rigid round-robin of JACK. This does not make for a simple system, because there have to be both kinds of ports for the hardware audio, and I'm sure there are a lot more complications which others will think of, and which will emerge as soon as users start trying it!

J.E.B.

--
Hear us at http://ponderworthy.com -- CDs and MP3s now available!
Music of compassion; fire, and life!!!
In reply to this post by Wim Taymans
On 02/19/2018 09:39 AM, Wim Taymans wrote:
[...]
> I would very much like to hear your ideas, comments, flames, thoughts on this
> idea. I think I'm at a stage where I can present this to a bigger audience and
> have enough experience with the matter to have meaningful discussions.

Hi Wim,

I think the general lack of enthusiasm about pipewire here is because it does not solve any issues for linux-audio and, at best, does not introduce new ones.

In the past years the most prominent questions that I have received are:

* How can I use all of my USB mics with Ardour on Linux?
* How do I uniquely identify my many MIDI devices?
* Why does my audio device not have proper port-names?
* Why can't I re-connect my device and resume work?

These questions are mostly from Mac or Windows users moving to Linux ... and many of them moving back to MacOS.

While it is not impossible to combine multiple devices, it is not straightforward to set this up. Managing devices uniquely and handling temporarily missing devices is not possible on GNU/Linux AFAIK. If you try to come up with a new system (think pipewire), please copy as many concepts as possible from Mac's CoreAudio.

Both pulseaudio and jack had the correct idea to present audio as a service to applications. The server is concerned with device(s) and device settings. However, both fail to abstract multiple devices, map their ports uniquely and allow multiple apps to concurrently use those devices for different purposes.

The main issue with pulse is that it is a poll API. Also, pulseaudio's per-device, per-port latency is incorrect (if set at all). JACK on the other hand is too limited: single device, fixed buffersize. jackd also periodically wakes up the CPU and uses power (even if no client is connected).

Browsing around in the pipewire source I see several potential design issues. In particular data format conversions: The nice part about JACK is that it uses float as its only native format. Also, port-memory is shared between applications with zero-copy. In pipewire a port can be any data-type, including vorbis, and worse, MIDI is a bolted-on sub-type on an audio port. JACK-MIDI has in the past been criticized most because MIDI was a dedicated type instead of JACK providing generic event-ports.

Another conceptual issue that I see with pipewire is that it pushes sync downstream (like gstreamer does), instead of sources being pulled upstream. This in particular will make it hard to compensate for latencies and align outputs.

Implementation-wise there are plenty of other issues remaining to be discussed, e.g. context-switches, resampling, process-graph, ... but those are not important at this point in time.

Cheers!
robin
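For reference, the float-only, zero-copy port model mentioned above in its smallest form: a minimal JACK pass-through client, where jack_port_get_buffer() hands back a pointer straight into the shared buffer, always as 32-bit float samples. Client and port names are arbitrary and error handling is omitted:

#include <string.h>
#include <unistd.h>
#include <jack/jack.h>

static jack_port_t *in_port, *out_port;

/* Realtime callback: both buffers are float arrays living in shared
 * memory; copying input to output is the whole "plugin". */
static int process(jack_nframes_t nframes, void *arg)
{
        float *in  = jack_port_get_buffer(in_port, nframes);
        float *out = jack_port_get_buffer(out_port, nframes);
        memcpy(out, in, nframes * sizeof(float));
        (void)arg;
        return 0;
}

int main(void)
{
        jack_client_t *c = jack_client_open("passthru", JackNullOption, NULL);
        if (c == NULL)
                return 1;

        in_port  = jack_port_register(c, "in", JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsInput, 0);
        out_port = jack_port_register(c, "out", JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsOutput, 0);
        jack_set_process_callback(c, process, NULL);
        jack_activate(c);

        for (;;)
                sleep(1);
}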
On Mon, 20 Aug 2018 at 01:41, Robin Gareus <[hidden email]> wrote:
> On 02/19/2018 09:39 AM, Wim Taymans wrote:
> [...]
> > I would very much like to hear your ideas, comments, flames, thoughts on this
> > idea. I think I'm at a stage where I can present this to a bigger audience and
> > have enough experience with the matter to have meaningful discussions.
>
> Hi Wim,

Hi Robin,

Thanks for taking the time to reply.

> I think the general lack of enthusiasm about pipewire here is because it
> does not solve any issues for linux-audio and, at best, does not
> introduce new ones.
>
> In the past years the most prominent questions that I have received are:
>
> * How can I use all of my USB mics with Ardour on Linux?
> * How do I uniquely identify my many MIDI devices?
> * Why does my audio device not have proper port-names?
> * Why can't I re-connect my device and resume work?
>
> These questions are mostly from Mac or Windows users moving to Linux ...
> and many of them moving back to MacOS.
>
> If you try to come up with a new system (think pipewire), please copy as
> many concepts as possible from Mac's CoreAudio.

I have heard this before. Device management on Linux is still pretty bad. I have not seriously looked at how to solve any of this yet.

> While it is not impossible to combine multiple devices, it is not
> straightforward to set this up. Managing devices uniquely and handling
> temporarily missing devices is not possible on GNU/Linux AFAIK.

One of the ideas with PipeWire is to move much of the logic to set up devices and filters to another process. We would like to have the desktops come up with policies for what to connect when and where, and implement those. This would also make it possible to do more configuration in the desktop control panels, like ranking devices (to select a master device) and setting up filters for surround sound, bass boost, echo cancellation (the things you can configure in Windows and MacOS). I know there are problems with uniquely identifying devices in Linux that may keep this from being 100% perfect, but we should be able to get to the same state as MacOS or Windows.

The logic for combining devices exists and works well (zita-a2j/j2a); I would like to have built-in support for this in PipeWire as soon as 2 devices interact. MacOS has a panel for combining devices; we need something like that too.

> Both pulseaudio and jack had the correct idea to present audio as a
> service to applications. The server is concerned with device(s) and
> device settings. However, both fail to abstract multiple devices, map
> their ports uniquely and allow multiple apps to concurrently use those
> devices for different purposes.

Are you talking about JACK? PulseAudio pretty much has this right, no?

> The main issue with pulse is that it is a poll API. Also, pulseaudio's
> per-device, per-port latency is incorrect (if set at all).

What's wrong with a poll API? To me PulseAudio has more of an event based API. Not sure what you mean with the latency being reported incorrectly; latency is dynamic and you can query it, and it pretty much gives you access to the read and write pointers of the device.
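As a small illustration of that query, a sketch against the PulseAudio API; it assumes an already connected pa_stream created with PA_STREAM_AUTO_TIMING_UPDATE and PA_STREAM_INTERPOLATE_TIMING, and leaves out all error handling:

/* Sketch: query a stream's current latency and the underlying
 * read/write indices of the device buffer. */
#include <stdio.h>
#include <pulse/pulseaudio.h>

static void print_latency(pa_stream *s)
{
        pa_usec_t latency;
        int negative;

        if (pa_stream_get_latency(s, &latency, &negative) >= 0)
                printf("latency: %s%llu us\n", negative ? "-" : "",
                       (unsigned long long)latency);

        const pa_timing_info *ti = pa_stream_get_timing_info(s);
        if (ti)
                printf("write_index: %lld, read_index: %lld\n",
                       (long long)ti->write_index,
                       (long long)ti->read_index);
}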
> JACK on the other hand is too limited: single device, fixed buffersize.
> jackd also periodically wakes up the CPU and uses power (even if no
> client is connected).

These are the main points for objecting to JACK as a generic desktop replacement for audio, and PulseAudio takes the complete opposite approach. To me, the ideal solution would be to keep the JACK design and remove the above mentioned limitations.

> Browsing around in the pipewire source I see several potential design
> issues.
>
> In particular data format conversions: The nice part about JACK is that
> it uses float as its only native format. Also, port-memory is shared
> between applications with zero-copy.

I 100% agree, arbitrary format conversions are not practical. In PipeWire, there are 2 scenarios:

1) Exclusive access to a device. You can negotiate any format and buffer layout directly with the device. Very handy for compressed formats, or to get the maximum performance (games). Of course only one app can use the device, but this can be allowed in certain cases.

2) Non-exclusive access. A dsp module (exclusively) connects to the device and converts between the canonical format (float32 mono) and the device format. Clients then either connect with the canonical format (jack clients) or use the stream API (like CoreAudio's AudioQueue) to play or record data, with conversions being handled automatically.

So there are only conversions at the entry and exit points of the graph; everything in between is float32.

Port memory in PipeWire is also shared between applications, and this is pretty much a requirement to do anything related to video. Where JACK has 1 buffer per port allocated in the shared memory, PipeWire can have multiple buffers per (output) port that are all shared between the connected peer ports. The reason for multiple buffers per port is to make it possible to implement *more* zero-copy scenarios (delays, keeping reference frames for video encoding, ..).

> In pipewire a port can be any data-type, including vorbis, and worse,
> MIDI is a bolted-on sub-type on an audio port.

Well, MIDI exists as a format and this is how the format is classified currently in the format description. I guess the concern is how this midi data will be used and what importance it is given in the system as a canonical data format.

> JACK-MIDI has in the past been criticized most because MIDI was a
> dedicated type instead of JACK providing generic event-ports.

Currently there are different ways of controlling properties and behaviour:

- control ports: LADSPA (and other plugins) have control ports, one float to configure a behaviour of the algorithm. This needs a (static) port for each property to control, there is no direct link between control ports and data ports, no timing information, and everything needs to be a float, ...

- midi ports: CC messages to control behaviour. At least you only need one port for controlling multiple things, but you need special framing to add timing info (like JACK), you are limited to 127 values (or 14 bits by using the +32 hack), and there is no easy way to map the number to a function, ...

- event ports: a custom format to describe events to execute. This mostly fixes the MIDI problems but requires a custom format.

I think you are saying that instead of making midi ports a core data type, you would define a custom event stream and ports to control things. At the plugin level this certainly makes sense. Would you not have MIDI ports? Would you convert a midi stream to an event stream? You would still need to map events from the midi device to the plugins. You defined an event stream for LV2, but the use of urid requires a shared type database between clients. How do you see this working?
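Roughly the shape such a generic event could take, as I understand the proposal; this struct is hypothetical, not an existing PipeWire or LV2 type, and only meant to illustrate the idea of one port type carrying MIDI, parameter changes, transport messages, etc. with sample-accurate timestamps:

/* Hypothetical event layout, purely for illustration. */
#include <stdint.h>

struct event {
        uint32_t frame;      /* offset in the current cycle, in samples */
        uint32_t type;       /* e.g. EVENT_MIDI, EVENT_PARAM, ... */
        uint32_t size;       /* size of the payload in bytes */
        uint8_t  payload[];  /* raw MIDI bytes, a (param, value) pair, ... */
};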
> Another conceptual issue that I see with pipewire is that it pushes sync
> downstream (like gstreamer does), instead of sources being pulled
> upstream. This in particular will make it hard to compensate for
> latencies and align outputs.

This used to be the case a long time ago but not anymore. The graph wakes up because there is data needed in the sink and data available in the source. Each node in the graph is activated in turn based on when its dependencies have finished processing. All nodes know exactly when the graph was woken up, what sample it needs and how much data was still in the device.

> Implementation-wise there are plenty of other issues remaining to be
> discussed, e.g. context-switches, resampling, process-graph, ... but those
> are not important at this point in time.

Context switches are a big unresolved one. I currently still go back to the server instead of directly waking up the downstream peers when ready. It's (theoretically) a matter of passing the shared memory with counters and the eventfd to the client.

I know the code is not of the best quality yet; it's pretty much the result of a lot of experimentation. I hope to improve that as things fall into place.

Wim
