linux filesystem performance help
Aug. 18th, 2013 03:00 pm
NEW READERS: IT’S NOT ABOUT THE FILESYSTEM ANYMORE BUT IT’S STILL BROKEN: SEE UPDATES AT BOTTOM OF POST. Addressing filesystem performance only partly fixed it. Thanks!
Since always, I’ve had latency issues on my digital audio workstation, which runs Ubuntu Linux (currently 12.04 LTS) on a Gigabyte motherboard with 4G of RAM and a suitably symmetric four-core processor. CPU load runs 20%-ish most of the time (and all the time for these purposes), and I never have to swap.
In this configuration, I should be able to get down to around 7ms of buffer time and not get XRUNs (audible dropouts caused by buffer under- or overrun) in my audio chain. 14ms if I want to be safe.
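For reference, the buffer math behind those targets is simple. A quick sketch — the 48 kHz sample rate, 2 periods per buffer, and the specific frame counts are illustrative assumptions, not numbers from my setup:

```shell
# Buffer latency (ms) = frames_per_period * periods / sample_rate * 1000
# All values below are assumed for illustration.
rate=48000
periods=2

frames=128
latency_ms=$(( frames * periods * 1000 / rate ))
echo "${latency_ms} ms"   # prints "5 ms" -- near the 7ms target

frames=512
latency_ms=$(( frames * periods * 1000 / rate ))
echo "${latency_ms} ms"   # prints "21 ms" -- comfortably "safe" territory
```

Smaller periods mean lower latency but tighter real-time deadlines, which is exactly why stalls in the hundreds of milliseconds are fatal here.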
In reality, I can’t make it reliably at 74ms, and that has hitches I just have to live with. To get no XRUNs or close to it I have to go up to like 260ms, which is insane. I even tried getting a dedicated root-device USB card – I’ve long assumed it was some sort of USB issue. But no.
With some new tools (latencytop in particular) I have found it. It’s the file system. Specifically, it’s ext3’s journal — its internal transaction logging. To wit:
EXT3: committing transaction 302.9ms log_wait_commit 120.3ms
If I turn off access-time (atime) updating, which I tried last night, I get rid of 90% of the XRUNs, because the file system does about 90% less transaction logging to update all those inodes with new times.
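For anyone wanting to replicate this: access-time updating is controlled by mount options. A sketch — the device and mount point here are placeholder assumptions:

```
# /etc/fstab -- illustrative entry; device and mount point are assumptions
/dev/sda2  /home/audio  ext3  defaults,noatime  0  2
```

`relatime` (Ubuntu's default since 8.04) is the middle ground; `noatime` goes further and skips access-time writes entirely. A live remount (`mount -o remount,noatime /home/audio`) applies it without a reboot.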
But any attempt to write – well, you can guess. Even the pure realtime kernel doesn’t help; I compiled and installed a custom build of one today, but apparently these journal commits still can’t be preempted: I get exactly the same behaviour. I may be able to live with that to some degree, because it’s a start-and-stop-of-writes thing, and as long as it doesn’t trigger during writes, I can get by.
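For anyone following along at home, here’s a quick sketch of verifying the RT build actually took — the grep pattern is an assumption about how PREEMPT_RT kernels report themselves in their version banner:

```shell
# PREEMPT_RT kernels normally advertise "PREEMPT RT" in `uname -v`
# (pattern is an assumption; non-RT kernels may show plain "PREEMPT")
if uname -v | grep -q 'PREEMPT RT'; then
  msg="RT kernel active"
else
  msg="not an RT kernel"
fi
echo "$msg"
```

The point being: an RT kernel only helps with scheduling preemption; it can’t preempt its way past a journal commit the writer has to wait on.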
But it’s bullshit, and it pisses me off.
I’m currently in the process of upgrading ext3 to ext4. I’d like to think that will solve it, given ext4’s dramatically better performance, but I have no such assurances at this point. I genuinely thought the realtime kernel might do it.
DO YOU HAVE ANYTHING YOU CAN TELL ME, DEAR INTERNETS? Particularly about filesystem tuning. Because this shouldn’t be happening; it just shouldn’t. Honestly, three tenths of a second to commit a transaction? I’ve been places where that kind of number was reasonable; it was called 1983, and I don’t live there anymore.
Anybody?
THINGS IT IS NOT:
- Shared interrupt
- This particular hard drive (the previous drive did it too; this one is faster)
- ondemand CPU frequency scaling (I’m running the performance governor)
- this particular USB port or a USB hub or extension cord or any of the sort
- bluetooth or other random services (including search)
- Corrupt HD
- Old technology (it’s SATA; the drive is like six months old)
- lack of RT kernel. I built this RT kernel today.
- Going to be solved by installing a different operating system. Please don’t.
ETA: I got the ext3 filesystem upgraded to ext4, which made all those numbers above dramatically smaller, but brought no further XRUN improvement. So I then disabled journaling entirely, a configuration which outperforms raw ext2 in benchmarks I’ve seen, and the machine is screamingly fast despite the RT kernel…
…and it hasn’t made one goddamn whit of difference in the remaining XRUNs. WTF, computer? WTF.
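For completeness, the journal removal described above is done offline with tune2fs. A sketch — the device name and mount point are placeholders, and note this deliberately trades away crash safety:

```
# Run against the *unmounted* filesystem; /dev/sdb1 is a placeholder
umount /dev/sdb1
tune2fs -O ^has_journal /dev/sdb1
e2fsck -f /dev/sdb1
mount /dev/sdb1 /mnt/audio
```

The `^` prefix clears the feature flag, and a forced fsck afterwards is the usual precaution before remounting.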
ETA2 (23:51 18 August): Okay, while screwing with the filesystem did solve many XRUN problems, there are still other XRUNs which are apparently unrelated, most notably, the master-record-enable XRUN. Even moving the project to a tmpfs RAM disk and running from there produced identical results, so I’m concluding this is an entirely separate problem.
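The RAM-disk test above is the standard tmpfs trick; a sketch, with the size and paths as assumptions:

```
# tmpfs lives entirely in RAM, taking disk and filesystem out of the equation
mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=1G tmpfs /mnt/ramdisk
cp -a ~/ardour/myproject /mnt/ramdisk/
```

If a problem survives a move to tmpfs — as this one did — storage genuinely is ruled out.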
I’ve already done pretty much everything on the LinuxMusicians configuration consultation page, and my setup actually passes their evaluation script. I should be golden, but I’m not. Help?
ETA3 (0:26 19 August): Every two minutes, right now, with the system mostly idle, I’m getting a burst of XRUNs. On an idle machine. But it is exactly every two minutes. And while Ardour stays at the top of top even when idle (at 10% of CPU and 13.5% of RAM), Xorg pops up just underneath it, and its CPU use spikes.
What does Xorg do every two minutes? Anybody? Seriously I have no idea.
ETA4 (13:19 19 August): ARDOUR 3 TRIGGERS SESSION SAVE EVERY TWO MINUTES BY DEFAULT. Disabling that STOPS the two-minute failures entirely. We’re back to file system adventures. Holy hell. THIS HAPPENS EVEN ON RAMDISK so it’s not filesystem or media specific. What the hell is going on here?
Mirrored from Crime and the Blog of Evil. Come check out our music at:
I just noticed something
Date: 2013-08-19 01:10 am (UTC)
You specifically mention .
Is this an external drive?
Re: I just noticed something
Date: 2013-08-19 02:18 am (UTC)
Also, hi!
Re: I just noticed something
Date: 2013-08-19 02:55 am (UTC)
And Hi!
Re: I just noticed something
Date: 2013-08-19 03:48 am (UTC)
So, yeah.
Re: I just noticed something
Date: 2013-08-19 02:51 am (UTC)
...it's made no goddamn difference whatsoever.
god DAMMIT.
Re: I just noticed something
Date: 2013-08-19 02:52 am (UTC)

Re: Xorg (and that's why new thread)
Date: 2013-08-19 06:20 pm (UTC)

Re: Xorg (and that's why new thread)
Date: 2013-08-19 07:40 pm (UTC)
Section "Device"
    Identifier "Configured Video Device"
    BusID      "PCI:00:02:0"
    Driver     "intel"
EndSection

Section "Device"
    Identifier "Configured Video Device[2]"
    BusID      "PCI:03:00:0"
    Driver     "ati"
EndSection

Section "Monitor"
    Identifier "Configured Monitor"
    Option     "DPMS"
EndSection

Section "Monitor"
    Identifier "Configured Monitor[2]"
    Option     "PreferredMode" "1280x1024_60.00"
    Option     "DPMS"
EndSection

Section "Screen"
    Identifier "Default Screen"
    Monitor    "Configured Monitor"
    Device     "Configured Video Device"
EndSection

Section "Screen"
    Identifier "Second Screen"
    Monitor    "Configured Monitor[2]"
    Device     "Configured Video Device[2]"
    Option     "AddARGBGLXVisuals" "True"
    SubSection "Display"
        Depth  24
        Modes  "1280x1024"
    EndSubSection
EndSection

Section "ServerLayout"
    Identifier "Layout0"
    Screen 0 "Default Screen"
    Screen 1 "Second Screen" RightOf "Default Screen"
    Option "Xinerama" "on"
    Option "RANDR" "on"
    Option "BlankTime" "20"
    Option "StandbyTime" "20"
    Option "SuspendTime" "20"
    Option "OffTime" "40"
EndSection

Section "Extensions"
    Option "Composite" "Enable"
EndSection
Re: Xorg (and that's why new thread)
Date: 2013-08-19 09:15 pm (UTC)

Re: Xorg (and that's why new thread)
Date: 2013-08-20 03:43 am (UTC)

Re: Xorg (and that's why new thread)
Date: 2013-08-20 05:17 am (UTC)
In the source code, most of save happens in
libs/ardour/session_state.cc
Save works fine when plugins aren't activated, but very badly when they are.
Save state calls a lot of things, including get_state, which gathers latency data from plugins via get_state, which calls add_state, which eventually calls latency_compute_run — and that's the same (!) in both lv2 and ladspa plugins. It calculates the latency by actually running the plugin. Not a copy: the actual plugin that's in use.
Most notably, in add_state, found here:
libs/ardour/lv2_plugin.cc
libs/ardour/ladspa_plugin.cc
latency_compute_run activates the plugin even if it's already activated (!) then deactivates it on exit (which I guess is stacked somehow because they don't deactivate in Ardour itself) and runs a second thread on the plugin (presumably because how else I guess?).
hypothesis: this is making the CPU stall, from branch misprediction or hyperthreading contention. The penalty for this in Intel land is large. The two versions of the active plugin may be continually invalidating each other (!) for the duration of the latency test. It may even be thrashing the on-chip cache. That would explain why it stops being an issue when the plugin is not active.
Thoughts?