Pedro Venda
Lisbon, 4 July, 2005
This document describes the features of the Advanced Linux Sound Architecture project, an audio API and driver collection for the Linux 2.6 kernel, including a case study of installation and configuration complete with a kernel level full duplex mixer.
Copyright (C) 2005 by Pedro Venda
I am an engineering student with a professional interest in computer networks, distributed systems and computer security. I have also been a Linux user since 1998.
This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.5/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
The most recent version of this document can be found on my website http://www.pjvenda.org under the "IT articles" section. It can be browsed online or downloaded in several formats: [dvi] and [pdf]. [not yet available.]
Revision History
Corrections, suggestions and questions are welcome. Feel free to point out errors or misleading information.
I can be reached at pjvenda at pjvenda org or at other e-mails listed on my website http://www.pjvenda.org.
ALSA stands for Advanced Linux Sound Architecture (http://www.alsa-project.org). Stable and development releases, news and device support lists can be found on the website along with very detailed user and developer documentation, including a wiki for direct community contributions.
The project team set out to develop a new high-quality Linux sound subsystem for the 2.6 kernel, to replace the old OSS (Open Sound System) subsystem used in the 2.4 days.
ALSA provides efficient support for all types of audio interfaces. It is a fully modularized SMP and thread-safe library which provides a high-quality user-space API. There is even a backwards compatibility layer with OSS.
Depending on the sound card quality and/or features, some problems may arise. In particular, full duplex is not well supported by the low-level ALSA driver and is heavily sound-card dependent. Concurrent access to the sound card by several applications can also be tricky.
A common approach to these problems is to use a sound server. The sound server is an application-level mixer daemon which acts as a gateway between applications and the ALSA subsystem. Audio streams from applications are received by the sound server, mixed together into a single stream and finally sent into the kernel via ALSA.
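To make "mixed together into a single stream" concrete, here is a minimal sketch of what combining two signed 16-bit samples can look like: add the values and clamp the result to the valid range. The function name mix_samples is hypothetical, and this is only an illustration of the idea; real sound servers may scale or otherwise process the streams rather than simply clamp.

```shell
# Illustrative only: mix two signed 16-bit PCM sample values by adding
# them and clamping the sum to the 16-bit range. mix_samples is a
# hypothetical helper, not part of ALSA or of any sound server.
mix_samples() {
    sum=$(( $1 + $2 ))
    if [ "$sum" -gt 32767 ]; then sum=32767; fi
    if [ "$sum" -lt -32768 ]; then sum=-32768; fi
    echo "$sum"
}
```

Two quiet samples simply add up, while two loud ones hit the clamp, which is heard as distortion when too many loud streams are mixed.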
The sound server approach has advantages over direct ALSA usage:
On the other hand there are some disadvantages, some of them deriving directly from corresponding advantages:
Known sound servers include aRts, ESD, JACK, NAS, etc, and the bigger desktop environments tend to pick one of them for all their apps. JACK, unlike the other mentioned userspace audio servers, was designed and developed from scratch to be a low latency audio server for professional audio work.
Using a sound server is optional, though, as ALSA also provides some means of lower level software mixing (see the dmix and dsnoop enhancements section of this document).
ALSA was written for the Linux operating system. It was meant to be the de facto standard for the kernel's audio layer, and it is. Linux distributions generally include precompiled kernels with many ALSA drivers (to be chosen by the user according to the available hardware), along with the userspace ALSA tools and API headers.
Some Linux distributions do all the work for us and sound just works from the first boot, usually through a combination of a sound server and ALSA. Recent Ubuntu, Debian, Fedora and Mandriva releases behave as described (very well!).
If your audio hardware is mainstream, there should be absolutely no problem in using it. The distribution sorts everything out.
The complete alsa project tools are divided into several packages:
Too easy? Yes, but sometimes one just needs to know how it is actually done. Our test case will be my laptop running Gentoo Linux, with lspci output showing an Intel 82801 (ICH4) AC'97 sound chip.
0000:00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 03)
Necessary kernel configurations are:
After compiling and installing the kernel, and rebooting into it if necessary, load the ALSA modules:
archon ~ # modprobe snd_intel8x0
archon ~ # lsmod | grep snd
snd_pcm_oss 40544 0
snd_mixer_oss 14080 1 snd_pcm_oss
snd_seq_oss 24640 0
snd_seq_midi_event 3712 1 snd_seq_oss
snd_seq 36752 4 snd_seq_oss,snd_seq_midi_event
snd_seq_device 4044 2 snd_seq_oss,snd_seq
snd_intel8x0 20416 4
snd_ac97_codec 59768 1 snd_intel8x0
snd_pcm 62792 4 snd_pcm_oss,snd_intel8x0,snd_ac97_codec
snd_timer 15748 3 snd_seq,snd_pcm
snd 33572 15 snd_pcm_oss,snd[...]
soundcore 4256 1 snd
snd_page_alloc 4356 2 snd_intel8x0,snd_pcm
archon ~ #
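The same check can be scripted. The following is a small sketch, assuming a /proc/modules-style listing in which the first field of each line is the module name; the helper name module_loaded and its optional file argument are hypothetical and exist only to make the check easy to try out.

```shell
# Hypothetical helper: succeed if the named module appears in a
# /proc/modules-style listing (the first field is the module name).
# The optional second argument lets the check run against a saved file.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example: load the driver named in the text only when it is missing.
# if ! module_loaded snd_intel8x0; then modprobe snd_intel8x0; fi
```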
Once the modules are loaded, the soundcard should be recognized by the kernel, and the ALSA API will be able to communicate with the kernel to send sound streams to the sound chip.
A quick check shows that the ALSA driver is up and running (1), and that there is a recognized soundcard (2) with available input/output devices (3, 4):
(1)
archon ~ # cat /proc/asound/version
Advanced Linux Sound Architecture Driver Version 1.0.8 (Thu Jan 13 09:39:32 2005 UTC).
(2)
archon ~ # cat /proc/asound/cards
0 [I82801DBICH4 ]: ICH4 - Intel 82801DB-ICH4
Intel 82801DB-ICH4 with unknown codec at 0xd0000c00, irq 10
(3)
archon ~ # cat /proc/asound/devices
20: [0- 4]: digital audio playback
27: [0- 3]: digital audio capture
26: [0- 2]: digital audio capture
25: [0- 1]: digital audio capture
16: [0- 0]: digital audio playback
24: [0- 0]: digital audio capture
0: [0- 0]: ctl
1: : sequencer
33: : timer
(4)
archon ~ # cat /proc/asound/pcm
00-00: Intel ICH : Intel 82801DB-ICH4 : playback 1 : capture 1
00-01: Intel ICH - MIC ADC : Intel 82801DB-ICH4 - MIC ADC : capture 1
00-02: Intel ICH - MIC2 ADC : Intel 82801DB-ICH4 - MIC2 ADC : capture 1
00-03: Intel ICH - ADC2 : Intel 82801DB-ICH4 - ADC2 : capture 1
00-04: Intel ICH - IEC958 : Intel 82801DB-ICH4 - IEC958 : playback 1
archon ~ #
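For scripts that need to perform checks (2) and (4) automatically, a rough sketch follows. It assumes the /proc/asound file layouts shown above; the function names count_cards and playback_pcms are hypothetical, and the optional file argument is only there so the functions can be tried on saved copies of those files.

```shell
# Sketch of checks (2) and (4) as functions. The file arguments default
# to the /proc/asound paths used in the text.

# Count recognized sound cards: card entries start with an index and "[id]".
count_cards() {
    grep -c -E '^ *[0-9]+ +\[' "${1:-/proc/asound/cards}"
}

# List the PCM ids (card-device) that offer playback.
playback_pcms() {
    awk -F: '/playback/ { gsub(/ /, "", $1); print $1 }' "${1:-/proc/asound/pcm}"
}
```

On the laptop used in this case study, playback_pcms would be expected to report 00-00 and 00-04, the same two devices that appear as playback entries in (4).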
Using alsamixer (included in the alsa-utils package), it is generally necessary to unmute and raise the volume of at least the "Master" and "PCM" channels, after which the first sound tests can begin. The alsa-utils package also includes aplay, a command-line sound player for the ALSA soundcard driver. Start by double-checking the available playback hardware devices (5) and then try to play a wave file (6). [This will not work with compressed or otherwise encoded files such as MP3 or OGG. If there is no wave file around, an OGG file can be played through aplay if it is decoded first (7).]
(5)
archon ~ # aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: I82801DBICH4 [Intel 82801DB-ICH4], device 0: Intel ICH [Intel 82801DB-ICH4]
Subdevices: 0/1
Subdevice #0: subdevice #0
card 0: I82801DBICH4 [Intel 82801DB-ICH4], device 4: Intel ICH - IEC958 [Intel 82801DB-ICH4 - IEC958]
Subdevices: 1/1
Subdevice #0: subdevice #0
(6)
archon ~ # aplay horsemen.wav
Playing WAVE 'horsemen.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo
(7)
archon ~ # oggdec -Q -o - horsemen.ogg | aplay
Playing WAVE 'stdin' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo
At this point, sound should be coming out of your speakers or headphones. If not, check the mixer settings (just run alsamixer, and if it doesn't exist, install alsa-utils), then check for error messages in the aplay output and in the kernel log buffer (use dmesg for that).
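The unmute step can also be done without the interactive mixer, using amixer from the same alsa-utils package. The sketch below is an assumption-laden example: the control names "Master" and "PCM" come from the text and may differ on other cards, the 80% level is an arbitrary choice, and the unmute_controls function with its DRY_RUN and LEVEL variables is hypothetical.

```shell
# Unmute the given simple mixer controls and set them to a level.
# With DRY_RUN=1 the amixer commands are printed instead of executed.
# The 80% default is an arbitrary choice for illustration.
unmute_controls() {
    level="${LEVEL:-80%}"
    for control in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "amixer -q set $control $level unmute"
        else
            amixer -q set "$control" "$level" unmute
        fi
    done
}

# Example, using the control names mentioned in the text:
# unmute_controls Master PCM
```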
There shouldn't be any problem if the correct soundchip alsa driver is being used. Current linux distributions have some init scripts to load the correct kernel modules at boot (maybe even soundchip detection and selective module loading) and to restore previous mixer levels (hopefully saved on last shutdown). It is really that simple.
It seems that some cards support hardware mixing of different sound streams and some others don't (see the sound servers section of this document). The cards that don't (apparently like mine) need some ALSA tweaks in order to correctly mix audio streams without using audio servers.
The advanced and complete desktop environments such as GNOME or KDE use their own userspace audio servers to mix playback streams. The desktop applications use a unified sound API of the chosen audio server, abstracting themselves from the sound system architecture below (which could be ALSA, OSS or another lower level sound system). Known advantages and disadvantages of audio server usage are explained in the sound servers section of this document.
However, there are some ALSA plugins that implement (among other things) a full-duplex mixing layer below the application level. Strictly speaking, the plugins run inside the ALSA library rather than in kernel code, but no separate sound server daemon is needed. The interesting plugins for this purpose are dmix for playback mixing and dsnoop for capture mixing. ALSA-aware applications can use these plugins transparently.
The dmix plugin provides direct mixing of multiple streams. For example, without dmix, it is impossible to play two audio streams at the same time on low-end soundcards, which don't support hardware mixing. The first audio stream plays correctly, while the second cannot open the audio device until the first releases it (8).
(8)
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ ogg123 -q promises.ogg
=== Could not load default driver and no driver specified in config file. Exiting.
pjlv@archon ~ $
To circumvent such problems, alsa can be configured to use the dmix plugin for playback mix. Alsa configuration can be done in two places: $HOME/.asoundrc for per-user configuration and /etc/asound.conf for system wide effect. For these experiments, I'll use my $HOME/.asoundrc.
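Before editing, it can help to see which of the two files already exist. This is a minimal sketch; the function name alsa_config_files is hypothetical, and its two optional arguments only override the paths named above so it can be tried on test files.

```shell
# Print the ALSA configuration files that exist, system-wide file first.
# The optional arguments override the default paths from the text.
alsa_config_files() {
    system_conf="${1:-/etc/asound.conf}"
    user_conf="${2:-$HOME/.asoundrc}"
    for f in "$system_conf" "$user_conf"; do
        if [ -f "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}
```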
alsa configuration file: asoundrc-dmix
# ~/.asoundrc
# soundcard and device to use
pcm.snd_card {
type hw
card 0
device 0
}
# dmix plugin configuration - playback mixer
pcm.pmix {
type dmix
ipc_key 1024 # unique IPC key
slave {
pcm "snd_card"
period_time 0 # reset to the default value
period_size 1024 # in frames
# buffer_size or periods can be commented
# they both represent the same thing in different values
buffer_size 8192 # in frames
# periods 128 # INT
rate 44100
}
bindings {
0 0
1 1
}
}
# redirect default PCM device into dmix (pmix) plugin
pcm.!default {
type plug # auto rate conversion plugin
slave.pcm "pmix"
}
# legacy OSS /dev/dsp support, also redirects into dmix (pmix) plugin
pcm.dsp0 {
type plug
slave.pcm "pmix"
}
# redirect OSS control into used soundcard
# (control devices take the hw control type and a card number)
ctl.dsp0 {
type hw
card 0
}
# redirect OSS mixer into used soundcard
ctl.mixer0 {
type hw
card 0
}
The above configuration file is a good example of how to configure the dmix plugin. First, note the use of the hw plugin to set an alias for the soundcard kernel driver (the pcm.snd_card block). Then the dmix plugin is configured (the pcm.pmix block), named "pmix" and sending its output into the soundcard slave device "snd_card". Next, the automatic conversion plugin plug is used to route the default PCM device into the "pmix" slave PCM device (the pcm.!default block). The rest of the configuration file is legacy OSS support.
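The period and buffer sizes in the slave section translate directly into latency. As a rough sketch, assuming the sizes are counted in frames (one sample per channel) and using the 44100 Hz rate from the configuration, the durations can be worked out with shell arithmetic; the helper name frames_to_usec is hypothetical.

```shell
# Convert a size in frames to a duration in microseconds at a given rate.
# Assumes the slave sizes are counted in frames (one sample per channel).
frames_to_usec() {
    echo $(( $1 * 1000000 / $2 ))
}

frames_to_usec 1024 44100   # one period
frames_to_usec 8192 44100   # whole buffer
```

Under that assumption, one period lasts about 23 ms and the whole buffer of 8 periods about 186 ms, which is the most audio that can be queued ahead of the hardware.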
Trying again to play more than one audio stream, dmix comes into action and multiple audio streams play correctly (9).
(9)
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $
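The test in (9) can be wrapped in a small helper so that it is easy to repeat with a different number of streams. This is only a sketch; the function name play_concurrently is hypothetical, and it simply starts the same command several times in the background.

```shell
# Start N copies of a command in the background and wait for all of them.
# Usage: play_concurrently N command [args...]
play_concurrently() {
    n="$1"
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &
        i=$(( i + 1 ))
    done
    wait
}

# Example with the player and file used in the text:
# play_concurrently 4 ogg123 -q promises.ogg
```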
A few applications seem to ignore the dmix plugin, but this can easily be fixed by telling them that the ALSA device to use is called "pmix", or "default", which redirects into "pmix". mplayer is one of those applications.
mplayer configuration file: mplayer-config
# ~/.mplayer/config
# Write your default config options here!
# old style. up to 1.0_pre4
#ao=alsa1x:default
# Or, in the recent mplayer syntax (mplayer-1.0-pre5-r5)
ao=alsa:device=default
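When an application has to be given a device name by hand, it helps to know which PCM names the configuration file actually defines. The following sketch does a plain text scan for top-level "pcm.NAME {" lines; it is not a real ALSA configuration parser, and the function name defined_pcms is hypothetical.

```shell
# Print the PCM names defined at the top level of an asoundrc-style file,
# i.e. lines of the form "pcm.NAME {". A leading "!" (as in pcm.!default)
# is dropped. This is a text scan, not an ALSA configuration parser.
defined_pcms() {
    sed -n -E 's/^pcm\.!?([A-Za-z0-9_]+).*/\1/p' "${1:-$HOME/.asoundrc}"
}
```

Run against the dmix configuration above, this would list snd_card, pmix, default and dsp0. The names as ALSA itself sees them can be listed with aplay -L.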
The dsnoop plugin is the capture-side counterpart of dmix. Instead of mixing several streams into one, it lets more than one client application read the same shared capture buffer at the same time, so that a single captured stream can be shared among them.
alsa configuration file: asoundrc-dsnoop
# dsnoop plugin configuration - capture mixer
pcm.cmix {
type dsnoop
ipc_key 2048 # unique IPC key
slave.pcm "snd_card"
bindings {
0 0
1 1
}
}
The dsnoop configuration is almost the same as the dmix one, so no further explanation is necessary.
The real goal would be to have full duplex playback and capture mixing, and that can be done using the asym plugin. It allows the definition of different slave PCMs for capture and playback. Obviously, the playback PCM will be dmixed and the capture PCM will be dsnooped.
alsa configuration file: asoundrc-asym
# ~/.asoundrc
# soundcard and device to use
pcm.snd_card {
type hw
card 0
device 0
}
# dmix plugin configuration - playback mixer
pcm.pmix {
type dmix
ipc_key 1024 # unique IPC key
slave {
pcm "snd_card"
period_time 0 # reset to the default value
period_size 1024 # in frames
# buffer_size or periods can be commented
# they both represent the same thing in different values
buffer_size 8192 # in frames
# periods 128 # INT
rate 44100
}
bindings {
0 0
1 1
}
}
# dsnoop plugin configuration - capture mixer
pcm.cmix {
type dsnoop
ipc_key 2048 # unique IPC key
slave.pcm "snd_card"
bindings {
0 0
1 1
}
}
# asymmetric assignment of playback and capture plugins
pcm.duplex {
type asym
playback.pcm "pmix"
capture.pcm "cmix"
}
# redirect default PCM device into duplex slave PCM
pcm.!default {
type plug # auto rate conversion plugin
slave.pcm "duplex"
}
# legacy OSS /dev/dsp support, also redirects into the duplex slave PCM
pcm.dsp0 {
type plug
slave.pcm "duplex"
}
# redirect OSS control into used soundcard
# (control devices take the hw control type and a card number)
ctl.dsp0 {
type hw
card 0
}
# redirect OSS mixer into used soundcard
ctl.mixer0 {
type hw
card 0
}
Creating a "duplex" slave PCM device using the asym plugin, it is possible to define the playback slave PCM as "pmix" and the capture slave PCM as "cmix". The default device is redirected into the "duplex" slave.
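A simple way to exercise the full duplex setup is to record and play at the same time on the default device. The sketch below assumes the arecord and aplay tools from alsa-utils; the function name duplex_test and the RECORDER and PLAYER override variables are hypothetical and only make the function easy to try without sound hardware.

```shell
# Record from and play to the default device at the same time.
# RECORDER and PLAYER are hypothetical override variables; by default
# the alsa-utils tools arecord and aplay are used.
# Usage: duplex_test SECONDS CAPTURE_FILE PLAYBACK_FILE
duplex_test() {
    ${RECORDER:-arecord} -q -f cd -d "$1" "$2" &
    rec_pid=$!
    ${PLAYER:-aplay} -q "$3"
    wait "$rec_pid"
}

# Example: capture 5 seconds while horsemen.wav is playing.
# duplex_test 5 capture.wav horsemen.wav
```

If both commands finish without a "device busy" style error, playback and capture are being routed through separate slave PCMs as intended.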
This last configuration file is a fairly complete approach for a full duplex kernel level ALSA mixer. It is by no means perfect, and reflects the needs of my hardware, so it probably needs to be adapted for other soundcards.
Most open source applications capable of using ALSA work correctly with the kernel level full duplex mixer. To name just a few of the well-behaved applications:
Some other applications, open-source or proprietary, are either still developed for OSS or can be used with OSS. They can still use the mixer correctly, but with some tweaks:
Being a KDE user, my most frequently used applications all work well with the kernel level mixer. KDE itself needs to be compiled with aRts support, only to build some audio enabled components (like knotify). Once running, aRts can be completely ignored and multiple window manager audio streams play perfectly via a small shell script.
audio player shellscript: alsadecode.sh
#!/bin/bash
#
# alsadecode.sh v1.0 2004/11
#
# Copyright (C) 2004 by Pedro Venda pjvenda at arrakis dhis org
#
# Distributed under the terms of the GNU Public License Agreement (GPLv2)
# http://www.fsf.org/licensing/licenses/gpl.html or
# http://www.fsf.org/licensing/licenses/gpl.txt
#
# script for decoding and playing audio files with different programs
# chosen according to the file's extension.
#
# requires ogg123, mpg321 and aplay.
#
# pick the player from the file extension (text after the last dot)
case "${1##*.}" in
ogg)
echo "playing ogg ${1}"
/usr/bin/ogg123 "${1}"
;;
mp3|mpg|mpeg)
echo "playing mp3 ${1}"
/usr/bin/mpg321 "${1}"
;;
wav|au)
echo "playing wav ${1}"
/usr/bin/aplay "${1}"
;;
*)
echo "unsupported file type: ${1}" >&2
exit 1
;;
esac
alsadecode.sh plays ogg, mp3 or wav audio files, handing them to ogg123, mpg321 or aplay respectively. KDE is configured to play all audio streams via alsadecode.sh. Simple, isn't it? Yes, it could be cleaner.