ALSA sound guide

Pedro Venda

Lisbon, 4 July, 2005

This document describes the features of the Advanced Linux Sound Architecture (ALSA) project, an audio API and driver collection for the Linux 2.6 kernel, and includes a case study of installation and configuration complete with a kernel level full duplex mixer.

1. Copyright, acknowledgements, updates and feedback

2. What is ALSA?

ALSA stands for Advanced Linux Sound Architecture (http://www.alsa-project.org). Stable and development releases, news and device support lists can be found on the website, along with detailed user and developer documentation and a wiki for direct community contributions.

The project team set out to develop a new high-quality Linux sound subsystem for the 2.6 kernel, replacing the old OSS (Open Sound System) subsystem used in the 2.4 days.

ALSA provides efficient support for all types of audio interfaces. It has fully modularized drivers, an SMP and thread-safe design, and a user-space library which provides a high-quality API. There is also a backwards compatibility layer for OSS.

3. Features

4. Sound Servers

Depending on the sound card's quality and features, some problems may arise. In particular, full duplex support in the low level ALSA driver is heavily sound-card dependent and often poor. Concurrent access to the sound card by several applications can also be tricky.

One commonly used approach to solving these problems is a sound server. The sound server is an application level mixer daemon which acts as a gateway between applications and the ALSA subsystem. Audio streams from applications are received by the sound server, mixed together into a single stream and finally sent into the kernel via ALSA.

The sound server approach has advantages over direct ALSA usage:

On the other hand there are some disadvantages, some of them deriving directly from corresponding advantages:

Known sound servers include aRts, ESD, JACK, NAS, etc, and the bigger desktop environments tend to pick one of them for all their apps. JACK, unlike the other mentioned userspace audio servers, was designed and developed from scratch to be a low latency audio server for professional audio work.

A sound server is optional, though, as ALSA also provides some means of lower level software mixing (see the section "dmix and dsnoop enhancements" of this document).

5. OS Support

ALSA was written for the Linux operating system. It was meant to become the de facto standard for the kernel's audio layer, and it has. Linux distributions generally include precompiled kernels with many ALSA drivers (chosen by the user according to the available hardware), along with the userspace ALSA tools and API headers.

Some Linux distributions do all the work for us and sound just works from the first boot, usually through a sound server plus ALSA combination. Recent Ubuntu, Debian, Fedora and Mandriva releases behave this way (very well!).

If your audio hardware is mainstream, there should be absolutely no problem in using it. The distribution sorts everything out.

The complete alsa project tools are divided into several packages:

6. Installation testcase

Too easy? Yes, but sometimes one just needs to know how it is actually done. Our testcase will be my laptop running Gentoo Linux, whose lspci output shows an Intel 82801 (ICH4) AC'97 sound chip:

0000:00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 03)

Necessary kernel configurations are:

After compiling and installing the new kernel, and booting into it if necessary, load the ALSA modules:

archon ~ # modprobe snd_intel8x0
archon ~ # lsmod | grep snd
snd_pcm_oss            40544  0 
snd_mixer_oss          14080  1 snd_pcm_oss
snd_seq_oss            24640  0 
snd_seq_midi_event      3712  1 snd_seq_oss
snd_seq                36752  4 snd_seq_oss,snd_seq_midi_event
snd_seq_device          4044  2 snd_seq_oss,snd_seq
snd_intel8x0           20416  4 
snd_ac97_codec         59768  1 snd_intel8x0
snd_pcm                62792  4 snd_pcm_oss,snd_intel8x0,snd_ac97_codec
snd_timer              15748  3 snd_seq,snd_pcm
snd                    33572  15 snd_pcm_oss,snd[...]
soundcore               4256  1 snd
snd_page_alloc          4356  2 snd_intel8x0,snd_pcm
archon ~ # 

After this, the soundcard should be recognized by the kernel, and the ALSA API will be able to communicate with the kernel to send sound streams to the sound chip.

A quick check shows that the ALSA driver is up and running (1), and that there is a recognized soundcard (2) with available input/output devices (3,4):

(1)

archon ~ # cat /proc/asound/version 
Advanced Linux Sound Architecture Driver Version 1.0.8 (Thu Jan 13 09:39:32 2005 UTC).

(2)

archon ~ # cat /proc/asound/cards 
0 [I82801DBICH4   ]: ICH4 - Intel 82801DB-ICH4
                     Intel 82801DB-ICH4 with unknown codec at 0xd0000c00, irq 10

(3)

archon ~ # cat /proc/asound/devices 
 20: [0- 4]: digital audio playback
 27: [0- 3]: digital audio capture
 26: [0- 2]: digital audio capture
 25: [0- 1]: digital audio capture
 16: [0- 0]: digital audio playback
 24: [0- 0]: digital audio capture
  0: [0- 0]: ctl
  1:       : sequencer
 33:       : timer
 
(4)

archon ~ # cat /proc/asound/pcm
00-00: Intel ICH : Intel 82801DB-ICH4 : playback 1 : capture 1
00-01: Intel ICH - MIC ADC : Intel 82801DB-ICH4 - MIC ADC : capture 1
00-02: Intel ICH - MIC2 ADC : Intel 82801DB-ICH4 - MIC2 ADC : capture 1
00-03: Intel ICH - ADC2 : Intel 82801DB-ICH4 - ADC2 : capture 1
00-04: Intel ICH - IEC958 : Intel 82801DB-ICH4 - IEC958 : playback 1
archon ~ # 
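
The four checks above can be bundled into one small helper. This is only a sketch: the function name alsa_status and its optional directory argument are made up for illustration, while the file names are the ones shown in (1) to (4).

```shell
#!/bin/sh
# alsa_status: print the /proc/asound files inspected in checks (1)-(4).
# The function name and the optional root argument are illustrative only.
alsa_status() {
	root="${1:-/proc/asound}"
	for f in version cards devices pcm; do
		if [ -r "${root}/${f}" ]; then
			echo "== ${f}"
			cat "${root}/${f}"
		else
			echo "== ${f}: not available (is the snd module loaded?)"
		fi
	done
}

alsa_status
```

If every entry is reported as not available, the driver modules are most likely not loaded yet.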

Using alsamixer (included in the alsa-utils package), it is generally necessary to unmute and raise the volume of at least the "Master" and "PCM" channels. After that, the first sound tests can begin. Also included in alsa-utils is aplay, a command-line sound player for the ALSA soundcard driver (its counterpart arecord is the recorder). Start by double-checking the available playback hardware devices (5) and then try to play a wave file (6). This will not work with compressed or otherwise encoded files such as MP3 or OGG. If there is no wave file around, an OGG file can be played through aplay if it is decoded first (7).

(5)

archon ~ # aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: I82801DBICH4 [Intel 82801DB-ICH4], device 0: Intel ICH [Intel 82801DB-ICH4]
  Subdevices: 0/1
  Subdevice #0: subdevice #0
card 0: I82801DBICH4 [Intel 82801DB-ICH4], device 4: Intel ICH - IEC958 [Intel 82801DB-ICH4 - IEC958]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
  
(6)

archon ~ # aplay horsemen.wav
Playing WAVE 'horsemen.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo

(7)

archon ~ # oggdec -Q -o - horsemen.ogg | aplay
Playing WAVE 'stdin' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo

At this point, sound should be coming out of your speakers or headphones. If not, check the mixer settings (just run alsamixer; if it doesn't exist, install alsa-utils), then check for error messages in the aplay output and in the kernel log buffer (use dmesg for that).
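
These troubleshooting steps can also be scripted. The following is a rough sketch, not a definitive procedure: the function names are invented for this example, the control names "Master" and "PCM" are the ones from my card and may differ on yours, and the grep pattern for the kernel log is just a guess at useful keywords.

```shell
#!/bin/sh
# Troubleshooting sketch. Function names are illustrative; the control
# names "Master" and "PCM" may differ on other sound cards.

# have_tool: succeed if the given command can be found.
have_tool() {
	command -v "$1" >/dev/null 2>&1
}

# unmute_outputs: unmute Master and PCM and set them to the given level.
unmute_outputs() {
	level="${1:-80%}"
	for ctl in Master PCM; do
		amixer -q set "${ctl}" "${level}" unmute || echo "could not set ${ctl}" >&2
	done
}

# recent_sound_messages: show kernel log lines that mention sound drivers.
recent_sound_messages() {
	dmesg | grep -i -E 'alsa|snd|ac97' | tail -n 20
}

# Report which of the alsa-utils tools used in this guide are missing.
for tool in alsamixer amixer aplay; do
	have_tool "${tool}" || echo "${tool} not found: install alsa-utils" >&2
done
```

Once the tools are present, calling unmute_outputs 80% does non-interactively what was done by hand in alsamixer, and recent_sound_messages narrows the dmesg output down to the lines that are likely to matter.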

There shouldn't be any problem if the correct ALSA driver for the sound chip is being used. Current Linux distributions have init scripts to load the correct kernel modules at boot (maybe even with sound chip detection and selective module loading) and to restore the previous mixer levels (hopefully saved at the last shutdown). It is really that simple.
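
To give an idea of what such init scripts do for the mixer levels, here is a minimal sketch built on alsactl from alsa-utils. The function names and the STATE_FILE location are assumptions made for this example; real distributions use their own scripts and their own state file path.

```shell
#!/bin/sh
# Mixer state helpers. STATE_FILE is a made-up location for this sketch;
# distributions pick their own path.
STATE_FILE="${STATE_FILE:-${HOME}/.alsa-mixer-state}"

# save_mixer_levels: write the current mixer settings to STATE_FILE.
save_mixer_levels() {
	alsactl -f "${STATE_FILE}" store
}

# restore_mixer_levels: load previously saved settings, if there are any.
restore_mixer_levels() {
	if [ -f "${STATE_FILE}" ]; then
		alsactl -f "${STATE_FILE}" restore
	else
		echo "no saved mixer state at ${STATE_FILE}" >&2
		return 1
	fi
}
```

A shutdown script would call save_mixer_levels and a boot script restore_mixer_levels, after the sound modules have been loaded.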

7. dmix and dsnoop enhancements

Some cards support hardware mixing of different sound streams and others don't (see the "Sound Servers" section of this document). The cards that don't (apparently including mine) need some ALSA tweaks in order to mix audio streams correctly without using audio servers.

Complete desktop environments such as GNOME or KDE use their own userspace audio servers to mix playback streams. The desktop applications use a unified sound API provided by the chosen audio server, abstracting themselves from the sound system architecture below (which could be ALSA, OSS or another lower level sound system). Known advantages and disadvantages of audio server usage are explained in the "Sound Servers" section of this document.

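However, there are some ALSA plugins that implement (among other things) a full-duplex mixing layer without a sound server daemon. This guide calls it a kernel level mixer, although strictly speaking the plugins run inside the ALSA library (alsa-lib) in each application's process and share a buffer between them. The interesting plugins for this purpose are dmix for playback mixing and dsnoop for shared capture. ALSA-aware applications can use these plugins transparently.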

dmix plugin

The dmix plugin provides direct mixing of multiple streams. Without dmix, it is impossible to play 2 audio streams at the same time on low end soundcards, which don't support hardware mixing. The first audio stream plays correctly, while the second cannot open the audio device until the first releases it (8).

(8)

pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ ogg123 -q promises.ogg 
=== Could not load default driver and no driver specified in config file. Exiting.
pjlv@archon ~ $ 

To circumvent such problems, alsa can be configured to use the dmix plugin for playback mix. Alsa configuration can be done in two places: $HOME/.asoundrc for per-user configuration and /etc/asound.conf for system wide effect. For these experiments, I'll use my $HOME/.asoundrc.

alsa configuration file: asoundrc-dmix

# ~/.asoundrc

# soundcard and device to use
pcm.snd_card {
	type hw
	card 0
	device 0
}

# dmix plugin configuration - playback mixer
pcm.pmix {
	type dmix
	ipc_key 1024 # unique IPC key
	
	slave {
		pcm "snd_card"
		period_time 0 # reset to the default value
		period_size 1024 # in bytes
		# buffer_size or periods can be commented
		# they both represent the same thing in different values
		buffer_size 8192 # in bytes
		# periods 128 # INT
		rate 44100
	}
	bindings {
		0 0
		1 1
	}
}

# redirect default PCM device into dmix (pmix) plugin
pcm.!default {
	type plug # auto rate conversion plugin
	slave.pcm "pmix"
}

# legacy OSS /dev/dsp support, also redirects into dmix (pmix) plugin
pcm.dsp0 {
	type plug
	slave.pcm "pmix"
}
# redirect OSS control into used soundcard
# (ctl devices have no plug type; hw addresses the card directly)
ctl.dsp0 {
	type hw
	card 0
}
# redirect OSS mixer into used soundcard
ctl.mixer0 {
	type hw
	card 0
}

The above configuration file is a good example of how to configure the dmix plugin. First, note the use of the hw plugin to set an alias, "snd_card", for the soundcard kernel driver (the pcm.snd_card block). Then the dmix plugin is configured under the name "pmix" (the pcm.pmix block), sending its output into the slave device "snd_card". Next, the automatic conversion plugin plug routes the default PCM device into the "pmix" slave PCM device (the pcm.!default block). The rest of the configuration file is legacy OSS support.

Trying again to play more than one audio stream, dmix comes into action and multiple audio streams play correctly (9).

(9)

pjlv@archon ~ $ ogg123 -q promises.ogg & 
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ ogg123 -q promises.ogg &
pjlv@archon ~ $ 
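
The experiment in (9) is easy to script. The helper below is just a sketch: play_many is a name I made up, and the example call in the comment assumes ogg123 and promises.ogg as used above.

```shell
#!/bin/sh
# play_many: start COUNT copies of a command in the background and wait
# until all of them have finished. The name is illustrative only.
play_many() {
	count="$1"
	shift
	i=0
	while [ "${i}" -lt "${count}" ]; do
		"$@" &
		i=$((i + 1))
	done
	wait
}

# Example (assumes ogg123 is installed and promises.ogg exists):
#   play_many 4 ogg123 -q promises.ogg
```

With dmix configured, all copies should be audible at once; without it, all but the first are expected to fail as in (8).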

A few applications seem to ignore the dmix plugin, but they can easily be fixed by telling them that the ALSA device to use is called "pmix", or "default", which redirects into "pmix". mplayer is one of those applications.

mplayer configuration file: mplayer-config

# ~/.mplayer/config
# Write your default config options here!

# old style. up to 1.0_pre4
#ao=alsa1x:default 

# Or, in the recent mplayer syntax (mplayer-1.0-pre5-r5)
ao=alsa:device=default

dsnoop plugin

The dsnoop plugin is the capture counterpart of dmix. Where dmix mixes several playback streams into one, dsnoop lets more than one client application read from the same shared capture buffer, so a single captured stream is shared among several readers rather than mixed.

alsa configuration file: asoundrc-dsnoop

# dsnoop plugin configuration - capture mixer
pcm.cmix {
	type dsnoop
	ipc_key 2048 # unique IPC key
	slave.pcm "snd_card"

	bindings {
		0 0
		1 1
	}
}

The dsnoop configuration is almost the same as the dmix one, so no further explanation is necessary.
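
One way to see dsnoop at work is to start two recorders at once on the "cmix" device. The sketch below assumes the pcm.cmix definition above is part of your ALSA configuration; record_shared and the output file names are made up for this example, and the plug: prefix is there so that arecord's CD format is converted if the card runs at another rate.

```shell
#!/bin/sh
# record_shared: capture the same input into two files at the same time
# through the "cmix" dsnoop PCM. Name and file names are illustrative.
record_shared() {
	seconds="${1:-5}"
	arecord -q -D plug:cmix -f cd -d "${seconds}" first.wav &
	arecord -q -D plug:cmix -f cd -d "${seconds}" second.wav &
	wait
}

# Example: record five seconds into first.wav and second.wav
#   record_shared 5
```

Without dsnoop, the second arecord would be expected to fail because the capture device is already in use.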

asym plugin

The real goal would be to have full duplex playback and capture mixing, and that can be done using the asym plugin. It allows the definition of different slave PCMs for capture and playback. Obviously, the playback PCM will be dmixed and the capture PCM will be dsnooped.

alsa configuration file: asoundrc-asym

# ~/.asoundrc

# soundcard and device to use
pcm.snd_card {
	type hw
	card 0
	device 0
}

# dmix plugin configuration - playback mixer
pcm.pmix {
	type dmix
	ipc_key 1024 # unique IPC key
	
	slave {
		pcm "snd_card"
		period_time 0 # reset to the default value
		period_size 1024 # in bytes
		# buffer_size or periods can be commented
		# they both represent the same thing in different values
		buffer_size 8192 # in bytes
		# periods 128 # INT
		rate 44100
	}
	bindings {
		0 0
		1 1
	}
}

# dsnoop plugin configuration - capture mixer
pcm.cmix {
	type dsnoop
	ipc_key 2048 # unique IPC key
	slave.pcm "snd_card"

	bindings {
		0 0
		1 1
	}
}

# asymmetric assignment of playback and capture plugins
pcm.duplex {
	type asym
	playback.pcm "pmix"
	capture.pcm "cmix"
}

# redirect default PCM device into duplex slave PCM
pcm.!default {
	type plug # auto rate conversion plugin
	slave.pcm "duplex"
}

# legacy OSS /dev/dsp support, also redirects into the duplex PCM
pcm.dsp0 {
	type plug
	slave.pcm "duplex"
}
# redirect OSS control into used soundcard
# (ctl devices have no plug type; hw addresses the card directly)
ctl.dsp0 {
	type hw
	card 0
}
# redirect OSS mixer into used soundcard
ctl.mixer0 {
	type hw
	card 0
}

By creating a "duplex" PCM device with the asym plugin, it is possible to define the playback slave PCM as "pmix" and the capture slave PCM as "cmix". The default device is then redirected into "duplex".

This last configuration file is a fairly complete approach to a full duplex kernel level ALSA mixer. It is by no means perfect and reflects the needs of my hardware, so it probably needs to be adapted for other soundcards.
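
A quick way to exercise both directions at once is to record from the default device and play the recording straight back. This is a sketch under the assumption that the asym configuration above is active, so that capture goes through dsnoop and playback through dmix; duplex_loopback is an invented name. Use headphones, otherwise the microphone will pick up the speakers.

```shell
#!/bin/sh
# duplex_loopback: record from the default PCM and play the recording
# straight back, so capture and playback run at the same time.
# The function name is illustrative only.
duplex_loopback() {
	seconds="${1:-5}"
	arecord -q -f cd -d "${seconds}" | aplay -q
}

# Example: loop the input back to the output for ten seconds
#   duplex_loopback 10
```

Because both ends go through the mixer, other applications should still be able to play or record while the loopback is running.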

8. Application notes

Most open source applications capable of using ALSA work correctly with the kernel level full duplex mixer. Just to name a few of the well behaved applications:

Some other applications, open source or proprietary, are either still developed for OSS or can be used with OSS. They can still use the mixer correctly, but with some tweaks:

Being a KDE user, I find that my most frequently used applications all work well with the kernel level mixer. KDE itself needs to be compiled with aRts support, but only in order to build some audio enabled components (like knotify). Once running, aRts can be completely ignored, and multiple window manager audio streams play perfectly via a small shell script.

audio player shellscript: alsadecode.sh

#!/bin/bash
#
# alsadecode.sh v1.0 2004/11
#
# Copyright (C) 2004 by Pedro Venda pjvenda at arrakis dhis org
#
# Distributed under the terms of the GNU Public License Agreement (GPLv2)
# http://www.fsf.org/licensing/licenses/gpl.html or
# http://www.fsf.org/licensing/licenses/gpl.txt
#
# script for decoding and playing audio files with different programs
# chosen according to the file's extension.
#
# requires ogg123, mpg321 and aplay.
#

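# Pick the player from the file extension (the text after the last dot),
# lowercased so that .MP3 and .mp3 are treated alike.
file="${1:?usage: alsadecode.sh FILE}"
ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')

case "${ext}" in
	ogg)
		echo "playing ogg ${file}"
		/usr/bin/ogg123 "${file}"
	;;
	mp3|mpg|mpeg)
		echo "playing mp3 ${file}"
		/usr/bin/mpg321 "${file}"
	;;
	wav|au)
		echo "playing wav ${file}"
		/usr/bin/aplay "${file}"
	;;
	*)
		# unknown extension: report it instead of failing silently
		echo "unsupported file type: ${file}" >&2
		exit 1
	;;
esac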

alsadecode.sh plays ogg, mp3 and wav audio files, handing them to ogg123, mpg321 or aplay respectively. KDE is configured to play all audio streams via alsadecode.sh. Simple, isn't it? Yes, it could be cleaner.

9. References

  1. Advanced Linux Sound Architecture
  2. Gentoo Linux ALSA Guide
  3. An Introduction to linux sound systems and APIs
  4. alsa documentation - pcm plugins
  5. alsa dmix howto
  6. How to setup alsa dmix (with dsnoop)
  7. What is asym?
  8. HOWTO ALSA sound mixer a.k.a. dmix (gentoo wiki)

Creative Commons License
This work is licensed under a Creative Commons Attribution 2.5 License.