...making Linux just a little more fun!

Tuner Cards - Learning By Looking
By Cherry George Mathew

Abstract:

Hopefully, this article will help people who'd like to consider writing device drivers for tuner cards, and others who're generally interested in how TV Tuner cards work.

1 Amateur Acrobatics.

It's hectic at work today. You have a hundred emails to reply to. There's that quality analysis report to submit this afternoon, a business presentation to prepare for the PR team, and a whole bunch of code to sift through for formatting errors. And then there's that favourite TV program that you can't miss out on by any chance. What do you do? Switch on that TV tuner card of course. And watch the TV program in a window all by itself at the top right corner of your computer screen. All work and no play indeed! Now you can minimize the video window out of sight whenever the boss decides to take a peek over your shoulder. Or you could have it running full screen and beckon him to come over and have a look if he's a fan too. ;-) Ah! The vagaries of technology!

The Linux platform supports a good number of tuner cards, as well as web cameras and an assortment of similar multimedia devices. As in every other operating system, the tasks of application programs and of the kernel proper are clearly demarcated. Video4Linux (or V4L), as the technology is called, is still evolving from a draft version 1 to a more robust version 2. On the way, lots of device drivers have been developed, primarily around the Brooktree chipset, but now increasingly around other models as well. Application programmers focus on preparing easy GUI based interfaces for the user, whether for watching TV, recording to disk, or decoding and reading teletext. For TV viewing, tasks such as preparing a window of just the right size on screen, requesting the relevant device driver to fill it in with live video (overlay), resizing the viewing area and asking the device driver to adjust the overlay size accordingly, passing on user requests to tune into a specific channel or to change the input from tuner to AV mode, or simply muting sound - these are the responsibilities of the application programmer. The application therefore sits as a front end to the tuner driver, and passes on requests from the user to the driver in a previously agreed upon manner, called an Application Programming Interface (API).

This is explained in detail later. Device driver programmers, on the other hand, concentrate on translating user requests such as those mentioned above into hardware instructions for the specific tuner card. They also make sure that they communicate with applications using the V4L API. Device drivers therefore sit in between the hardware and the application, taking commands from the application, translating them, and passing them on to the underlying hardware in machine specific jargon.

Over the next couple of pages, you and I are going to try each other's patience. We're going to show each other, among other things, how TV tuner cards work, what they're made of, what types there are, how to make them work in Linux, etc. I say "show each other", because in attempting to put this article together, I've had to do a bit of research myself, and that's because of you, dear Reader! This is mutual then; so grab a piece of paper and a pen, sit back, and read on.

Warning: Do not nod off. You're going to have a test afterward.
Keywords: PCI bus, I2C bus, IF (Intermediate Frequency), Video Processor, Frame Buffer, DMA, IRQ.

2 Tuner Cards Undressed.

Alright, let's find out what a TV tuner card looks like. Typically, you'd spy at least three functional chips on board.

2.1 The Tuner module

The tuner "chip" is actually a whole board with all the Radio Frequency components mounted on it, and nicely wrapped up in silver foil, I mean, protective shielding. Take a look at the diagram. Tuner modules come in distinctive packaging, and often look very much like each other. Your antenna cable goes right into the socket at one end of the tuner module. The job of the tuner module is to do all the Radio Frequency mixing magic which tunes into a specific TV programme. Whatever frequency the TV programme is on, it is converted into a pre-determined intermediate frequency (IF). This "pre-determined" frequency is actually a real mess, because of historic (political?) reasons. Each TV system (eg: PAL, SECAM, NTSC, etc.) has a unique IF. Whatever the IF is, the tuner takes care of one, and only one job - it takes in all the zillions of possible frequencies of radio waves in the universe, and at your command, filters out just the right TV programme for you. In section 5, "The I2C bus", we'll find out how you "command" the tuner module to tune into your favourite Sports Channel.

2.2 The Video Processor a.k.a TV decoder

The IF which comes from the tuner module needs to be decoded and transformed into a viewable format. This is the job of the Video Processor. Viewable formats, again due to historic reasons, come in various shapes and sizes. You've got the plain old bitmap format, palettized and planarized (uh, whatever does that mean?) VGA format, RGB (for Red Green Blue) format, YUV format (and its subtle variants) and of course, various proprietary formats. If you're keen at reading between the lines, you might have guessed that the "transformation" mentioned above includes demodulation and Analog to Digital Conversion - which is the whole point of the TV tuner card anyway. When you watch TV on your computer screen, what you're actually looking at is digitized video data from the Video Processor being displayed by your VGA adapter. Right, let's break that up into two steps:

1. The Video Processor digitizes video data and dumps it into the "frame buffer".
2. The VGA adapter fetches video data from the frame buffer, and displays it on screen.

Before we look at the details of how that happens, we need to understand frame buffers. Frame buffers are also called video buffers or frame RAM, and usually reside on the VGA card (experts, please bear with me and ignore AGP for the moment).

API figure

Any data within the frame buffer is immediately reflected on the screen. This is the job of the VGA controller. If you want to display something on the screen, all you need to do is to dump some data into the frame buffer. Voila! You can immediately see it on screen. On most platforms, this will involve just a plain memory to memory copy, because the frame buffer is mapped into the physical memory address space, just like any other RAM. However, on a system which implements some sort of memory protection, applications may not be allowed direct access to physical memory. In Linux, such access is obtained by means of the mmap() system call in conjunction with the /dev/mem device node or the frame buffer device driver. Check the manual page of mmap() for details. Of course, for this to work sensibly, the VGA controller has to agree with you about what you wanted to display, what you wrote into the frame buffer, and where. This is done by "setting the VGA mode". Once the VGA "mode" is set, the meaning of every bit of data in the frame RAM is known to the VGA controller. For example, if the VGA mode is set to "640x480" at 8 bpp, the VGA controller knows two things about the display:

The screen is displayed as 480 rows, each row being made up of 640 horizontal dots (or pixels).
Each dot displayed on the screen is represented by a corresponding byte (8 bits) within the frame buffer. Hence the acronym 8 bpp, which stands for 8 Bits Per Pixel.

Here's another property of the mode - the pixel format. Every pixel has two properties associated with it, namely brightness and colour. Different methods of representing pixels have evolved over the years. The most popular among them are the RGB format and the YUV format. Explaining each is beyond the scope of our discussion, but we don't need the details in order to proceed. A complete description of our video mode setting would therefore be "640x480" resolution at "8 bpp" depth, in "RGB" format. So we'll need at least 640 x 480 bytes (307200 bytes) of frame buffer to represent one such screen.
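To put numbers on the mode discussion above, here is a minimal sketch in C. It assumes a linear, packed frame buffer with whole-byte pixels and no padding at the end of each row; real VGA hardware may differ. The function names fb_size_bytes() and fb_pixel_offset() are made up for illustration and are not part of any driver.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hold one full screen in the given mode.
 * Assumption: packed pixels, no padding at the end of a row. */
size_t fb_size_bytes(unsigned width, unsigned height, unsigned bpp)
{
    assert(bpp % 8 == 0);   /* this sketch only handles whole-byte pixels */
    return (size_t)width * height * (bpp / 8);
}

/* Byte offset of pixel (x, y) from the start of the frame buffer. */
size_t fb_pixel_offset(unsigned width, unsigned bpp, unsigned x, unsigned y)
{
    assert(bpp % 8 == 0 && x < width);
    return ((size_t)y * width + x) * (bpp / 8);
}
```

For the "640x480" at 8 bpp mode, fb_size_bytes(640, 480, 8) gives the 307200 bytes mentioned above, and the first pixel of the second row sits at offset 640.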

API figure

Picture then, the typical tuner card in question. It has been instructed to tune into a particular channel, capture the video data from it pixel by pixel into some digital format (eg: 8 bpp or YUV), and to dump it into RAM. This procedure is called "video capture". Here are a few possibilities of video capture:

If the RAM in question is the video buffer, you can immediately see the TV broadcast on the screen. This procedure is called "video overlay".
If the RAM mentioned here is separate RAM, or system RAM we'll need to cart all the data by DMA, into the frame buffer. DMA stands for Direct Memory Access, and is described in some detail later on, in the section on PCI buses. Once the DMA commences, we can begin to watch TV, and we say we've got "video overlay" working.
Whether system RAM or frame RAM, captured video data can be dumped onto disk. This is called video acquisition. Here too, DMA can be used to speed things up. So we could actually even cut a VCD out of video grabbed via the tuner card. Incidentally, the decision on whether to use DMA to move data to the disk, is the responsibility of the disk device driver, and is completely out of the purview of our discussion.

The tuner module is busy converting RF into IF. The video processor has an Analog to Digital Converter, which makes samples out of every pixel, and the samples are assembled into frames within RAM with the help of suitable control signals from the Video Processor. In this article, we'll consider a very simple video processor as an example - the ITT VPX3224D.

2.3 The Audio Processor

Tuner Cards typically handle sound in two different ways. The first method uses the audio processor to demodulate sound from the IF (IF contains both audio and video information). The audio signal thus obtained is routed to an external audio jack, from where one would need to re-route it to the line input of a separate sound card by means of a suitable external cable. If you're not wealthy enough to own a sound card, the line input of your hi-fi set will do :-).

The second approach is for the audio processor to demodulate sound from the IF, convert it into Digital Samples, and use techniques such as DMA (DMA is explained in the section on "PCI buses") to move these Samples to the sound card via the internal system bus (eg: The PCI bus), and from there, to use the sound card to reconvert the digital samples back to the audio signal. This method is more complicated, but more flexible, as the TV sound levels are controllable on the tuner card itself. The first method can avail of that luxury only by talking to the sound driver of the separate sound card. Either way, let's sum up our requirements, and what is required of us as competent device driver writers for tuner cards.

2.3.1 What is required of us:

We need to provide applications with an interface of functions, called the Application Programming Interface (API).
The API should provide an interface which hides the details of programming the tuner card hardware from Video4Linux applications.
Application requests via the API must be appropriately translated into hardware requests to the tuner hardware.
Hardware requests can be broadly classified as shown below:
- Requests to the tuner module.
  eg: Tune into given frequency, Change IF etc.
- Requests to the video processor.
  eg: Start/Stop video capture, Change mode between Teletext and TV, Set capture buffer location, Change between TV standards ('norms' in tech speak) such as PAL, SECAM, NTSC. etc.
- Requests to the audio processor.
  eg: Mute audio, Switch on/off stereo, Volume settings etc.
- Video Window Control.
  eg: Switch on/off video window, position/resize window, place it under/over other overlapping windows, implement chroma keying or clipping (described in the next section) etc.

In the next section, "What a driver wants", we'll see that a standard hardware independent API is already defined for the Linux kernel. In addition, the kernel manages parts of the API and also manages a /proc tree entry. A /proc tree entry essentially provides on the fly information about registered device drivers to curious applications. This means, that our responsibility as device driver writers is alleviated somewhat, and we don't need to waste time on bookkeeping, which is a drab affair anyway. (Care to explain sprintf() to me ??? :-) )

2.3.2 Our Requirements:

We'd like to have kernel functions to probe the card via the PCI interface.
We'd like to have kernel functions to take over the details of the I2C protocol with which we talk with the chips on the tuner card.
We need DMA functions to tell the kernel to move about video data without lots of overhead (ie, without getting your mouse pointer to take a nap while a frame of video is being fetched).

Hmm.... this leads us to an interesting bit in our driver design - snooping around for tools within the Linux kernel.

3 What a driver wants.

Alan Cox has written an excellent article on the Video For Linux API for capture cards in Linux. It comes with the kernel documentation (Documentation/DocBook/videobook.tmpl)² and covers many issues connected with the Video4Linux API. What it does not cover are details of the tuner capture process. Although attempting to cover details about all varieties of TV capture devices in a single article is impossible, a good share of the tuner cards (I cannot vouch for web cameras, etc, which plug into the USB port) available may be expected to conform to what is presented here.

linux/videodev.h³ is the authoritative reference for the V4L API. We will therefore avoid a detailed description of the V4L API here. Any conceptual details about it may be made out from the document by Alan Cox mentioned above. Moreover, the V4L API is an evolving standard. What holds good today may not be applicable tomorrow.

3.1 Talking to the driver.

First, let's take a look at the mechanism involved in communication between application and device driver. If you already know about character devices, this is a repetition, and you may safely skip this topic.

In every Unix system, the /dev subdirectory holds special files called device nodes. Each device node is associated with a specific device number registered in the kernel. In Linux, the video4linux driver is registered under major device number 81. By convention, the name of the node associated with major number 81 and minor number 0 is /dev/video0. See (Documentation/devices.txt) for details about numbering device nodes. The node /dev/video0, if nonexistent, may be created with the mknod command from the root shell as shown below:

: root@maverick# mknod /dev/video0 c 81 0
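If you'd like to check from a program whether a node really carries the video4linux major number, something along these lines should do. This is only a sketch: the function name is_v4l_node() is made up, and the major()/minor() macros come from <sys/sysmacros.h> as found on glibc based systems.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

#define V4L_MAJOR 81   /* the video4linux major number quoted in the text */

/* Returns 1 if path is a character device with major number 81,
 * 0 if it is something else, and -1 if it cannot be examined. */
int is_v4l_node(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;                      /* missing node, no permission, ... */
    if (!S_ISCHR(st.st_mode))
        return 0;                       /* not a character device at all */
    return major(st.st_rdev) == V4L_MAJOR;
}
```

On a machine with a tuner driver loaded, is_v4l_node("/dev/video0") would be expected to return 1. The 'c 81 0' arguments given to mknod above correspond to a character device with major number 81 and minor number 0.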

Three simple ways of accessing the driver from user space⁴ are immediately obvious from the above discussion: the open, close and read system calls. If video capture is supported by the driver, the following code snippet should be able to read captured data and dump it to STDOUT. Alas, if you cannot understand programming in the 'C' language, it's time to pick up Kernighan and Ritchie's ``The C Programming Language'' before you continue reading this document.

------------- Code Snippet ------------

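#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define BUFSIZE 65535

int main(void)
{
    int fd;
    ssize_t n;
    char *buffer;

    /* Let's allocate a reasonably big buffer. */
    buffer = malloc(BUFSIZE);
    if (buffer == NULL) {
        fprintf(stderr, "Sorry, out of memory\n");
        exit(EXIT_FAILURE);
    }

    /* Open the device node for reading. */
    if ((fd = open("/dev/video0", O_RDONLY)) < 0) {
        fprintf(stderr, "Sorry, error opening device /dev/video0\n");
        free(buffer);
        exit(EXIT_FAILURE);
    }

    /* Read until the program is killed, the device runs out of data
       (unlikely) or an error occurs. Only the bytes actually read are
       written, and they go to standard output. */
    while ((n = read(fd, buffer, BUFSIZE)) > 0)
        if (write(STDOUT_FILENO, buffer, n) != n)
            break;

    close(fd);
    free(buffer);
    return 0;
}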

---------- End of Code Snippet ----------

What stands out from the above snippet of code is that device nodes can be accessed much like any other file. That's just about where the similarities end. Besides open(), read(), write() and lseek(), device nodes have a special system call called ioctl(). It is the ioctl call that works all the magic of "Talking to the driver" via the V4L API.

Want to switch on the video display? Do a

: int on = 1;
ioctl(fd, VIDIOCCAPTURE, &on);

Want to mute audio?

: ioctl(fd, VIDIOCGAUDIO, &v);
v.flags |= VIDEO_AUDIO_MUTE;
ioctl(fd, VIDIOCSAUDIO, &v);

should do the trick, where v is declared

: struct video_audio v;

Please note that all the VIDIOCXXXXX constants, the video_audio structure, etc. mentioned above, are defined in linux/videodev.h, and are strictly V4L1 API specific. Therefore linux/videodev.h needs to be included for the above code snips to be meaningful. If I were you then, the next thing I'd do would be to take a good look at linux/videodev.h

Here are a few functions available to device drivers:

: int video_register_device(struct video_device *vfd, int type, int nr);

Description:
Registers a new driver with minor number 'nr' and type either of VFL_TYPE_GRABBER, VFL_TYPE_VTX, VFL_TYPE_VBI or VFL_TYPE_RADIO. The 'video_device' structure provides details such as the name of the driver. Once a minor number is registered, it is locked and cannot be re-registered by another tuner driver.

This function also creates a new entry in /proc/video/dev/

This entry will have details about the video hardware.
Try:

: cat /proc/video/dev/*

to get a list of entries.

: void video_unregister_device(struct video_device *vfd);

Description:
The minor number is freed, the device is unregistered, and the /proc entry is revoked.

int video_exclusive_open(struct inode *inode, struct file *file);

int video_exclusive_release(struct inode *inode, struct file *file);

int video_usercopy(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg, int (*func)(struct inode *inode, struct file *file, unsigned int cmd, void *arg));

Description:
video_exclusive_open() is a lock provided by the kernel to make sure that only one open is allowed at a time. This frees the driver from having to deal with re-entry issues such as: What happens if another application opens the same device node for video capture, while video overlay is going on? video_exclusive_release() is the complementary function to video_exclusive_open(). video_usercopy() deals with copying data from user space to kernel space and vice versa. It makes sure that adequate kernel memory is available, either from the stack, or via kmalloc() - the kernel memory allocator.

4 Talking to the hardware.

What we can do, then, is to focus our energies on writing code to program the tuner hardware to do various things like start capture, switch on sound, copy video data back and forth, etc. Most V4L ioctls boil down to tackling these problems anyway. Finally, when everything is ready, we could go about bridging the latest greatest V4L API with our underlying code. This is standard engineering practice.

--------------- Snippet -------------------

Brigadier to Engineer: "Lieutenant, I want that bridge up and ready by nightfall. "

Engineer: "Uh, that's impossible sir. We need to take measurements on the ground and order the parts from supplies before we can even think of starting to build. That'll take at least a couple of weeks Sir!."

Brigadier: "So there are no struts or screws, no angle bars or I joints, absolutely nothing with you to start work immediately ????"

Engineer: "Uh, no sir, I never thought we'd need spare parts at such short notice...."

Sound of Gunshot.

End of Take 1.

--------------- End of Snippet ----------------

Let's begin building the parts.
The device driver functionality we provide may be broadly classified into two - Video Acquisition, and Video Display.

4.0.1 Video Acquisition.

One part of the driver is concerned with acquisition of video data, ensuring that the tuner module is properly tuned in, that the video processor is decoding the correct standard (eg: PAL, NTSC etc.), and that picture properties such as brightness, hue, saturation and others supported by the video processor hardware are adjusted, properly fine tuned or set to default values. Sound acquisition can also be the responsibility of this part of the driver. These are described in detail in the section on I2C.

4.0.2 Video Display.

The other part of the driver is concerned with making sure that the acquired data is displayed properly on the screen. This part of the driver has to ensure that if video is viewed in a window, overlapping issues with windows of other applications are handled correctly. Parameters which are affected when the video window is resized or dragged to another location, such as the pitch of the video window, the number of lines acquired, the number of pixels acquired etc., are the responsibility of this section of the driver. Let's take a look at the window overlap problem in more detail. In a windowing environment such as the X Window System, video overlay needs to be implemented in a window. The overlap problem begins the moment a corner of another application window overlaps a part of the video window.

There are two options here:

Tell the windowing environment that the video overlay window is to be King. No other window may overlap it. Overlapping windows beware! This is a very clumsy option, and is taken only as a last resort, when no other methods are available.
Explicitly avoid overwriting corners which have been overlapped, with live video. Overlapping corners, are called clips in Video4linux jargon.

There are two approaches to avoiding overwriting the overlapped corners of the window.
1. Avoid overwriting overlapped areas with video data. This is accomplished in either of two ways:
  1. Clip Lists: Some video processors support entering a list of coordinates, called a clip list, into hardware, which basically prevents them from overwriting frame buffer regions specified by those coordinates.
  2. Chroma keying: All regions within the frame buffer corresponding to regions on screen which may be overwritten, are filled with a specific colour value called a chroma key. When writing acquired video data into the buffer, the video processor looks for the chroma key, makes a comparison, and overwrites the buffer only if there is a match. Overlapped areas are not written with the chroma key, and are therefore spared from being overwritten with video data.
  Both these methods work when the card captures video directly to the frame buffer.
  Here's a question for you. Who do you think fills up the buffer with the chroma key?
  
  Look out for the answer at the end of this section.
2. Arrange for video data to be displayed by the Xserver, by writing into Xserver buffers instead of the frame buffer:
  This allows the Xserver to handle overlapping issues. Be warned that this is a very tricky and slow method, as the Xserver is very slow at displaying real-time video and synchronizing buffer accesses between the tuner card hardware and the Xserver program is impossible. Expect overlapping frames and jerky pictures.

Introducing the Pixelview Combo TV plus.

What we can do then, is to begin writing routines which do little things like setting the chroma key, setting the size of the video window, positioning the window properly, etc. The best way to learn such things is by example. We'll base our study on a few code snippets from my unofficial and partly working driver for the Pixelview Combo TV plus. This is a simple card, as simple as tuner cards can get to be. The Tuner Module, video processor and VGA controller, all sit on the same card. This card is plugged into the PCI slot, and doubles both as a tuner card, and as a VGA display card.

Card Description:

Tuner Module - Philips FM1216ME MK3
Video Processor - VPX 3225D
VGA Controller - Cirrus Logic GD-5446 with 2MB RAM on board.
Sound Demodulation - Philips TEA5582
Sound Switch controlled by: One pin from the VPX 3225D

Since we're interested in the Video Display right now, we'll focus our attention on the Cirrus Logic GD-5446 VGA controller. The GD-5446 has a special feature. You can specify a certain region within the frame buffer itself, to contain video data which will be displayed inside a hardware implemented video window. Let's call this buffer the video buffer.

API figure

The video buffer may be located anywhere within the frame buffer, but typically, it is located at the end of the frame buffer. This keeps captured video data samples from overwriting graphics samples that were already present in the frame buffer and vice-versa.

Let us illustrate with an example:

Frame buffer size = 2 MB
Display mode = 640x480 @ 16 bpp
Total memory required for VGA display = 640 x 480 x 2 bytes = 614400 bytes = 0.59 MB
Unused memory at the end of the frame buffer = 2 MB - 0.59 MB = 1.41 MB

Therefore, we may safely specify that the video buffer begin at an offset of about 0.6 MB into the frame buffer, and that its size not exceed 1.4 MB. Until the hardware video window is switched on, the contents of the video buffer are not visible on screen. The only way this rule is broken, is when the video buffer is set to overlap with parts of the frame buffer which are displayed as graphics. For example, if the video buffer offset is set at 0.5MB in the illustration above, captured video data will interfere with the lower part of the screen, even when the hardware window is off.
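The same arithmetic can be sketched in C. The structure vbuf_layout and the function vbuf_after_graphics() are names invented for this illustration; they are not taken from the driver. Any alignment that the GD-5446 may demand of the video buffer offset is ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Where the video buffer may start, and how big it may grow. */
struct vbuf_layout {
    size_t offset;     /* first byte past the displayed graphics */
    size_t max_size;   /* bytes left until the end of the frame buffer */
};

/* Place the video buffer right after the graphics area.
 * Assumption: the graphics area starts at offset 0 and is packed. */
struct vbuf_layout vbuf_after_graphics(size_t fb_total, unsigned width,
                                       unsigned height, unsigned bytes_per_pixel)
{
    struct vbuf_layout l;
    size_t graphics = (size_t)width * height * bytes_per_pixel;

    assert(graphics <= fb_total);   /* the mode must fit in the first place */
    l.offset = graphics;
    l.max_size = fb_total - graphics;
    return l;
}
```

With a 2 MB frame buffer and a 640x480 mode at 2 bytes per pixel, this gives an offset of 614400 bytes and 1482752 bytes of room, which are the 0.59 MB and 1.41 MB figures worked out above.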

The hardware window interprets and displays data within its jurisdiction entirely differently from what the VGA mode dictates. The size and location of this video window can be changed by programming the relevant VGA registers. The GD-5446 has three sets of registers, namely: control registers, graphics registers, and sequence registers. Each of these VGA registers is accessed by multiple reads and writes to hardware ports, and these accesses are hence encapsulated in specialized functions. I've named them gd_read_cr(), gd_write_cr() and so on. This improves readability of the code, and reduces the chances of error. Here are a few routines from my driver. I've stripped them down for brevity:

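#define GD_SR_OFFSET 0x3c4
#define GD_GR_OFFSET 0x3ce
#define GD_CR_OFFSET 0x3d4

/* Adapter - Low level functions */

unsigned gd_read_cr(struct clgd54xx_card *card_p, unsigned reg)
{
    unsigned value;

    /* Select the register by writing its index to the index port... */
    io_writeb(reg, gd_io_base + GD_CR_OFFSET);
    /* ...then read its contents from the adjacent data port. */
    value = io_readb(gd_io_base + GD_CR_OFFSET + 1);
    return value;
}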

Notice that a single access to a VGA register consists of a write to a hardware I/O port,

: io_writeb(reg, gd_io_base + GD_CR_OFFSET);

followed by a read from an adjacent port.

: value = io_readb(gd_io_base + GD_CR_OFFSET + 1);

Subsequent functions are built up using variants of gd_read_cr();

Here are a few higher level functions

: /* VGA hardware video programming functions. */
void gd_enable_window();

Enables the hardware video window.

: void gd_disable_window();

Disables the hardware video window.

: void gd_set_vbuf1(,);

Sets the location within the frame buffer, where captured video must be written.

: void gd_set_vbuf2(,);

There are two such buffers.

: unsigned long gd_get_vbuf1();

Gets the location of the current capture buffer within the frame buffer. This function complements gd_set_vbuf1().

: unsigned long gd_get_vbuf2();

See above.

: void gd_set_pitch(,);

Sets the number of pixels that a line of captured _video_ data is made up of. Since the size of the video window is variable, the pitch will have to be reset whenever the window width is changed.

: unsigned long gd_get_pitch();

Gets the current pitch value.

/* VGA video window functions */

static void gd_set_window(,,,);

Sets the coordinates of the hardware window with respect to the main screen. The coordinates are passed on in pointers to structures. See the file (pvcl.h) for details.

: static void gd_get_window(,,);

Gets the current dimensions of the hardware video window. These are read from hardware registers. Let's see the contents of just one routine, to go one step further into the details:

void gd_set_pitch(struct clgd54xx_card * card_p, unsigned long offset)
{
    unsigned long CR3C, CR3D;

    CR3C = gd_read_cr(card_p, 0x3c);
    CR3D = gd_read_cr(card_p, 0x3d);

    /* CR3C[5] = offset[11], CR3D = offset[10:3]*/
    gd_bit_copy(&CR3C, 5, &offset, 11, 11);
    gd_bit_copy(&CR3D, 0, &offset, 3, 10);

    gd_write_cr(card_p, CR3C, 0x3c);
    gd_write_cr(card_p, CR3D, 0x3d);
}

Notice the functions gd_bit_copy() and gd_write_cr() ? They're the functions that wiggle the VGA registers. gd_bit_copy() alters specific bits in a specified variable. That variable can later be written to a VGA register using, for example, gd_write_cr(). Since each bit in a VGA register is very important and needs to be handled with care, I thought that a function to tackle VGA registers bit by bit might be in order.
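The body of gd_bit_copy() isn't shown in this article, so here is one way such a helper could be written. This is a guess and not the driver's actual code: the argument order (destination, lowest destination bit, source, lowest source bit, highest source bit) is inferred from the two calls in gd_set_pitch() and the comment above them, and the name bit_copy_sketch() is made up to keep the two apart.

```c
#include <assert.h>

/* Copy bits src_lo..src_hi of *src into *dst, starting at bit dst_lo.
 * All other bits of *dst are left untouched.
 * Assumption: we only ever deal with the low 32 bits. */
void bit_copy_sketch(unsigned long *dst, unsigned dst_lo,
                     const unsigned long *src, unsigned src_lo, unsigned src_hi)
{
    unsigned width;
    unsigned long field_mask, field;

    assert(src_lo <= src_hi);
    width = src_hi - src_lo + 1;
    assert(width < 32 && dst_lo + width <= 32);

    field_mask = (1UL << width) - 1;             /* 'width' ones */
    field = (*src >> src_lo) & field_mask;       /* extract the source bits */
    *dst = (*dst & ~(field_mask << dst_lo))      /* clear the target bits */
         | (field << dst_lo);                    /* and drop the field in */
}
```

For an offset of 1280 (0x500), bit 11 is clear and bits 10 to 3 hold 0xA0. Copying them as gd_set_pitch() does would clear bit 5 of CR3C and set CR3D to 0xA0.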

gd_write_cr() is used to write a value into a specified VGA register. Please ignore the variable card_p for the moment. It is a structure where global state information about the driver is stored. card_p is used by gd_write_cr() for book keeping purposes only. gd_write_cr(card_p, CR3C, 0x3c) will write the contents of the variable CR3C into the control register 0x3c. (Don't be fooled by the name CR3C, it's as much a variable as 'unsigned long foo' is.)

In the general case of a tuner card, where the VGA controller does not provide a separate hardware video window, the video processor will have to dump frames right into the middle of the graphics data. This will have to be done in such a way that when the VGA controller displays the new contents of the frame buffer, the video frame must appear correctly, and not skewed. This requires aligning the video data on pixel boundaries (every byte for 8bpp, every other byte for 16bpp, every four bytes for 32bpp, etc.). Besides that, the pixel representation within the video processor must match that of the current mode of the VGA controller. The video processor cannot acquire video at 32bpp and dump it into a 16bpp frame buffer. Also, video data cannot be overlaid in a linearly continuous fashion. The buffer offset of every line will have to be calculated as shown in the figure below:

API figure

Offset of line N = Video Buffer Offset + Video Window Pitch x N

In other words, all the precautions and calculations that the Xserver makes while drawing an application window, need to be taken by the video processor. Here, the video processor writes directly into the graphics buffer, and there is no distinction between video data and graphics data.
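Here is a small user space sketch of that per-line calculation. It copies a captured frame into a larger buffer one line at a time, instead of in one continuous run. The names line_offset() and blit_lines() are made up, and 'pitch' is assumed to mean the distance in bytes between the starts of two consecutive lines in the destination buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset of line number 'line', counted from the start of the frame buffer. */
size_t line_offset(size_t base, size_t pitch, unsigned line)
{
    return base + pitch * line;
}

/* Copy 'lines' lines of 'line_bytes' bytes each from a tightly packed
 * video frame into the frame buffer, starting at offset 'base'. */
void blit_lines(unsigned char *fb, size_t fb_size, size_t base, size_t pitch,
                const unsigned char *video, size_t line_bytes, unsigned lines)
{
    unsigned n;

    assert(line_bytes <= pitch);    /* a video line must fit within one row */
    for (n = 0; n < lines; n++) {
        size_t off = line_offset(base, pitch, n);

        assert(off + line_bytes <= fb_size);   /* never run off the end */
        memcpy(fb + off, video + (size_t)n * line_bytes, line_bytes);
    }
}
```

Picture a tiny 4x4 screen at 8 bpp with a 2x2 video window whose top left corner is at column 1, row 1. The base offset is 5 and the pitch is 4, so the two video lines land at offsets 5 and 9, and the graphics bytes on either side of them are left alone.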

However, in the case of the GD-5446, the video processor does not write into the graphics area, and need not worry about alignment issues. All that the video processor routines need to ensure, is that video gets captured into the correct offset within the frame buffer, where the video buffer starts. The gd_set_vbuf1() routine takes care of that for us. The windowing details are then taken care of by the GD-5446 hardware.

For detailed descriptions of GD5446 hardware registers, take a look at the GD-5446 Technical Reference Manual.

An IOCTL walk-through

It's time now for a guided tour of an IOCTL call. Consider that instant of time at which a video4linux application, such as xawtv (see: http://bytesex.org), calls ioctl() to switch on the TV window.

1. The application calls the ioctl() system call. The ioctl() call is translated by the C library (glibc, in the case of GNU/Linux) into an assembly language instruction which jumps into a kernel routine.
  - Context: Entering a kernel routine implicitly involves a switch from User Mode to Kernel Mode. Linux is a non-pre-emptible kernel, and until the device driver relinquishes control by a call to schedule(), it is running in the context of the process that called it. (Remember, Linux is a multitasking OS, and there is more than one process (or application) running at the same time.) This means that any reference to the ``current process'' would imply the process which caused the device driver to be called.
  - Environment: While in kernel mode, the kernel stack is in use, and kernel functions are available. User space address mappings are untouched, and the file node structure used to access the driver is also available. These properties may be used to save state information on a per process basis, but in our case, since we only allow one process to access the driver, it is safe to save state information in global variables.
2. The kernel stub routines identify that ioctl() has been called, and pass on the request to the VFS (Virtual Filesystem Switch) layer.
3. The VFS determines that the called node is a device driver, looks up the registration number, and discovers that the ioctl is meant for the Video 4 Linux driver. (Remember the major and minor numbers? They're 81 and 0.)
4. The V4L driver looks for registered candidates, and discovers that pvcl.c has registered a file-operations structure with it, by means of the video_register_device() function call. We've specified that pvcl_ioctl() is to be called in the case of a V4L ioctl call.
5. pvcl_ioctl() is our function, available in pvcl.c, and parses the IOCTL parameter. It discovers, through a series of switch(); case: statements, that the video window is to be turned on. So it calls gd_enable_window().
6. gd_enable_window() calls various VGA register write/read functions, such as gd_read_cr() and gd_write_cr(), and programs the hardware video window to be switched on.

That's it folks!!!

Answer to Chroma key Question:

The application queries the device driver for available chroma keys, and selects one. It then proceeds to fill in the background of the video window with that single colour. Overlapping windows are then allowed to paint over the application window, and the video capture is then turned on. Naturally, only the non-overlapping regions (which are still filled with the chroma key background) are filled in with video data.
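The effect of chroma keying on one scan line can be imitated in software. The sketch below is only a model of the idea. On the real card the selection happens in hardware, and the function name and the 16bpp pixel type are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compose one scan line: wherever the graphics pixel still carries the
 * chroma key colour, the video pixel shows through; everywhere else
 * (for example under an overlapping window) the graphics pixel wins. */
void chroma_key_line(const uint16_t *graphics, const uint16_t *video,
                     uint16_t *out, size_t n, uint16_t key)
{
    assert(graphics != NULL && video != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (graphics[i] == key) ? video[i] : graphics[i];
}
```

A pixel that some other window has painted over no longer matches the key, so it is left alone. That is all the "clipping" there is to it.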

5 The I2C bus.

The GD-5446 has two interesting features, as far as tuner capture is concerned. It has an I2C bus interface via two pins, and a Video Port interface via 16 pins. The video port interface follows the ITU-656 standard for exchange of video data. Don't get scared here: Remember that pixels can be made up of more than one byte? E.g. 16 bpp equals two bytes. Well, somebody needed to tell chip manufacturers that in the case of multiple bytes per pixel, transmissions between chips needed to be done in a certain order. Take the case of YUV. Y stands for brightness, U and V stand for the two colour components of a pixel. Let each component occupy 1 byte (this is not true in the real life YUV 4:2:2 format, but what the heck, let's illustrate to taste). One pixel therefore requires 3 bytes, i.e. 24 bits. Here's the deal: If you're a chip manufacturer, and you want to boast of an extra incomprehensible line in your features list (to grab the attention of potential customers, of course), consider the ITU-656 seal. But be warned - once you're sealed, the spirit of the beast is upon your chip. Video gets transmitted only in a particular order: U-Y-V. And here's the good news: The VPX 3225D is part of the brotherhood! Ah, so now it all falls in place. The VGA controller and the Video Processor have a clandestine path of communication, via something called the VPort. And here's further good news: the VPX 3225D has an I2C bus as well! Surprise, surprise!
Ahem, alright, let's sober down a bit and figure out what this means:

The GD-5446 VGA controller has an I2C bus, directly controllable through one of its programmable registers (SR 8, in this case).
The VPX-3225D is connected to the same bus and can therefore chat with the GD-5446 in I2C speak.
Furthermore, they're both connected via a private bus line - the VPort interface, a high speed data bus to transfer video data from video processor to VGA controller. I.e. the VPX-3225D can transfer captured video via the VPort bus to the GD-5446, and this transfer can be controlled via the I2C bus.
Notice here, that the video processor has a private bus to write into the frame buffer of the GD-5446. This bus is on the combo card itself, and therefore bypasses the PCI bus, and even the system processor. All synchronization and handshaking is done between the GD-5446 and the VPX 3225D. The only way to access this bus from the device driver is indirectly: via the GD-5446 SR8 (sequencer register number 8), via the I2C bus, via the video processor. Once transfers begin, i.e. once video capture begins, the video processor is furiously writing into the GD-5446 frame buffer via the VPort, and accepting instructions from the GD-5446 via the I2C bus. Let's find out more about the I2C bus, before we proceed.

The I2C bus has two lines - SDA and SCL. More than two chips may be connected to the I2C bus, at the same time. However, only one chip can talk over the I2C bus at a time. Fair enough. Chips are divided into two types: Master and Slave. Masters can talk to slaves anytime they like. Slaves may not talk to Masters unless asked to. It only follows from logic, that there can only be one master at a time, on the I2C bus.

Quiz time again:

Identify the master chip on the I2C bus of our Pixelview tuner card.

Let's take a look at SDA and SCL, the two I2C pins:

The SDA pin is the data pin. The SCL pin is the clock pin. The SDA pin may be driven either by the master or the slave, depending on the direction of data transfer. The SCL pin is driven exclusively by the master.
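To get a feel for what "driving" these two pins means, here is a toy model of a master sending one byte. It is a sketch under stated assumptions and not driver code. The START and STOP conventions (SDA changing while SCL is high) and the most-significant-bit-first order come from the I2C specification rather than from this article, the acknowledge bit is left out for brevity, and every name in it is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* A toy model of the two wires. A real driver would poke hardware here. */
struct sim_bus {
    int sda, scl;
    uint8_t shifted;  /* bits a listening slave sampled on rising SCL edges */
    int starts;       /* START conditions seen so far */
    int stops;        /* STOP conditions seen so far */
};

static void set_sda(struct sim_bus *b, int state)
{
    if (b->scl && b->sda && !state) {  /* SDA falls while SCL is high: START */
        b->starts++;
        b->shifted = 0;
    }
    if (b->scl && !b->sda && state)    /* SDA rises while SCL is high: STOP */
        b->stops++;
    b->sda = state;
}

static void set_scl(struct sim_bus *b, int state)
{
    if (!b->scl && state)              /* rising edge: the slave samples SDA */
        b->shifted = (uint8_t)((b->shifted << 1) | (b->sda & 1));
    b->scl = state;
}

void i2c_start(struct sim_bus *b)
{
    set_sda(b, 1);
    set_scl(b, 1);
    set_sda(b, 0);
    set_scl(b, 0);
}

void i2c_stop(struct sim_bus *b)
{
    set_sda(b, 0);
    set_scl(b, 1);
    set_sda(b, 1);
}

/* Clock out eight bits, most significant bit first. */
void i2c_send_byte(struct sim_bus *b, uint8_t byte)
{
    assert(b->scl == 0);               /* data may only change while SCL is low */
    for (int bit = 7; bit >= 0; bit--) {
        set_sda(b, (byte >> bit) & 1);
        set_scl(b, 1);
        set_scl(b, 0);
    }
}
```

Calling i2c_start(), then i2c_send_byte() with some value, leaves that same value in the model's shifted field, which is what a slave listening on the rising clock edges would have collected.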

5.1 The Linux I2C subsystem.

As Linux device driver writers, we're quite lucky. Most of the low level, pin level details are handled for us by the kernel. What we need to do is to plug certain helper routines into the kernel. These helper routines allow the kernel to talk to the I2C bus on our tuner card. Helper routines are like sports car drivers on a cross country rally. Not only do rally drivers know how to drive really well, they also know their cars inside out - whether it's time to change the oil, or whether that front right shock absorber is leaking fluid, or when the clutch plate is close to tatters - little things like that; if there is a problem, the driver knows about it in a jiffy. The navigator, on the other hand, knows the terrain and the race route like the back of his hand. So seconds before the next hairpin curve, he shouts "one hard left coming up!", and the driver shifts down a gear, caresses the brake pedal, does a double twist on the steering wheel - and that's one less hairpin to take. Similarly, the kernel here knows the I2C protocol, and knows when the SDA and SCL pins need to be wiggled. The kernel barks orders to the helper functions, who do the actual wiggling. In order for the kernel to talk to helper functions, they need to be registered with the kernel. The kernel provides a registration function for this: i2c_bit_add_bus(). We pass it a structure defined so in linux/i2c-algo-bit.h :

struct i2c_algo_bit_data {
    void *data; /* private data for lowlevel routines */
    void (*setsda) (void *data, int state);
    void (*setscl) (void *data, int state);
    int (*getsda) (void *data);
    int (*getscl) (void *data);

    /* local settings */
    int udelay;
    int mdelay;
    int timeout;
};

You guessed it right: setsda, setscl, getsda and getscl are pointers to helper functions which we provide. Now, each time the SDA pin is to be set high or low, the kernel calls setsda(). If setsda = gd54xx_setsda, then our routine, with the read/writes to the CL-GD5446 SR8 VGA register, would be called. So here's what we do:

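#include <linux/i2c.h>
#include <linux/i2c-algo-bit.h>

struct i2c_algo_bit_data gd_bus;
struct i2c_adapter gd_adapter;   /* illustrative name for our adapter */

gd_bus.setsda = gd54xx_setsda;
gd_bus.setscl = gd54xx_setscl;
gd_bus.getsda = gd54xx_getsda;
gd_bus.getscl = gd54xx_getscl;
gd_bus.udelay = 16;
gd_bus.mdelay = 10;
gd_bus.timeout = 200;

/* the helper structure travels to the kernel inside the adapter */
gd_adapter.algo_data = &gd_bus;
i2c_bit_add_bus(&gd_adapter);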

The udelay, mdelay and timeout variables are the only direct hold we have on the I2C bus timings, when the kernel drives the I2C pins. Of course, what's given above is pseudo code and won't work directly. Certain details have been omitted, but will be made clear in the following paragraphs.

Let me refer you to documents in the ('Documentation/i2c/') subdirectory for comprehensive details on the I2C implementation within the kernel. In particular, ('Documentation/i2c/writing-clients') is a very nicely written intro on writing I2C drivers.

Answer to quiz:

The GD-5446.

The kernel implements access to a few I2C master chips as well as a direct interface to the SDA and SCL pins. This interface is called the bit bang interface. In the case of the Pixelview Combo TV plus tuner card, we have direct access to the SDA and SCL pins of the I2C bus via SR8 of the GD-5446 VGA controller. SR8 is accessible via hardware ports 0x3c4 and 0x3c5. I've done these accesses using the gd_read_sr() and gd_write_sr() routines. Refer to (pvcl.c). Here's a description of the I2C control register, SR 8, of the GD5446:

I/O Port Address:	3C5h
Index:	08h
Bit	Description
7	I2C SDA Readback
6	I2C Configuration
5	Reserved
4	Reserved
3	Reserved
2	I2C SCL Readback
1	I2C Data (SDA) Out
0	I2C Clock (SCL) Out

Whenever one of the I2C bits within the SR8 register is wiggled, it is reflected on the I2C bus and all slaves see the change. For example, if bit 1 of SR8 is set to 0, the GD-5446 pulls the SDA line low. If bit 0 of SR8 is set to 1, the GD-5446 pulls up the SCL line. Time to look at gd54xx_setsda() and gd54xx_getsda(). As usual, these two are from pvcl.c, and are stripped down for readability.

void gd54xx_setsda (int state)
{
    /* Switch on I2C interface */
    set_bit(6, &i2c_state);

    /* Set/Clear bit */
    state ? set_bit(1, &i2c_state) : clear_bit(1, &i2c_state);

    gd_write_sr(, i2c_state, 0x8);
}

set_bit(n, variable) switches on the nth bit of variable, counting from the least significant bit. It is provided by the kernel; see (asm/bitops.h). clear_bit() similarly clears the nth bit. i2c_state is a variable which holds the current settings of the SR8 VGA register.

What basically happens here is that gd54xx_setsda (1) pulls the SDA line high, while gd54xx_setsda (0), pulls it low.

gd54xx_setscl() works similarly, except that the SCL pin is affected.

Getting the current status of the SDA pin works by reading the corresponding status bit from SR8. In this case, it is bit 7. If the SDA pin is high, bit 7 will be equal to 1. If it is low, bit 7 will be 0. This can be read into a variable, as shown below:

int gd54xx_getsda (void)
{
    /* bit 7 of SR8 mirrors the level of the SDA pin */
    return (((i2c_state = gd_read_sr(, 0x8)) >> 7) & 0x1);
}
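The bit juggling on SR8 can be tried out without any hardware by working on a plain byte. The sketch below uses the bit positions from the register table above. The macro and function names are invented for this example and are not the ones used in pvcl.c.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions within SR8, taken from the register description. */
#define SR8_SCL_OUT  0x01u  /* bit 0: I2C Clock (SCL) Out */
#define SR8_SDA_OUT  0x02u  /* bit 1: I2C Data (SDA) Out  */
#define SR8_SCL_IN   0x04u  /* bit 2: I2C SCL Readback    */
#define SR8_I2C_CFG  0x40u  /* bit 6: I2C Configuration   */
#define SR8_SDA_IN   0x80u  /* bit 7: I2C SDA Readback    */

/* Value to write to SR8 so that the I2C interface is switched on
 * and SDA is driven to `state` (0 or 1). */
uint8_t sr8_with_sda(uint8_t sr8, int state)
{
    assert(state == 0 || state == 1);
    sr8 = (uint8_t)(sr8 | SR8_I2C_CFG);
    if (state)
        return (uint8_t)(sr8 | SR8_SDA_OUT);
    return (uint8_t)(sr8 & ~SR8_SDA_OUT);
}

/* Same thing for the clock line. */
uint8_t sr8_with_scl(uint8_t sr8, int state)
{
    assert(state == 0 || state == 1);
    sr8 = (uint8_t)(sr8 | SR8_I2C_CFG);
    if (state)
        return (uint8_t)(sr8 | SR8_SCL_OUT);
    return (uint8_t)(sr8 & ~SR8_SCL_OUT);
}

/* Pin levels as reported by the readback bits. */
int sr8_read_sda(uint8_t sr8) { return (sr8 & SR8_SDA_IN) ? 1 : 0; }
int sr8_read_scl(uint8_t sr8) { return (sr8 & SR8_SCL_IN) ? 1 : 0; }
```

Starting from an SR8 value of 0x00, raising SDA gives 0x42 (bits 6 and 1 set). Note that the value written and the value read back use different bits, which is why the driver keeps its own copy of the register in i2c_state.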

In order to get the big picture about the I2C system within the kernel, we need to understand certain I2C concepts which are implemented within the kernel.

The first, is the concept of an adapter.

linux/i2c.h says: " i2c_adapter is the structure used to identify a physical i2c bus along with the access algorithms necessary to access it."

In our case, the GD-5446 I2C bus along with the bit-bang access algorithm, make up the adapter.

Next comes the algorithm:

Here's what (linux/i2c.h) has to say about access algorithms:

"(an access algorithm) ... is the interface to a class of hardware solutions which can be addressed using the same bus algorithms - i.e. bit-banging or the PCF8584 to name two of the most common."

The gd54xx_setsda(), gd54xx_getsda(), gd54xx_setscl() and gd54xx_getscl() functions, are helper functions for the bit-bang access algorithm. Consequently, they would not have existed if the GD-5446 I2C bus used some other mechanism, such as a PCF 8584 I2C interface.

The third concept we have to deal with is that of an I2C client.

Once again (linux/i2c.h) is the authoritative reference:

"(A client) ... identifies a single device (i.e. chip) that is connected to an i2c bus."

In our case, we have just two clients: the VPX-3225D and the Philips FM1216ME MK3 tuner module. The I2C protocol makes sure that only one chip is accessed at a time, by assigning certain addresses to certain chips. Therefore, every client has an address number associated with it. The VPX-3225D only responds to addresses 0x86 and 0x87, or addresses 0x8e and 0x8f, depending on how the chip is configured. The tuner module responds to address 0xc6.
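The addresses quoted above come in even/odd pairs, which suggests that they are the full 8-bit address bytes seen on the wire: a 7-bit chip address followed by a read/write bit in the lowest position (1 means read). That reading is an assumption on my part, but it matters in practice, because the Linux I2C core works with the 7-bit form. The helper names below are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 7-bit chip address contained in an 8-bit address byte. */
uint8_t i2c_addr7(uint8_t addr_byte)
{
    return (uint8_t)(addr_byte >> 1);
}

/* 1 if the address byte announces a read, 0 for a write. */
int i2c_is_read(uint8_t addr_byte)
{
    return addr_byte & 1;
}

/* Build the address byte that goes out on the wire. */
uint8_t i2c_addr_byte(uint8_t addr7, int read)
{
    assert(addr7 < 0x80);
    return (uint8_t)((addr7 << 1) | (read ? 1 : 0));
}
```

Under this assumption 0x86 and 0x87 are the write and read forms of the single 7-bit address 0x43, 0x8e and 0x8f both map to 0x47, and the tuner's 0xc6 maps to 0x63.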

Every I2C transaction is prefixed by a target address. This must be done by the master. Only addressed slaves, may thus respond to queries from the bus master. This may also be used as a method to probe the I2C bus to see if it can detect any chips. The Linux kernel supports this kind of probing.

Do:

root@maverick# modprobe i2c-algo-bit bit_scan=1

This will make the kernel i2c core module scan the entire address range of the bit_bang adapter, to probe for connected chips. Any finds are reported via the kernel logs. Thus a client contains the following information about a connected chip:

An identifier name.
The address to which it responds.
The adapter on which it is connected.
The device driver in charge of programming it.

This leads us to the fourth concept about the I2C subsystem - the I2C driver. Let's see what (linux/i2c.h) has to say about this bewildering concept:

"A driver is capable of handling one or more physical devices present on I2C adapters. This information is used to inform the driver of adapter events."

At first it may seem funny that we're talking about another device driver within a device driver! But you notice that there may be more than one chip on a given adapter, and each chip needs to be programmed separately. Any piece of code, which understands the working of a piece of hardware, and programs it accordingly, may be called a driver. In this case, the driver may be just a couple of routines within a module, and there may be more than one driver, in that sense, within a kernel module.

It might be instructive to note that I've implemented the I2C driver for the VPX-3225D within another file called vpx322xd.c. This neatly separates the main v4l driver code from the vpx part. The two drivers talk to each other via an internal arrangement similar to that of the IOCTL call in user space. Interestingly, the driver for the Philips FM1216ME MK3 tuner module is already available with the 2.4 kernel, and may be compiled as a separate module. This is an example of how open source works so well. I provide the adapter and windowing functions, somebody else provides the tuner driver to work over my adapter, I have a video processor module to add to that, and yet someone else has written the video4linux user space client, which understands the V4L API. Cool, eh?

To understand how to code the I2C driver for the video processor (the VPX-3225D, in this case), we need to know two things - the context in which our code runs, and the environment within which it runs.

Handling an IOCTL from within an I2C driver

When all is said and done, the purpose of the VPX-3225D driver is to implement instructions passed down from the application. A generic I2C driver registers something called a "command" function when it registers itself with the Linux I2C core. Once registered, this command function may be called by tracing it through a list of available I2C adapters. The chain goes this way: adapter->clients[n]->driver->command, where n is the nth client on an adapter. Therefore, adapter->clients[n]->driver->command() would translate to "call the command function associated with the driver for client n, which resides on adapter". The adapter structure is, of course, accessible from the main V4L driver, pvcl.c, which registered that adapter in the first place. Therefore, all clients on that adapter, and hence all client drivers and their callback "command" routines, are accessible from pvcl.c by simply traversing through the adapter structure.

API figure

Let's trace through an ioctl() call for switching on capture.

1. The glibc stub code calls a kernel routine, thereby switching to kernel mode.
2. Once in kernel mode, the VFS layer identifies that it is an ioctl() call, and transfers the call to the V4L layer.
3. The V4L layer searches for registered tuner drivers, discovers the driver registered in pvcl.c, and gives control to pvcl_ioctl().
4. pvcl_ioctl() traverses the list of IOCTLs that it can handle. It identifies that a "switch on capture" request has been received. Since switching on capture is not implemented by the GD-5446 chip, but by the VPX chip, pvcl_ioctl() translates the command to "VPROC_START_CAPTURE", and transfers control to do_client_ioctl().
5. do_client_ioctl() searches for clients on the GD-5446 I2C bus, and calls their respective command() routines one by one.
6. As mentioned before, two clients are typically attached to the I2C bus. They are the VPX-3225D and the tuner module. For details about the tuner module IOCTL handling, have a look at the function tuner_command() within drivers/media/video/tuner.c. Since VPROC_START_CAPTURE has no meaning in tuner.c, it ignores it. do_client_ioctl() parses through the rest of the list and calls vpx_command() in vpx322xd.c.
7. In vpx322xd.c, the function vpx_command() gets control. It goes through a switch() statement similar to that in pvcl_ioctl() in pvcl.c and identifies that capture is to be switched on. It then calls vpx_start_capture(), which does all the hardware conversation with the VPX-3225D chip, and switches on capture. Now the VPX is vigorously capturing data to the GD-5446, via the VPort.
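The fan-out in the last few steps can be mimicked in user space. The sketch below is not the kernel's code: the structures are cut-down stand-ins for the kernel's adapter, client and driver structures, the numeric value of VPROC_START_CAPTURE is invented, and the return value convention (1 for "handled") is an assumption made so that the result can be checked.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CLIENTS 4
#define VPROC_START_CAPTURE 1   /* illustrative value only */

struct client;

struct driver {
    const char *name;
    int (*command)(struct client *c, unsigned int cmd, void *arg);
};

struct client {
    struct driver *driver;
    int capturing;              /* stand-in for real chip state */
};

struct adapter {
    struct client *clients[MAX_CLIENTS];
};

/* Tuner stand-in: the command means nothing to it, so it is ignored. */
int tuner_command(struct client *c, unsigned int cmd, void *arg)
{
    (void)c; (void)cmd; (void)arg;
    return 0;
}

/* Video processor stand-in: recognises the command and acts on it. */
int vpx_command(struct client *c, unsigned int cmd, void *arg)
{
    (void)arg;
    switch (cmd) {
    case VPROC_START_CAPTURE:
        c->capturing = 1;
        return 1;
    default:
        return 0;
    }
}

/* Offer the command to every client on the adapter, one by one.
 * Returns how many clients handled it. */
int do_client_ioctl(struct adapter *a, unsigned int cmd, void *arg)
{
    int handled = 0;

    assert(a != NULL);
    for (size_t n = 0; n < MAX_CLIENTS; n++) {
        struct client *c = a->clients[n];
        if (c && c->driver && c->driver->command)
            handled += c->driver->command(c, cmd, arg);
    }
    return handled;
}
```

With a tuner client and a vpx client hung on one adapter, do_client_ioctl() with VPROC_START_CAPTURE reaches both command routines, but only the vpx client changes state. The main driver never needs to know which chip understands which command.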

vpx_start_capture() and friends, are little functions which do small, but specific jobs. Like the gd_xxxx_() series of calls within the pvcl.c file, they make use of lower level functions for hardware access. In this case, instead of gd_write_xr()/gd_read_xr(), vpx_read_byte()/vpx_write_byte() are used. Those functions further depend on lower level functions provided by the i2c core layer, like i2c_smbus_read_byte_data() . These functions take care of the exact I2C handshake details for talking to the VPX chip over the I2C bus.

Thus ends the section.

6 The PCI bus

The PCI bus is the most common bus used in today's computers. (For really innocent novices: A bus is any piece of wire, or set of wires, to which more than one peripheral is connected at the same time, and which therefore has to be treated as a shared resource.) Apart from speed (33MHz upwards), the PCI bus is a plug and play bus. This has nothing to do with the wires, of course. The wires on a PCI bus are as brain dead as the wires in my table lamp. The difference is that any device connected to the PCI bus must behave in accordance with a set of rules called the PCI specification. Among other things, PCI devices, i.e. devices which are connected to the PCI bus, need to give information to the Bus Master about the Name, Type and number of functional Chips, their preferred IRQ lines, DMA capability etc. This helps the bus master share the resources of the bus effectively. The bus master in this case would be a proxy of the system processor, usually a "steering device" or a "bridge device". We won't go into the details here. What interests us as tuner card device driver writers are three things:

Device Identification, DMA, IRQ line allocation.

6.1 Device Identification

Linux provides a set of functions for accessing information about PCI devices. These functions talk with the PCI hardware, and have already obtained details about all cards which are connected. What concerns us is identifying the Chip on board. pci_find_device() fills in a structure, with the name of the card, the Vendor ID of the card, and the Chip ID of the chip on board. These IDs are available in linux/pci_ids.h. They are available there, because each of the chip manufacturers has registered their devices in a central, public database beforehand.

In the case of the Pixelview card, the task of identifying the GD-5446 is very simple. Look for the PCI_VENDOR_ID_CIRRUS and PCI_DEVICE_ID_CIRRUS_5446. If both fields are available in the card database, then the card is indeed controlled by the CL-GD5446. Look for the probing function in i2c_clgd54xx_find_card() in pvcl.c, for info about how this is done.
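Stripped of the kernel plumbing, the identification step is a comparison of two 16-bit numbers. The sketch below runs over a plain table instead of the kernel's PCI device list, so the structure and function names are invented. The two ID values are the ones I believe linux/pci_ids.h assigns to these constants, so treat them as an assumption and check them against your kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values believed to match linux/pci_ids.h; verify against your tree. */
#define PCI_VENDOR_ID_CIRRUS       0x1013
#define PCI_DEVICE_ID_CIRRUS_5446  0x00b8

struct pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* Index of the first GD-5446 in the table, or -1 if there is none. */
int find_gd5446(const struct pci_id *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (devs[i].vendor == PCI_VENDOR_ID_CIRRUS &&
            devs[i].device == PCI_DEVICE_ID_CIRRUS_5446)
            return (int)i;
    }
    return -1;
}
```

Both fields must match: the vendor ID alone would also match every other Cirrus Logic chip.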

6.2 DMA

Like any other bus, the PCI system allows transfer of data only between one master, and one slave. The master initiates the conversation, and the slave responds with data, or requests. On the PCI bus, the master, is usually a proxy of the system processor. This chip, behaves like the system processor itself, bossing all other chips into submission. Effectively, system devices see the processor in the proxy, and obey its instructions. But the processor is a very busy chip, and cannot devote itself to transferring data between PCI chips without giving up on performance. So the bus is designed to occasionally allow other slave chips to become masters, under the delegation of the system processor. In such cases, the new master of the bus has control over the PCI bus, and can initiate any type of transfer it likes. Of course, this mastership is on a lease of time, and the moment the processor desires so, the upstart has its rights revoked and is put in its place, and the processor takes over.

Let's take the case of a tuner card, which desires to transfer data to the VGA card. The tuner card chip indicates its desire to do so by raising a bus request on a special line called REQ#, on the PCI bus. The PCI controller chip, in consultation with the processor (via other lines external to the PCI bus), grants the request on the matching GNT# line, or revokes it. Once the request is granted, the tuner card can address the VGA chip, just like the processor would, and it could initiate a transfer of data over the PCI bus, with the system processor happily going about other jobs. If ever the processor needed to access the VGA chip as well, it would only need to revoke the tuner card's bus rights, and write to the VGA chip, as usual.

In older buses like the ISA bus, a dedicated chip called the DMA controller was used for delegated bus mastering, with devices signalling it over DREQ lines. It was the responsibility of the system kernel to allocate resources on the DMA controller itself, and thus the advantages of DMA were limited to a small number of devices, on such busses. In the case of PCI, any chip may become bus master, and the DMA controller would be placed on the individual card itself. This would make contention for the request line the only bottleneck. To alleviate the problem, every PCI slot has its own REQ#/GNT# pair of lines, with the PCI bus controller arbitrating between simultaneous requests on multiple lines.

6.3 IRQ lines

Devices need to indicate to the processor, events which are not predictable beforehand. Such events are called asynchronous events. Examples of Asynchronous events are: The arrival of a packet of data on a network card, the opening of the CD-ROM tray, the completion of filling a frame of video data by a video processor, etc.

Asynchronous events are indicated by devices by using a line on the PCI bus called the Interrupt Request (IRQ) line. IRQ lines are scarce resources on a bus, and the PCI bus is no exception. However, IRQ lines may be shared between devices, provided there is some means to discern between multiple parties sharing the same line. The code responsible for handling IRQ requests is called the Interrupt Service Routine (ISR). If an IRQ is indicated by some chip, the processor immediately switches to the ISR. The ISR then reads registers on each suspect device, until it finds which device on the shared line was the culprit for raising the IRQ, and does whatever needs to be done in servicing that request. Servicing might include tasks like saving the newly arrived packet, flushing system buffers, or resetting the pointers within a video processor. Each of these tasks is device specific, and hence, the device driver must contain the ISR, which is registered with the system kernel, so that it may be called at interrupt time.
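The "ask every suspect" logic of a shared IRQ line can be modelled in a few lines. This is a user space sketch with invented names. The irq_pending field stands in for a status register that a real ISR would read from the hardware.

```c
#include <assert.h>
#include <stddef.h>

struct device {
    int irq_pending;   /* stand-in for a status register bit */
    int serviced;      /* number of interrupts serviced so far */
};

/* Per-device handler: returns 1 if the interrupt was ours, else 0. */
int device_isr(struct device *dev)
{
    assert(dev != NULL);
    if (!dev->irq_pending)
        return 0;              /* not the culprit */
    dev->irq_pending = 0;      /* acknowledge the interrupt */
    dev->serviced++;           /* device specific work would go here */
    return 1;
}

/* What happens when a shared line fires: every handler registered
 * on that line gets a chance to claim the interrupt. */
int dispatch_shared_irq(struct device *devs, size_t n)
{
    int handled = 0;

    for (size_t i = 0; i < n; i++)
        handled += device_isr(&devs[i]);
    return handled;
}
```

A handler that finds nothing pending must return quickly and leave its device untouched, since it will be called for every interrupt raised by its neighbours on the line.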

7 Writing your own driver.

Nobody writes code from scratch. The very few who do, have very specific reasons for doing so, and even then, they rely on code templates, or ideas borrowed from their own or others' code. So if you are a budding device driver writer, the best way to start would be to read through device driver code which is already available in the Linux kernel. Don't worry, nobody will accuse you of plagiarism - the GNU General Public License (GPL), under which the Linux kernel is released, actually encourages code re-use. As long as you don't make verbatim copies of somebody else's code and change the author's name to your own, you're free to use the kernel code. Any new part of existing code may be claimed by you. Of course, remember that any GPL code which is altered, although the changes may be copyrighted to you, may only be released again under the terms of the GPL.

Appendix A - The Pixelview Combo TV Plus device driver code. (alpha).

An unofficial patch of the author's Linux Driver for the Pixelview Combo TV plus TV tuner card, is available for download at http://cherry.freeshell.org/downloads/

Bibliography

1: The Linux kernel sources
2: The Cirrus Logic 5446 Technical Reference Manual
3: The VPX 3225D Technical Reference Manual
4: video4linux-list@redhat.com, the video4linux-list email archives.

About this document ...

Tuner Cards - Learning by looking.

The translation was initiated by Cherry George Mathew on 2003-05-20

Footnotes

¹ All references to the "Linux kernel" refer to kernel version 2.4 and above.
² All paths are w.r.t. the Linux source root. For example, if the Linux source root is /usr/src/linux, then Documentation/DocBook/videobook.tmpl will be at /usr/src/linux/Documentation/DocBook/videobook.tmpl
³ All include paths are w.r.t. the Linux source default include directory.
⁴ In monolithic kernels like Linux, all applications run in user space. The kernel and its device drivers run in kernel space.

Cherry is a graduate in Electronics Engineering, who lives in the Indian City of Bangalore. His favourite hobbies are Reading novels, playing the Guitar, and Hacking Code.

Total memory required for VGA display	=	640 x 480 x 2 bytes
	=	614400 bytes
	=	0.59 MB

Unused Memory at the end of the Frame buffer	=	2MB - 0.59MB
	=	1.41 MB