A hardware context is the current state of the card's hardware: the GPU registers and the command FIFO, among other things. Interested readers can refer to the Wikipedia page Context_switch, which explains hardware contexts and context switching in the case of a central processing unit. The GPU case is conceptually the same.

Context switching: why?

The point of context switching is to let several clients access the card at the same time without interfering with each other at the hardware level. Those clients are typically the X server and OpenGL applications: the 2D driver owns one hardware context, and every OpenGL client takes one more.

This means that running even a single OpenGL application requires working context switches, because there are two hardware contexts (one for X, one for the GL application). (Actually, it seems that darktama managed to run an OpenGL application alongside X with no context switch, but with an important restriction; see older TiNDC for the source.)

Nvidia cards provide several command channels, each associated with a given hardware context. This means that, until all channels are in use, each graphics client has its own channel and hardware context on the card.
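The one-channel-per-client idea can be sketched as a small allocator. This is only an illustration, not driver code: `NUM_CHANNELS`, `struct channel_table`, `channel_alloc` and `channel_free` are hypothetical names, and the channel count is an arbitrary small number chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_CHANNELS 4 /* hypothetical; chosen only to keep the example small */

/* Tracks which command channels are currently owned by a client. */
struct channel_table {
    bool in_use[NUM_CHANNELS];
};

static void channel_table_init(struct channel_table *t)
{
    memset(t, 0, sizeof(*t));
}

/* Give a new client its own channel (and thus its own hardware context).
 * Returns the channel index, or -1 once every channel is taken. */
static int channel_alloc(struct channel_table *t)
{
    for (int i = 0; i < NUM_CHANNELS; i++) {
        if (!t->in_use[i]) {
            t->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Release a channel when its client goes away. */
static void channel_free(struct channel_table *t, int channel)
{
    assert(channel >= 0 && channel < NUM_CHANNELS);
    t->in_use[channel] = false;
}
```

In this sketch the X server would typically take the first channel and each OpenGL client the next free one; what a real driver does once `channel_alloc` fails is outside the scope of the example.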

Context switching: how?

The swap between two contexts is not done the same way on all cards. The most recent Nvidia cards do it automatically but require a special initialisation, while the older ones need the driver to handle it itself.

Up to NV10, context switches are done by the driver and are interrupt-driven: whenever the card gets a command on a channel that is not currently active, it sends a PGRAPH interrupt (an interrupt raised by the card's graphics engine) to the driver, which must then save the card's registers for the outgoing context and restore those of the new one.
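The interrupt-driven scheme can be illustrated with a toy model in plain C. Nothing here touches real hardware or reflects the actual driver source: `struct fake_gpu`, `pgraph_context_switch_irq`, `submit_command` and the register and channel counts are all hypothetical, and the "interrupt" is simulated by a direct function call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_CHANNELS 4 /* hypothetical channel count */
#define NUM_REGS     8 /* hypothetical size of the per-context register state */

/* Saved copy of the graphics engine state for one channel. */
struct hw_context {
    uint32_t regs[NUM_REGS];
};

/* Toy model of the card: live registers plus the currently active channel. */
struct fake_gpu {
    uint32_t regs[NUM_REGS];
    int active_channel;
    struct hw_context saved[NUM_CHANNELS];
    unsigned switch_count;
};

static void gpu_init(struct fake_gpu *gpu)
{
    memset(gpu, 0, sizeof(*gpu));
}

/* Stand-in for the driver's PGRAPH interrupt handler:
 * save the outgoing context, then restore the incoming one. */
static void pgraph_context_switch_irq(struct fake_gpu *gpu, int new_channel)
{
    assert(new_channel >= 0 && new_channel < NUM_CHANNELS);
    memcpy(gpu->saved[gpu->active_channel].regs, gpu->regs, sizeof(gpu->regs));
    memcpy(gpu->regs, gpu->saved[new_channel].regs, sizeof(gpu->regs));
    gpu->active_channel = new_channel;
    gpu->switch_count++;
}

/* A command arriving on a channel, modelled as a single register write.
 * A command for an inactive channel first triggers the simulated interrupt. */
static void submit_command(struct fake_gpu *gpu, int channel, int reg, uint32_t value)
{
    assert(reg >= 0 && reg < NUM_REGS);
    if (channel != gpu->active_channel)
        pgraph_context_switch_irq(gpu, channel);
    gpu->regs[reg] = value;
}
```

Submitting commands alternately on channels 0 and 1 shows that each channel sees only its own register values, at the cost of one save and restore per switch.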

Starting from NV20, context switches are done in hardware by the GPU, and from NV40 onwards the cards require a special microcode, called a ctxprog.
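The generation thresholds above can be summarised as a small lookup. This is a sketch only: the enum and `ctx_switch_method_for` are hypothetical names, the generation is assumed to be encoded as a hexadecimal number (0x10 for NV10, 0x40 for NV40), and treating every generation below NV20 as driver-switched is an assumption based on the text, not on the driver source.

```c
#include <assert.h>

enum ctx_switch_method {
    CTX_SWITCH_SOFTWARE,         /* driver saves/restores on a PGRAPH interrupt */
    CTX_SWITCH_HARDWARE,         /* GPU switches contexts by itself */
    CTX_SWITCH_HARDWARE_CTXPROG  /* GPU switches by itself, but needs a ctxprog */
};

/* Map a card generation (assumed encoding: 0x10 = NV10, 0x20 = NV20, ...)
 * to the context switching method described in the text. */
static enum ctx_switch_method ctx_switch_method_for(int generation)
{
    assert(generation > 0);
    if (generation >= 0x40)
        return CTX_SWITCH_HARDWARE_CTXPROG;
    if (generation >= 0x20)
        return CTX_SWITCH_HARDWARE;
    return CTX_SWITCH_SOFTWARE;
}
```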

Context switching now works on all cards. For the cards that need a ctxprog (NV4x+), we used to copy the one sent by the proprietary driver, but a ctxprog generator has since been written for them.

The 'How To' for cards using ctxprogs is here: ?CtxInit

(Open question: What exactly is the conceptual relationship between the term "hardware context" and the formal abstract type GLContext?)

What more?

We could theoretically get away with only one hardware context by having the driver handle everything itself (conceptually, this means implementing the context switches in software). However, this approach was discarded by marcheu and airlied as inefficient (some technical arguments for this are available in older TiNDCs and IRC logs).

This piece of text is worth reading too. (Note that NV1 used an MMIO FIFO, whereas NV4+ uses a DMA FIFO.)