How do I translate single objects in OpenGL 3.x?

I have a bit of experience writing OpenGL 2 applications and want to learn using OpenGL 3. For this I've bought the Addison-Wesley "Red book" and "Orange book" (GLSL), which describe the deprecation of the fixed functionality and the new programmable pipeline (shaders). But what I can't get a grasp of is how to construct a scene with multiple objects without using the deprecated translate*, rotate* and scale* functions.
What I used to do in OGL2 was to "move about" in 3D space using the translate and rotate functions, and create the objects in local coordinates where I wanted them using glBegin ... glEnd. In OGL3 these functions are all deprecated and, as I understand it, replaced by shaders. But I can't call a shader program for each and every object I make, can I? Wouldn't this affect all the other objects too?
I'm not sure if I've explained my problem satisfactorily, but the core of it is how to program a scene with multiple objects defined in local coordinates in OpenGL 3.1. All the beginner tutorials I've found use only a single object and don't address this problem.
Edit: Imagine you want two spinning cubes. It would be a pain manually modifying each vertex coordinate, and you can't simply modify the modelview-matrix, because that would rather spin the camera around two static cubes...

Let's start with the basics.
Usually, you want to transform your local triangle vertices through the following steps:
local-space coords -> world-space coords -> view-space coords -> clip-space coords
In standard GL, the first two transforms are done through GL_MODELVIEW_MATRIX, and the third through GL_PROJECTION_MATRIX.
These model-view transformations, for the many interesting transforms that we usually want to apply (say, translate, scale and rotate, for example), happen to be expressible as vector-matrix multiplication when we represent vertices in homogeneous coordinates. Typically, the vertex V = (x, y, z) is represented in this system as (x, y, z, 1).
Ok. Say we want to transform a vertex V_local through a translation, then a rotation, then a translation. Each transform can be represented as a matrix*, let's call them T1, R1, T2.
We want to apply the transform to each vertex: V_view = V_local * T1 * R1 * T2. Matrix multiplication being associative, we can compute once and for all M = T1 * R1 * T2.
That way, we only need to pass down M to the vertex program, and compute V_view = V_local * M. In the end, a typical vertex shader multiplies the vertex position by a single matrix. All the work to compute that one matrix is how you move your object from local space to the clip space.
OK... I glossed over a number of important details.
First, what I described so far only really covers the transformation we usually want to do up to the view space, not the clip space. However, the hardware expects the output position of the vertex shader to be represented in that special clip-space. It's hard to explain clip-space coordinates without significant math, so I will leave that out, but the important bit is that the transformation that brings the vertices to that clip-space can usually be expressed as the same type of matrix multiplication. This is what the old gluPerspective, glFrustum and glOrtho compute.
Second, this is what you apply to vertex positions. The math to transform normals is somewhat different. That's because you want the normal to stay perpendicular to the surface after transformation (for reference, it requires a multiplication by the inverse-transpose of the model-view in the general case, but that can be simplified in many cases).
Third, you never send 4-D coordinates to the vertex shader; in general you pass 3-D ones. OpenGL will expand those 3-D coordinates (or 2-D, for that matter) to 4-D ones, adding 1 as the w coordinate, so that the vertex shader does not have to supply the extra component itself.
So... to put all that back together: for each object, you need to compute one of those magic M matrices based on all the transforms that you want to apply to the object. Inside the shader, you then multiply each vertex position by that matrix and write the result to the vertex shader's position output. Typical code looks more or less like this (using the old nomenclature):
uniform mat4 MVP;  // the composed matrix, set once per object by the application
gl_Position = MVP * gl_Vertex;
* the actual matrices can be found on the web, notably on the man pages for each of those functions: rotate, translate, scale, perspective, ortho
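To make that concrete for the two-spinning-cubes case from the question, here is a rough sketch of the application side, assuming GLM for the matrix math, an already-created GL context, a compiled shader program whose vertex shader declares uniform mat4 MVP, and a VAO containing the cube's 36 vertices (the names program, cubeVAO and angle are placeholders). Note that GLM works with column vectors, so the composition is written projection * view * model, the mirror image of the row-vector notation used above.
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <glm/gtc/type_ptr.hpp>

// Shared camera: one view and one projection matrix for the whole scene.
glm::mat4 view = glm::lookAt(glm::vec3(0.0f, 0.0f, 5.0f),   // eye
                             glm::vec3(0.0f),               // look-at target
                             glm::vec3(0.0f, 1.0f, 0.0f));  // up
glm::mat4 projection = glm::perspective(glm::radians(45.0f), 4.0f / 3.0f, 0.1f, 100.0f);

GLint mvpLoc = glGetUniformLocation(program, "MVP");
glUseProgram(program);
glBindVertexArray(cubeVAO);

// One model matrix per object: the same vertex data, drawn twice in different places.
for (int i = 0; i < 2; ++i) {
    glm::mat4 model = glm::translate(glm::mat4(1.0f), glm::vec3(i * 2.0f - 1.0f, 0.0f, 0.0f));
    model = glm::rotate(model, angle, glm::vec3(0.0f, 1.0f, 0.0f)); // per-frame spin

    glm::mat4 MVP = projection * view * model;                      // composed once on the CPU
    glUniformMatrix4fv(mvpLoc, 1, GL_FALSE, glm::value_ptr(MVP));
    glDrawArrays(GL_TRIANGLES, 0, 36);
}
The point is that only the small MVP uniform changes between draw calls; the vertex data and the shader stay exactly the same for every object.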

Those functions are deprecated, but they are technically still functional (in a compatibility context) and will compile, so you can certainly still use glTranslatef(...) and the like.
HOWEVER, this tutorial has a good explanation of how the new shaders and so on work, AND for multiple objects in space.
You can create x arrays of vertices, bind them into x VAO objects, and render the scene from there with shaders etc. ... it's easier for you to just read it - it is a really good read for grasping the new concepts.
Also, the OpenGL 'Red Book' as it is called has a new release - The Official Guide to Learning OpenGL, Versions 3.0 and 3.1. It includes 'Discussion of OpenGL’s deprecation mechanism and how to verify your programs for future versions of OpenGL'.
I hope that's of some assistance!

Related

Rotating, scaling and translating 2d points in GLM

I am currently trying to create a 2d game and I am a bit shocked that I can not find any transformations like rotation, scale, translate for vec2.
For example as far as I know rotate is only available for a mat4x4 and mat4x4 can only be multiplied by a vec4.
Is there any reason for that?
I want to calculate my vertices on the CPU
std::vector<glm::mat2> m;
Gather all matrices, generate the vertices, fill the GPU buffer and then draw everything in one draw call.
How would I do this in glm? Just use a mat4 and then ignore the z and w component?
One thing that you should understand is that you can't just ignore the z component, let alone the w component. It depends on how you want to view the problem. The problem is that you want to use a mat2 and a vec2 for 2D, but alas it's not as simple as you may think. You are using OpenGL, and presumably with shaders too. You need to use at least a glm::mat3, or better, a mat4. The reason for this is that although you want everything in 2D, it has to be done in terms of 3D: 2D is really just 3D with the z value fixed and the clip plane simply static, relative to the window size.
So, what I suggest is this:
For your model matrix, you have it as a glm::mat4 which contains all data, even if you don't intend to use it. Consistency is important here, especially for transformations.
Don't ignore the z and w components in glm::mat4; they are important because their values dictate where things end up on screen. OpenGL needs to know where along the z-axis a value is. And for matrix multiplication, homogeneous coordinates are required, so the w component is also important. In other words, you are effectively stuck with glm::mat4.
To get transformations for glm::mat4 you should
#include <glm/gtc/matrix_transform.hpp>
Treat your sprites separately; as in have a Sprite class, and group them outside that class. Don't do the grouping recursively as this leads to messy code. Generating vertices should be done on a per-Sprite basis; don't worry about optimisation because OpenGL takes care of that for you, let alone the compiler on the C++ side of things. You'll soon realise that by breaking down this problem you can have something like
std::vector<Sprite*> sprites;
for (const auto& sprite : sprites)
    sprite->init();
// etc...
for (const auto& sprite : sprites)
    sprite->render();
With regards to the shaders, you shouldn't really have them inside the Sprite class. Have a resource loader, and simply have each Sprite class retrieve the shader from that loader.
And most importantly: remember the order of transformations!
With regards to transformations of sprites, what you can do is have a glm::vec3 for sprite positions, setting the z-component to 0. You can then move your sprite by simply having a function which takes x and y values. Feed these values into the model matrix using glm::translate(..). With regards to rotation, you use glm::rotate() and simply have functions which take rotation angles. ALWAYS ROTATE ABOUT THE Z-AXIS, so it should look similar to this:
modelMatrix = glm::rotate(modelMatrix, glm::radians(angle), glm::vec3(0.f, 0.f, 1.f));
As for scaling, again have a suitable setScale() function that accepts two values for x and y, and set the z component to 1 to prevent the sprite from being scaled in z. Feed the values into glm::scale like so:
modelMatrix = glm::scale(modelMatrix, glm::vec3(scaleX, scaleY, 1.f));
Remember that storing raw matrices provides little benefit on its own; they're just numbers and don't indicate what you actually want to do with them. It's better for the Sprite class to encapsulate its matrix so that it's clear what it stands for.
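To illustrate that last point, here is a minimal sketch of such an encapsulating Sprite class, assuming GLM; the class and member names are invented for the example:
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

class Sprite {
public:
    void setPosition(float x, float y) { position = glm::vec3(x, y, 0.0f); }
    void setRotation(float degrees)    { angle = glm::radians(degrees); }
    void setScale(float x, float y)    { scale = glm::vec3(x, y, 1.0f); }   // z stays 1

    // Rebuild the model matrix in the usual translate * rotate * scale order.
    glm::mat4 modelMatrix() const {
        glm::mat4 m(1.0f);
        m = glm::translate(m, position);
        m = glm::rotate(m, angle, glm::vec3(0.0f, 0.0f, 1.0f));  // rotate about the z-axis
        m = glm::scale(m, scale);
        return m;
    }

private:
    glm::vec3 position{0.0f};
    float     angle = 0.0f;   // radians
    glm::vec3 scale{1.0f};
};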
You're welcome!
It seems that I haven't looked hard enough:
GLM_GTX_matrix_transform_2d from glm/gtx/matrix_transform_2d.hpp.
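For completeness, a small sketch of what that extension gives you, staying entirely in 2D with mat3/vec2 (note that recent GLM releases may require defining GLM_ENABLE_EXPERIMENTAL before including GTX headers):
#define GLM_ENABLE_EXPERIMENTAL            // needed by newer GLM for GTX extensions
#include <glm/glm.hpp>
#include <glm/gtx/matrix_transform_2d.hpp>

glm::mat3 m(1.0f);                                   // 3x3 homogeneous matrix for 2D
m = glm::translate(m, glm::vec2(10.0f, 20.0f));
m = glm::rotate(m, glm::radians(45.0f));
m = glm::scale(m, glm::vec2(2.0f, 2.0f));

glm::vec2 p = glm::vec2(m * glm::vec3(1.0f, 0.0f, 1.0f));  // homogeneous coordinate = 1 for points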

Confused about OpenGL transformations

In opengl there is one world coordinate system with origin (0,0,0).
What confuses me is what all the transformations like glTranslate, glRotate, etc. do? Do they move
objects in world coordinates, or do they move the camera? As you know, the same movement can be achieved by either moving objects or camera.
I am guessing that glTranslate, glRotate, change objects, and gluLookAt changes the camera?
In opengl there is one world coordinate system with origin (0,0,0).
Well, technically no.
What confuses me is what all the transformations like glTranslate, glRotate, etc. do? Do they move objects in world coordinates, or do they move the camera?
Neither. OpenGL doesn't know objects, OpenGL doesn't know a camera, OpenGL doesn't know a world. All that OpenGL cares about are primitives, points, lines or triangles, per vertex attributes, normalized device coordinates (NDC) and a viewport, to which the NDC are mapped to.
When you tell OpenGL to draw a primitive, each vertex is processed according to its attributes. The position is one of the attributes and usually a vector with 1 to 4 scalar elements within local "object" coordinate system. The task at hand is to somehow transform the local vertex position attribute into a position on the viewport. In modern OpenGL this happens within a small program, running on the GPU, called a vertex shader. The vertex shader may process the position in an arbitrary way. But the usual approach is by applying a number of nonsingular, linear transformations.
Such transformations can be expressed in terms of homogeneous transformation matrices. For a 3-dimensional vector, the homogeneous representation is a vector with 4 elements, where the 4th element is 1.
In computer graphics a 3-fold transformation pipeline has become sort of the standard way of doing things. First the object local coordinates are transformed into coordinates relative to the virtual "eye", hence into eye space. In OpenGL this transformation used to be called the modelview transformation. With the vertex positions in eye space, several calculations, like illumination, can be expressed in a generalized way, hence those calculations happen in eye space. Next the eye space coordinates are transformed into the so-called clip space. This transformation maps some volume in eye space to a specific volume with certain boundaries, to which the geometry is clipped. Since this transformation effectively applies a projection, in OpenGL this used to be called the projection transformation.
After clip space the positions get "normalized" by their homogenous component, yielding normalized device coordinates, which are then plainly mapped to the viewport.
To recapitulate:
A vertex position is transformed from local to clip space by
vpos_eye = MV · vpos_local
eyespace_calculations(vpos_eye);
vpos_clip = P · vpos_eye
·: matrix times column vector
Then to reach NDC
vpos_ndc = vpos_clip / vpos_clip.w
and finally to the viewport (NDC coordinates are in the range [-1, 1]):
vpos_viewport = (vpos_ndc + (1,1,1,1)) * (viewport.width, viewport.height) / 2 + (viewport.x, viewport.y)
*: vector component wise multiplication
The OpenGL functions glRotate, glTranslate, glScale, glMatrixMode merely manipulate the transformation matrices. OpenGL used to have four transformation matrices:
modelview
projection
texture
color
Which of them the matrix manipulation functions act on can be set using glMatrixMode. Each of the matrix-manipulating functions composes a new matrix by multiplying the transformation it describes onto the selected matrix, thereby replacing it. The function glLoadIdentity replaces the current matrix with the identity, glLoadMatrix replaces it with a user-defined matrix, and glMultMatrix multiplies a user-defined matrix onto it.
So how does the modelview matrix then emulate both object placement and a camera? Well, as you already stated:
As you know, the same movement can be achieved by either moving objects or camera.
You cannot really discern between them. The usual approach is to split the object-local-to-eye transformation into two steps:
Object to world – OpenGL calls this the "model transform"
World to eye – OpenGL calls this the "view transform"
Together they form the model-view, in fixed function OpenGL described by the modelview matrix. Now since the order of transformations is
local to world (Model matrix): vpos_world = M · vpos_local
world to eye (View matrix): vpos_eye = V · vpos_world
we can substitute by
vpos_eye = V · ( M · vpos_local ) = V · M · vpos_local
Defining the ModelView matrix MV := V · M, this becomes
vpos_eye = MV · vpos_local
Thus you can see that what's V and what's M of the compound matrix MV is only determined by the order of operations in which you multiply onto the modelview matrix, and at which step you decide to "call it the model transform from here on".
I.e. right after a
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
the view is defined. But at some point you'll start applying model transformations and everything after is model.
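In fixed-function code that convention typically looked something like the following rough sketch (drawObject and angle are placeholders; everything multiplied onto the modelview after the gluLookAt call is, by convention, "model"):
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();

// "View" part: place the virtual camera.
gluLookAt(0.0, 2.0, 5.0,    // eye
          0.0, 0.0, 0.0,    // center
          0.0, 1.0, 0.0);   // up

// "Model" part: everything from here on places objects in the world.
glPushMatrix();
glTranslatef(1.0f, 0.0f, 0.0f);
glRotatef(angle, 0.0f, 1.0f, 0.0f);
drawObject();
glPopMatrix();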
Note that in modern OpenGL all the matrix manipulation functions have been removed. OpenGL's matrix stack never was feature complete and no serious application actually used it. Most programs just glLoadMatrix-ed their self-calculated matrices and didn't bother with OpenGL's built-in matrix manipulation routines.
And ever since shaders were introduced, the whole OpenGL matrix stack got awkward to use, to say it nicely.
The verdict: If you plan on using OpenGL the modern way, don't bother with the built-in functions. But keep in mind what I wrote, because what your shaders do will be very similar to what OpenGL's fixed function pipeline did.
OpenGL is a low-level API; there are no higher-level concepts like an "object" or a "camera" in a "scene", so there are only two matrix modes: MODELVIEW (a multiplication of the "camera" matrix by the "object" transformation) and PROJECTION (the projective transformation from world space to post-perspective space).
Distinction between "Model" and "View" (object and camera) matrices is up to you. glRotate/glTranslate functions just multiply the currently selected matrix by the given one (without even distinguishing between ModelView and Projection).
Those functions multiply (transform) the current matrix set by glMatrixMode() so it depends on the matrix you're working on. OpenGL has 4 different types of matrices; GL_MODELVIEW, GL_PROJECTION, GL_TEXTURE, and GL_COLOR, any one of those functions can change any of those matrices. So, basically, you don't transform objects you just manipulate different matrices to "fake" that effect.
Note that gluLookAt() is just a convenience function equivalent to a translation followed by some rotations; there's nothing special about it.
All transformations are transformations on objects. Even gluLookAt is just a transformation to transform the objects as if the camera was where you tell it to be. Technically they are transformations on the vertices, but that's just semantics.
That's true: glTranslate and glRotate change the object coordinates before rendering, and gluLookAt changes the camera coordinates.

Why would it be beneficial to have a separate projection matrix, yet combine model and view matrix?

When you are learning 3D programming, you are taught that it's easiest to think in terms of 3 transformation matrices:
The Model Matrix. This matrix is individual to every single model and it rotates and scales the object as desired and finally moves it to its final position within your 3D world. "The Model Matrix transforms model coordinates to world coordinates".
The View Matrix. This matrix is usually the same for a large number of objects (if not for all of them) and it rotates and moves all objects according to the current "camera position". If you imagine that the 3D scene is filmed by a camera and what is rendered on the screen are the images that were captured by this camera, the location of the camera and its viewing direction define which parts of the scene are visible and how the objects appear on the captured image. There is little reason to change the view matrix while rendering a single frame, but such reasons do in fact exist (e.g. by rendering the scene twice and changing the view matrix in between, you can create a very simple, yet impressive mirror within your scene). Usually the view matrix changes only once between two frames being drawn. "The View Matrix transforms world coordinates to eye coordinates".
The Projection Matrix. The projection matrix decides how those 3D coordinates are mapped to 2D coordinates, e.g. whether a perspective is applied to them (objects get smaller the farther they are away from the viewer) or not (orthographic projection). The projection matrix hardly ever changes at all. It may have to change if you are rendering into a window and the window size has changed, or if you are rendering full screen and the resolution has changed, but only if the new window size/screen resolution has a different aspect ratio than before. There are some fancy effects for which you may want to change this matrix, but in most cases it's pretty much constant for the whole life of your program. "The Projection Matrix transforms eye coordinates to screen coordinates".
This makes all a lot of sense to me. Of course one could always combine all three matrices into a single one, since multiplying a vector first by matrix A and then by matrix B is the same as multiplying the vector by matrix C, where C = B * A.
Now if you look at the classical OpenGL (OpenGL 1.x/2.x), OpenGL knows a projection matrix. Yet OpenGL does not offer a model or a view matrix, it only offers a combined model-view matrix. Why? This design forces you to permanently save and restore the "view matrix" since it will get "destroyed" by model transformations applied to it. Why aren't there three separate matrices?
If you look at the new OpenGL versions (OpenGL 3.x/4.x) and you don't use the classical render pipeline but customize everything with shaders (GLSL), there are no built-in matrices available any longer; you have to define your own matrices. Still, most people keep the old concept of a projection matrix and a model-view matrix. Why would you do that? Why not use either three matrices, which means you don't have to permanently save and restore the model-view matrix, or a single combined model-view-projection (MVP) matrix, which saves you a matrix multiplication in your vertex shader for every single vertex rendered (after all, such a multiplication doesn't come for free either)?
So to summarize my question: What advantage does a combined model-view matrix together with a separate projection matrix have over three separate matrices or a single MVP matrix?
Look at it practically. First, the fewer matrices you send, the fewer matrices you have to multiply with positions/normals/etc. And therefore, the faster your vertex shaders.
So point 1: fewer matrices is better.
However, there are certain things you probably need to do. Unless you're doing 2D rendering or some simple 3D demo-applications, you are going to need to do lighting. This typically means that you're going to need to transform positions and normals into either world or camera (view) space, then do some lighting operations on them (either in the vertex shader or the fragment shader).
You can't do that if you only go from model space to projection space. You cannot do lighting in post-projection space, because that space is non-linear. The math becomes much more complicated.
So, point 2: You need at least one stop between model and projection.
So we need at least 2 matrices. Why model-to-camera rather than model-to-world? Because working in world space in shaders is a bad idea. You can encounter numerical precision problems related to translations that are distant from the origin. Whereas, if you worked in camera space, you wouldn't encounter those problems, because nothing is too far from the camera (and if it is, it should probably be outside the far depth plane).
Therefore: we use camera space as the intermediate space for lighting.
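As a rough sketch of what that looks like on the application side (assuming GLM, an existing shader program, and model/view/projection matrices already computed; the uniform names u_ModelView, u_Projection and u_NormalMatrix are just examples, not a fixed API):
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>

glm::mat4 modelView    = view * model;                                       // composed once per object on the CPU
glm::mat3 normalMatrix = glm::transpose(glm::inverse(glm::mat3(modelView))); // for transforming normals into eye space

glUseProgram(program);
glUniformMatrix4fv(glGetUniformLocation(program, "u_ModelView"),    1, GL_FALSE, glm::value_ptr(modelView));
glUniformMatrix4fv(glGetUniformLocation(program, "u_Projection"),   1, GL_FALSE, glm::value_ptr(projection));
glUniformMatrix3fv(glGetUniformLocation(program, "u_NormalMatrix"), 1, GL_FALSE, glm::value_ptr(normalMatrix));
// The vertex shader computes the eye-space position as u_ModelView * position (used for lighting)
// and the clip-space position as u_Projection * that eye-space position.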
In most cases your shader will need the geometry in world or eye coordinates for shading, so you have to separate the projection matrix from the model and view matrices.
Making your shader multiply the geometry by two matrices hurts performance. Assuming each model has thousands (or more) of vertices, it is more efficient to compute a model-view matrix once on the CPU and let the shader do one less matrix-vector multiplication.
I have just solved a z-fighting problem by separating the projection matrix. There is no visible increase in GPU load. The two following screenshots show the two results; pay attention to the green and white layers fighting.

OpenGL define vertex position in pixels

I've been writing a 2D basic game engine in OpenGL/C++ and learning everything as I go along. I'm still rather confused about defining vertices and their "position". That is, I'm still trying to understand the vertex-to-pixels conversion mechanism of OpenGL. Can it be explained briefly or can someone point to an article or something that'll explain this. Thanks!
This is rather basic knowledge that your favourite OpenGL learning resource should teach you as one of the first things. But anyway the standard OpenGL pipeline is as follows:
The vertex position is transformed from object-space (local to some object) into world-space (in respect to some global coordinate system). This transformation specifies where your object (to which the vertices belong) is located in the world
Now the world-space position is transformed into camera/view-space. This transformation is determined by the position and orientation of the virtual camera by which you see the scene. In OpenGL these two transformations are actually combined into one, the modelview matrix, which directly transforms your vertices from object-space to view-space.
Next the projection transformation is applied. Whereas the modelview transformation should consist only of affine transformations (rotation, translation, scaling), the projection transformation can be a perspective one, which basically distorts the objects to realize a real perspective view (with farther away objects being smaller). But in your case of a 2D view it will probably be an orthographic projection, that does nothing more than a translation and scaling. This transformation is represented in OpenGL by the projection matrix.
After these 3 (or 2) transformations (and then following perspective division by the w component, which actually realizes the perspective distortion, if any) what you have are normalized device coordinates. This means after these transformations the coordinates of the visible objects should be in the range [-1,1]. Everything outside this range is clipped away.
In a final step the viewport transformation is applied, and the coordinates are transformed from the [-1,1] range into the [0,w]x[0,h]x[0,1] cube (assuming a glViewport(0, 0, w, h) call), which gives the vertex's final position in the framebuffer and therefore its pixel coordinates.
When using a vertex shader, steps 1 to 3 are actually done in the shader and can therefore be done in any way you like, but usually one conforms to this standard modelview -> projection pipeline, too.
The main thing to keep in mind is, that after the modelview and projection transforms every vertex with coordinates outside the [-1,1] range will be clipped away. So the [-1,1]-box determines your visible scene after these two transformations.
So from your question I assume you want to use a 2D coordinate system with units of pixels for your vertex coordinates and transformations? In this case this is best done by using glOrtho(0.0, w, 0.0, h, -1.0, 1.0) with w and h being the dimensions of your viewport. This basically counters the viewport transformation and therefore transforms your vertices from the [0,w]x[0,h]x[-1,1]-box into the [-1,1]-box, which the viewport transformation then transforms back to the [0,w]x[0,h]x[0,1]-box.
These have been quite general explanations, without mentioning that the actual transformations are done by matrix-vector multiplications and without talking about homogeneous coordinates, but they should have explained the essentials. This documentation of gluProject might also give you some insight, as it actually models the transformation pipeline for a single vertex. But in this documentation they actually forgot to mention the division by the w component (v" = v' / v'(3)) after the v' = P x M x v step.
EDIT: Don't forget to look at the first link in epatel's answer, which explains the transformation pipeline a bit more practical and detailed.
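If it helps, here is a rough gluProject-style helper that reproduces that pipeline on the CPU with GLM (matrices and viewport values are assumed inputs; clipping is ignored):
#include <glm/glm.hpp>

// Maps an object-space position to window (pixel) coordinates:
// modelview -> projection -> perspective divide -> viewport transformation.
glm::vec3 projectToWindow(const glm::vec3& objPos,
                          const glm::mat4& modelView,
                          const glm::mat4& projection,
                          const glm::vec4& viewport)    // x, y, width, height
{
    glm::vec4 clip = projection * modelView * glm::vec4(objPos, 1.0f);
    glm::vec3 ndc  = glm::vec3(clip) / clip.w;           // perspective divide -> [-1, 1]

    return glm::vec3(viewport.x + (ndc.x * 0.5f + 0.5f) * viewport.z,  // window x in pixels
                     viewport.y + (ndc.y * 0.5f + 0.5f) * viewport.w,  // window y in pixels
                     ndc.z * 0.5f + 0.5f);                             // depth in the default [0, 1] range
}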
It is called transformation.
Vertices are specified in 3D coordinates, which are transformed into viewport coordinates (into your window view). This transformation can be set up in various ways. An orthographic transformation is the easiest to understand as a starter.
http://www.songho.ca/opengl/gl_transform.html
http://www.opengl.org/wiki/Vertex_Transformation
http://www.falloutsoftware.com/tutorials/gl/gl5.htm
Firstly, be aware that OpenGL does not use standard pixel coordinates. I mean that for a particular resolution, e.g. 800x600, you don't have horizontal coordinates in the range 0-799 or 1-800 stepped by one. You rather have coordinates ranging from -1 to 1, which are later sent to the graphics card's rasterizing unit and after that mapped to the particular resolution.
I omitted one step here: before all that you have a ModelViewProjection matrix (or a ViewProjection matrix in some simple cases) which projects the coordinates you use onto a projection plane. The default use of that is to implement a camera which converts the 3D space of the world (View for placing the camera in the right position, and Projection for casting 3D coordinates onto the screen plane; in ModelViewProjection there is also the step of placing a model in the right place in the world).
Another case (and you can use the Projection matrix this way to achieve what you want) is to use these matrices to convert one coordinate range to another.
And there's a trick you will need. You should read about the ModelViewProjection matrix and the camera in OpenGL if you want to get serious. But for now I will tell you that with a proper matrix you can map your own coordinate system (e.g. ranges 0-799 horizontally and 0-599 vertically) to the standardized -1:1 range. That way you will not notice that the underlying OpenGL API uses its own -1 to 1 system.
The easiest way to achieve this is glOrtho function. Here's the link to documentation:
http://www.opengl.org/sdk/docs/man/xhtml/glOrtho.xml
This is an example of proper usage:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(0, 800, 600, 0, 0, 1);
glMatrixMode(GL_MODELVIEW);
Now you can use your own modelview matrix, e.g. for translating (moving) objects, but don't touch your projection matrix. This code should be executed before any drawing commands. (In fact it can go right after initializing OpenGL if you won't use 3D graphics.)
And here's working example: http://nehe.gamedev.net/tutorial/2d_texture_font/18002/
Just draw your own shapes instead of drawing text. And there is another thing: glPushMatrix and glPopMatrix for the chosen matrix (in this example the projection matrix); you won't need those until you combine 3D with 2D rendering.
And you can still use the model matrix (e.g. for placing tiles somewhere in the world) and the view matrix (for example for zooming the view, or scrolling through the world; in this case your world can be larger than the resolution and you can crop the view with simple translations).
After looking at my answer I see it's a little chaotic, but if you're confused, just read about the Model, View and Projection matrices and try the example with glOrtho. If you're still confused, feel free to ask.
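And if you are on the modern, shader-based path instead of the fixed-function matrix stack, the same pixel-coordinate trick is usually done by building the orthographic matrix yourself, e.g. with GLM, and handing it to your shader (a rough sketch; program and the u_Projection uniform name are placeholders):
#include <glm/gtc/matrix_transform.hpp>
#include <glm/gtc/type_ptr.hpp>

// Top-left origin, 800x600 pixel coordinate system, matching the glOrtho call above.
glm::mat4 projection = glm::ortho(0.0f, 800.0f, 600.0f, 0.0f, 0.0f, 1.0f);

glUseProgram(program);
glUniformMatrix4fv(glGetUniformLocation(program, "u_Projection"), 1, GL_FALSE, glm::value_ptr(projection));
// The vertex shader then does something like: gl_Position = u_Projection * u_Model * position;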
MSDN has a great explanation. It may be in terms of DirectX but OpenGL is more-or-less the same.
Google for "opengl rendering pipeline". The first five articles all provide good expositions.
The key transition from vertices to pixels (actually, fragments, but you won't be too far off if you think "pixels") is in the rasterization stage, which occurs after all vertices have been transformed from world-coordinates to screen coordinates and clipped.

How to create 4-dimensional textures?

EDIT
glTexCoord4f allows you to specify four texture-coordinate components, but how do you create 4-dimensional textures?
The r component is used to specify either the depth in a 3D (volumetric) texture, or the layer in a 2D texture array.
The q component plays the same role as the vertex position's w element: it is used for the perspective divide in projective texture mapping.
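For reference, a rough sketch of how the targets that actually consume the r coordinate are created; both go through glTexImage3D (sizes and data pointers are placeholders):
// Volumetric texture: r selects the depth slice (normalized 0..1 across 'depth' slices).
GLuint volumeTex;
glGenTextures(1, &volumeTex);
glBindTexture(GL_TEXTURE_3D, volumeTex);
glTexImage3D(GL_TEXTURE_3D, 0, GL_RGBA8, 64, 64, 64, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, volumeData);    // 'volumeData' is a placeholder

// 2D texture array (GL 3.0 / EXT_texture_array): r selects the layer index.
GLuint arrayTex;
glGenTextures(1, &arrayTex);
glBindTexture(GL_TEXTURE_2D_ARRAY, arrayTex);
glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_RGBA8, 64, 64, 16, 0,
             GL_RGBA, GL_UNSIGNED_BYTE, layerData);     // 16 layers; 'layerData' is a placeholder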
There isn't any real "meaning" to them. If you were using shaders, you can assign any meaning you want to them.
For example, in our game: we used the xy for the actual texcoords, the z for which texture to sample from, and the w (4th component) to control the brightness.
There are such things as 3D and 4D textures, which do actually require 3 and 4 texcoords respectively; I suppose that could be the "meaning" of them.
The main reason they exist is that graphics cards work with 4-component vectors. When you pass a 2D texcoord in, it's still a 4-vector behind the scenes (the r and q components just take their default values). OpenGL provides you with the functionality to use them, on the off chance that you might need it.
The r component is the 3rd coordinate for GL_TEXTURE_3D (for rendering volumes). I am not familiar with any method that uses the 4th coordinate.
But it seems reasonable to have that available as all homogeneous OpenGL vectors have 4 components.
There is no such thing as a 4-dimensional texture. At least, not without extensions.
The reason glTexCoord4f exists is to allow passing 4 values. In the modern shader-based rendering world, "texture coordinates" don't have to be texture coordinates at all. They're just values the shader uses to do whatever it does.
Many of the texture lookup functions in shaders take more texture coordinate dimensions than the dimensionality of the actual texture. All texture functions for shadow textures take an extra coordinate, which represents the comparison value. All of the Proj texture functions take an extra coordinate, which represents the homogeneous coordinate for a homogeneous coordinate system.
In fixed-function land, 4D texture coordinates can be used for projective texturing of 3D textures. So the 4D coordinate is in a homogeneous coordinate system.