The Mathematics of Three-Dimensional Manipulations and Transformations.
-----------------------------------------------------------------------
Edition 1.2, by Trip V. Reece, June 1992.
e-mail: [email protected]
cis: Trip V. Reece 70620,2371

Permission is given to copy and distribute without alteration.

Table of Contents
-----------------
I.   Mapping 3d onto 2d
     A. Shrinkage of X,Y dimensions
     B. Vanishing Point
     C. Aspect Ratio
II.  Translation and Scaling
     A. The easy way
     B. Via Matrices
III. Rotation
     A. Rotating a 2d coordinate
     B. Rotation around another point
     C. Rotating a 3d coordinate
     D. The Matrix Solution
IV.  Matrices Unveiled
     A. The Affine Transformation Generalized
     B. Choices, Choices
V.   Z-Buffering
     A. Why?
     B. How?
VI.  Techniques
     A. Bresenham's Drawing algorithm

I.   Mapping Three-Dimensional Coordinates onto a Two-Dimensional Plane
Although the title for this entry appears stultifying, indeed formidable, it amounts to little more than identifying the relationships between distance and perspective. The most important concept in 3d display is the apparent shrinkage of objects as they recede. The other concept to keep in mind is the idea of a vanishing point. It would be easier to simply transcribe the formula here, but to properly implement a 3-dimensional system, a thorough knowledge and understanding of EACH aspect of the system is VITAL. Otherwise, how can you track down slow operations and optimize efficiently?

The goal of this mapping function is to be able to pass it the three-dimensional coordinates and have it return the two-dimensional coordinates for the monitor screen.

To understand how the mapping works, try to visualize a transparent, flat, rigid, plastic sheet with dots marked on it in a regular pattern, like so:

    .   .   .   .   .
    .   .   .   .   .
    .   .   .   .   .
    .   .   .   .   .

Now, imagine each of these dots to be 3cm from the dot above/below or to the left or right of it. Imagine now, if you will, another, perfectly identical sheet of plastic exactly 3cm from the first, parallel to the first and further away from you than the first. Now, imagining how it would look, you would notice that the X and Y dimensions are contracted with greater intensity the further away a sheet of plastic is from you. The marks on the plastic appear to be "pulled in" to the point directly in front of your eyeball, contracted >towards< that point.

The first concept, that of diminishing size with distance, is given by the relationship:

    Size (= Original Size / Distance

Note: The (= symbol is my ascii approximation for an alpha, the symbol for >proportional to< ...

... >along< the X, Y, and Z axes. Rotation is another matter entirely.

III. Rotation
Since 3d geometry is really no different from 2d geometry, except with regard to the complexity of visualizing the math, rotations in 3d are similarly rooted in 2d geometry. Warning: If you haven't had trigonometry experience, you will not understand this- read a book on it.

Now, everything in digital computers is usually expressed in integers or in rounded off floating-point numbers. Coordinates of objects in either 2 or 3 dimensions are usually expressed in Cartesian, or Rectangular, coordinates. This means we have distinct values for X, Y (and/or Z) that are unique to that point.

In dealing with rectangular coordinates, when we want to rotate a point about the origin (0,0) in two dimensions, we convert the X,Y coordinate into an (R,Theta) polar coordinate, add the angle we wish to rotate the point by to Theta, and then convert (R,New_Theta) back into rectangular. This is one of the most direct ways to rotate a single point about the origin. The following identities are useful, indeed >necessary<, to compute the rotated X and Y values.

           ___________
    R =  \/ X^2 + Y^2        or:    R = ( X^2 + Y^2 ) ^ 0.5

    X = R * Cos (Theta)
    Y = R * Sin (Theta)

    Tan (Theta) = Y / X      Theta = ArcTan ( Y / X )
    Sin (Theta) = Y / R      Theta = ArcSin ( Y / R )
    Cos (Theta) = X / R      Theta = ArcCos ( X / R )

The value of R remains constant during the rotation. We first compute the value of R then.

    R = sqrt ( Old_X * Old_X + Old_Y * Old_Y )

Now, we must compute the OLD value of theta.

    Theta = ArcTan ( Old_Y / Old_X )

Now, we must compute the new values of X and Y.

    New_X = R * Cos ( Theta + RotateAngle )
    New_Y = R * Sin ( Theta + RotateAngle )

Ahh, but wait! This will not work all the time! Why not? Because when Old_X is negative, the value of Theta will be the value of the reference angle, NOT the angle measured from the right side of the x axis! A simple fix is to correct Theta before using it:

    If Old_X < 0 then Theta = Theta + Pi

-or- if you are working in degrees:

    If Old_X < 0 then Theta = Theta + 180

and then to compute, as before:

    New_X = R * Cos ( Theta + RotateAngle )
    New_Y = R * Sin ( Theta + RotateAngle )

(When Old_X is exactly zero the division Old_Y / Old_X is undefined; in that case Theta is +90 or -90 degrees, according to the sign of Old_Y.)

To convert radians (the measure with pi in it) into degrees, multiply the radian measure by ( 180 / Pi ); to convert degrees into radians, multiply the degree measure by ( Pi / 180 ). What this means is that 2*Pi radians equals 360 degrees, 180 degrees equals Pi radians, and 0 degrees equals 0 radians.

When you have a negative Old_X value, the value of Theta returned by ArcTangent ( Y / X ) is measured from the left-hand side of the x axis, with positive angles measured DOWN from the left side, not up from the right side of the x axis. Adding Pi radians (180 degrees) gives us the angle from the right side. A positive value of Y will cause the ArcTangent function to return a negative angle. This negative angle is the angle UP from the left side of the x axis; adding 180 degrees gives us the angle from the right side. When X >and< Y are negative, the ArcTangent function gives us a positive angle, the angle DOWN from the left side of the x axis. Adding 180 degrees again gives us the angle from the right side.

Remember now, that angles are measured CounterClockwise from the right side of the x axis to the angle Theta. If we did not know the original coordinates of X and Y, we would not be able to determine the correct value of Theta, because the ArcTangent function only returns values between -90 degrees and +90 degrees (or -Pi/2 radians to +Pi/2 radians.)

Suppose we didn't want to rotate the point about the origin, but about another point on the X,Y plane? Simplicity itself! Look at the rotation as a rotation about the origin, only the origin is displaced by the coordinates of the point about which you are really rotating (re-read that sentence if necessary.) Call that second point the origin, and you're set. If you express the point you want to rotate relative to that second point, rotate it as if the second point were the origin, and then translate it back, you've got it.
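
The polar-coordinate rotation described above, including the shift of origin to a second point, might be sketched in Python as follows. This is only an illustrative sketch, not the author's code: the function name rotate_point and its parameter names are hypothetical, and it swaps in the library function math.atan2 in place of the "ArcTan, then add Pi when X is negative" correction, since atan2 examines the signs of both arguments and also copes with X equal to zero.

```python
import math

def rotate_point(old_x, old_y, rotate_angle, rot_x=0.0, rot_y=0.0):
    """Rotate (old_x, old_y) counterclockwise by rotate_angle radians
    about the point (rot_x, rot_y). Defaults rotate about the origin."""
    # Express the point relative to the pivot, so the pivot acts as origin.
    dx = old_x - rot_x
    dy = old_y - rot_y

    # Convert to polar form. R stays constant during the rotation.
    r = math.hypot(dx, dy)
    # atan2 returns the angle in the correct quadrant (assumption: used
    # here instead of ArcTan(Y/X) plus the "add Pi" fix-up).
    theta = math.atan2(dy, dx)

    # Add the rotation, convert back to rectangular, and translate back.
    new_x = r * math.cos(theta + rotate_angle) + rot_x
    new_y = r * math.sin(theta + rotate_angle) + rot_y
    return new_x, new_y
```

Calling rotate_point(2, 1, math.pi, 1, 1) rotates the point (2,1) half a turn about (1,1). Results come back as floating-point values, so expect tiny rounding errors rather than exact integers.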
So, if we call the point about which we want to rotate (Rot_X, Rot_Y), then we are left with:

    Old_X2 = Old_X1 - Rot_X
    Old_Y2 = Old_Y1 - Rot_Y

Now, perform the rotation with Old_X2 and Old_Y2 in place of Old_X and Old_Y respectively. Then, after you finish that, translate it back with:

    New_X2 = New_X1 + Rot_X
    New_Y2 = New_Y1 + Rot_Y

Now, if that isn't enough obfuscation, I shall write out the complete rotation with all variables in their proper place:

    R = ( (Old_X - Rot_X)^2 + (Old_Y - Rot_Y)^2 ) ^ 0.5
    Theta = ArcTangent ( (Old_Y - Rot_Y) / (Old_X - Rot_X) )
    If (Old_X - Rot_X) < 0 then Theta = Theta + Pi
    New_X = R * Cos (Theta + RotateAngle) + Rot_X
    New_Y = R * Sin (Theta + RotateAngle) + Rot_Y

To re-iterate this, RotateAngle is the angle in RADIANS COUNTERCLOCKWISE that you wish to rotate the original coordinate, (Old_X,Old_Y), about the point (Rot_X,Rot_Y).

To implement this in a 3-dimensional frame, simply decide which axis you wish to rotate a point (or group of points, as in an object) about, and in place of X and Y put the two axes that are left (NOT the axis about which you are rotating). Now, the Rot_X and Rot_Y coordinates are the coordinates of the LINE that you are rotating the point(s) about. If you want to rotate an object about the origin, leave Rot_X and Rot_Y zero. If you wish to rotate the object about its center, around a line parallel to the Z axis, calculate the average of all X values and all Y values for that object and substitute those in for Rot_X and Rot_Y.

Remember that you can rotate about the X or Y axes also! If you wished to rotate an object about the X axis, through the origin, you would make Rot_X, Rot_Y equal to the Z and Y coordinates of the X axis: ZERO. You would then put the Z and Y coordinates in place of the values of Old_X and Old_Y (they're just variable identifiers; mathematically, you may rotate about any axis.) If you mix up the order of Z and Y (frankly, I'm not even sure about the "correct" order) then the only side-effect will be that your rotations are reversed in direction. It's all a matter of perspective, >you< decide what direction you wish to look head-on for each of the axes. In my examples, I'm assuming the Z axis goes into the screen and X and Y are left/right and top/bottom respectively.

--Matrices

How is this solved with matrices? It is not as easy to rotate about another point with matrices, but rotating about a single axis is relatively straightforward. Consider the matrix*vector problem below:

Rotation about X axis:

    | 1   0    0   0 |   | x |   | A |
    | 0   cT  -sT  0 | * | y | = | B |
    | 0   sT   cT  0 |   | z |   | C |
    | 0   0    0   1 |   | 1 |   | D |

    [ cT = Cos(Theta), sT = Sin(Theta) ]

If you go ahead with the matrix multiplication (I'll write it all out for those who are rusty this first time,) then you get a result of:

    A = 1*x +  0*y +   0*z + 0*1 = x
    B = 0*x + cT*y + -sT*z + 0*1 = y*cos(Theta) - z*sin(Theta)
    C = 0*x + sT*y +  cT*z + 0*1 = y*sin(Theta) + z*cos(Theta)
    D = 0*x +  0*y +   0*z + 1*1 = 1

    New_X = A
    New_Y = B
    New_Z = C

This is the rotation of a point (x,y,z) about the X axis through the angle Theta. The x coordinate is unchanged, and y and z are rotated just as the 2d formulas rotate X and Y. Swapping the signs on the two sT entries reverses the direction of the rotation. However, I will stick to non-matrix calculations to eliminate all those unnecessary ???*0 + ???*0 + ???*1 operations.

Rotation about Y axis:

    |  cT  0   sT  0 |   | x |   | A |
    |  0   1   0   0 | * | y | = | B |
    | -sT  0   cT  0 |   | z |   | C |
    |  0   0   0   1 |   | 1 |   | D |

In this case:

    A =  cT*x + 0*y + sT*z + 0*1 =  x*cos(Theta) + z*sin(Theta)
    B =   0*x + 1*y +  0*z + 0*1 =  y
    C = -sT*x + 0*y + cT*z + 0*1 = -x*sin(Theta) + z*cos(Theta)
    D =   0*x + 0*y +  0*z + 1*1 =  1

Here the y coordinate is unchanged, and only x and z are affected. If it works- use it.

Rotation about Z axis:

    | cT  -sT  0   0 |   | x |   | A |
    | sT   cT  0   0 | * | y | = | B |
    | 0    0   1   0 |   | z |   | C |
    | 0    0   0   1 |   | 1 |   | D |

In this case:

    A = cT*x + -sT*y + 0*z + 0*1 = x*cos(Theta) - y*sin(Theta)
    B = sT*x +  cT*y + 0*z + 0*1 = x*sin(Theta) + y*cos(Theta)
    C =  0*x +   0*y + 1*z + 0*1 = z
    D =  0*x +   0*y + 0*z + 1*1 = 1

Now, this makes some sense. Rotating about the z axis should only affect the x and y coordinates: A and B hold the new X and Y coordinates, and C contains the unchanged value of z. These are the same New_X and New_Y that the polar method gives for a rotation about the origin. Well, that's all about rotation with matrices that I'm prepared to be flamed for.
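
As a cross-check on the matrix approach, here is a small Python sketch that builds a 4x4 rotation matrix for each axis and multiplies it by a point written in the conventional column form (x, y, z, 1). It is a sketch under stated assumptions rather than a definitive implementation: the names rotation_matrix and transform are hypothetical, the matrices are the commonly used counterclockwise forms, and Theta is in radians. Reversing the sign of both sine entries would reverse the direction of rotation.

```python
import math

def rotation_matrix(axis, theta):
    """Return a 4x4 rotation matrix (list of rows) about 'x', 'y' or 'z'."""
    c, s = math.cos(theta), math.sin(theta)
    if axis == "x":
        return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]
    if axis == "y":
        return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]
    if axis == "z":
        return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    raise ValueError("axis must be 'x', 'y' or 'z'")

def transform(matrix, point):
    """Multiply a 4x4 matrix by the column vector (x, y, z, 1)."""
    x, y, z = point
    vec = (x, y, z, 1.0)
    # Each output component is the dot product of one matrix row with vec.
    a, b, c, _d = (sum(m * v for m, v in zip(row, vec)) for row in matrix)
    return a, b, c  # the fourth component stays 1 for pure rotations
```

For example, transform(rotation_matrix("z", math.pi / 2), (1, 0, 0)) carries the point on the x axis onto the y axis, up to floating-point rounding.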

IV.  Matrices Unveiled
This section is postponed until I can get some accurate information about the matrices for rotation and other affine transformations.

V.   Z-Buffering - The how's and the why's.
Okay. Translation and rotation may be all fine and dandy, but suppose I want to display my teapot that I've got encoded in a very nice compact data structure, but it looks transparent. In fact, it looks like a wireframe, which it is!

-How to go about Z-Buffering-

First, get the memory. Wherever possible, grab memory. You will need for this exercise: enough memory for 3 times the video resolution in pixels. For example, at 320x200 = 64000 pixels you will need 320x200x3 = 192000 bytes, unless you choose to perform some in-memory compression, an advisable technique! This assumes you are using 8-bit color, i.e. 256 colors.

First, reserve one byte for each pixel on the screen as part of a "virtual" screen. This byte will store the color for that pixel once the 3d mapping and depth checking is complete. Initialize this array with the background color of your choice.

Next, reserve two bytes (an unsigned 16-bit integer, for example) for each pixel on the virtual screen in another array (yes, this spans two segments of memory or more.) These two bytes store the distance of the pixel in the range 0 .. 65535. This is the Z-Buffer itself. In actuality, it shouldn't be called a Z buffer in a perspective rendering modeller. It is properly a Distance-Buffer, but in an orthographic environment the depth is the same as the Z coordinate, so the name stuck. Oh well. Initialize this array of distances with the value 65535, or whatever your yon/hither values dictate.

It is a good idea to set a maximum distance on the objects you render, as well as a minimum distance- this prevents unnecessary calculations that would end up rendering an object that appears as only three small pixels not worthy of being called a polygon, or an impossibly huge triangle (for example) that covers up everything else on the screen. Use your judgement.

To draw a single frame now: Scroll through your list of objects that are in visible sight.
Calculate the distance of each vertex from your eyeball using the pythagorean distance formula:

                  ___________________
    Distance =  \/ X^2 + Y^2 + Z^2       or:    Distance = ( X^2 + Y^2 + Z^2 ) ^ 0.5

Now, calculate the coordinates of the mapped point on the 2d plane for this vertex. (This should account for the vanishing point, as well as perspective.) Now, compare this distance with the distance stored in the array of distances at the pixel location you just mapped the vertex to. If this vertex is nearer, then replace the byte in the array of color with the color of this pixel, and store the calculated distance into the "Z-Buffer" array of distances at that same location.

The gist of this all is to compare the distance of a point with the distance found in the array; if the distance already in the array is less than the distance of the pixel we are checking, then throw out that new pixel. A pixel closer to you will "cover up" a pixel that is farther away.

To Z-Buffer a polygon, the vertices of the polygon must be projected onto the Z-Buffer plane, and checked for overlapping. Even if the vertices aren't visible, there may be points on the polygon that >are< visible due to a "hole" in another polygon that just happens to be blocking the rest of the polygon we are rendering. Instead of calculating the distance to each pixel of the polygon (entailing far too many squaring and square root operations to be feasible on most 80x86's) a form of interpolation may be used, given the four vertices and their distances, and on the condition that the polygon is perfectly flat. Of course Z-Buffering can also apply to non-polygonal shapes; however, using non-polygons radically increases rendering time.

After all the shapes/polygons have been rendered onto the virtual screen, the screen of visible colors must be copied into the video area. This can be quickened by storing the color array in video memory to begin with, and paging in the new screens as they become available. Be sure not to have the video screen showing that you are currently Z-Buffering! That would give away the secret! :-)

Another improvement that is simplified when Z-Buffering is shading polygons. Simply calculate the angle between the polygon normal (the perpendicular to the polygon's surface) and the direction of the light source (a "global" value) and take the cosine of this angle to get the intensity of the light reflected off the polygon. Be sure to multiply the value of the cosine by the maximum allowable intensity, since cosine returns values ranging from -1 to +1. Negative values should either be negated to get the positive value, or be taken to signify an unlit surface, i.e. a surface that faces away from the light source.
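
The per-pixel distance test and the cosine shading rule described in this section might be sketched in Python as below. Treat it as a minimal illustration under assumptions, not a tuned renderer: the names WIDTH, HEIGHT, MAX_DISTANCE, make_buffers, plot_with_depth and shade are all hypothetical, plain Python lists stand in for the byte and word arrays, and negative cosines are clamped to zero (an unlit surface), which is only one of the options mentioned above.

```python
import math

WIDTH, HEIGHT = 320, 200   # assumed virtual screen size
MAX_DISTANCE = 65535       # assumed "yon" value for an empty pixel

def make_buffers(background=0):
    """Create the color array and the distance array ("Z-Buffer")."""
    colors = [background] * (WIDTH * HEIGHT)
    distances = [MAX_DISTANCE] * (WIDTH * HEIGHT)
    return colors, distances

def plot_with_depth(colors, distances, sx, sy, distance, color):
    """Write color at screen pixel (sx, sy) only if it is nearer than
    whatever is already stored there. Returns True if the pixel was written."""
    if not (0 <= sx < WIDTH and 0 <= sy < HEIGHT):
        return False               # off-screen, nothing to do
    i = sy * WIDTH + sx
    if distance < distances[i]:    # nearer pixel covers the farther one
        distances[i] = distance
        colors[i] = color
        return True
    return False

def shade(normal, to_light, max_intensity=255):
    """Intensity = cosine of the angle between the polygon normal and the
    direction to the light, scaled to max_intensity. Both vectors must be
    non-zero. Negative cosines are clamped to 0 (assumption)."""
    dot = sum(n * l for n, l in zip(normal, to_light))
    length = (math.sqrt(sum(n * n for n in normal))
              * math.sqrt(sum(l * l for l in to_light)))
    cosine = dot / length
    return max(0, round(cosine * max_intensity))
```

A frame would start with make_buffers(), call plot_with_depth once for every projected pixel of every polygon, and finally copy the color array to video memory.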

VI.  Techniques and Algorithms

Bresenham's Line Drawing Algorithm

-The importance of this algorithm is evident when it is implemented with ONLY integer math. This can drastically improve calculation times.

This function takes in the X and Y coordinates of the endpoints of the line to be drawn. To make it more versatile, it will also accept the color of the points it is to plot.

The whole trick behind how this works is based on the mathematical fact that multiplication is really only repeated addition, and division is really only repeated subtraction.

There are two cases for the line drawing algorithm, 1: a line with a slope >= 1, and 2: a line with a slope < 1. Since slope is (delta Y)/(delta X), this means that the algorithm depends on which of delta Y or delta X is the larger of the two.

Let's consider case 2, where delta X exceeds delta Y. The algorithm "follows" along the X-axis, incrementing the x value of the line it plots by one each time, and it calculates the value of y on each run through the x increment loop. A variable which could be called "Cycle" is first initialized to one half of delta X (rounded down). This variable, Cycle, will be incremented by the value of delta Y and then checked against the value of delta X. If Cycle is greater than or equal to delta X, then the value of the y-coordinate to be plotted is upped by one (assuming the line's slope is positive) and Cycle has the value of delta X subtracted from it. The value of the x coordinate of the point where the algorithm plots the new point each run through the loop is incremented by one, unconditionally.

Now, another run through the loop will give us: increment Cycle by delta Y, check to see if it's greater than or equal to delta X (If YES: Cycle=Cycle-delta X and Y_plot=Y_plot+1, else if NO: then don't change Y_plot.) Now, X_plot=X_plot+1.

Perhaps a table of the values of Cycle, X_plot, & Y_plot as the loop is executed will better explain what is happening. The values passed to the algorithm are called x1, y1, x2, y2.

    delta Y = y2 - y1 = 4
    delta X = x2 - x1 = 9
    ( arbitrary points chosen for clarity )

       Cycle    |   X_Plot   |   Y_Plot
    ------------+------------+------------
              4 |    x1      |    y1
    4+4=      8 |    x1+1    |    y1
    8+4=12-9= 3 |    x1+2    |    y1+1
    3+4=      7 |    x1+3    |    y1+1
    7+4=11-9= 2 |    x1+4    |    y1+2
    2+4=      6 |    x1+5    |    y1+2
    6+4=10-9= 1 |    x1+6    |    y1+3
    1+4=      5 |    x1+7    |    y1+3
    5+4= 9-9= 0 |    x1+8    |    y1+4
    0+4=      4 |    x1+9    |    y1+4

    ( y1 is incremented because cycle overflowed delta X )
    ( decreases ...

... >this< array and increment Y_Plot (or X_Plot as the case may be) where a one occurs. And if you have the Megs and Megs of memory available, you could precalculate each of these arrays for every possible combination of delta Y and delta X... Maybe that's going a bit far now- :-).

Why use the Bresenham line drawing algorithm? It's blazingly fast. Consider the work needed to perform half a zillion line slope divisions in real-time. This routine only requires addition, subtraction, and a check. The Intel processors have been optimized to all hell for wicked speed at addition/subtraction/comparisons, whereas division will ALWAYS be slow slow slow...

... More algorithms to come! ...
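
The integer-only loop traced in the table above can be sketched in Python as follows. This is an illustrative sketch, not the author's routine: the name bresenham_line is hypothetical, it returns the list of points instead of plotting them in a color, and it assumes the "greater than or equal" test that the table uses (5+4=9 triggers a step). It also handles both slope cases and lines running in any direction by stepping along whichever delta is larger.

```python
def bresenham_line(x1, y1, x2, y2):
    """Return the integer points on the line from (x1, y1) to (x2, y2),
    using only addition, subtraction and comparison inside the loop."""
    dx = abs(x2 - x1)
    dy = abs(y2 - y1)
    step_x = 1 if x2 >= x1 else -1
    step_y = 1 if y2 >= y1 else -1

    x, y = x1, y1
    points = [(x, y)]

    if dx >= dy:
        # Case 2 in the text: follow the X axis, occasionally step Y.
        cycle = dx // 2
        for _ in range(dx):
            cycle += dy
            if cycle >= dx:        # Cycle overflowed delta X
                cycle -= dx
                y += step_y
            x += step_x            # unconditional step along the major axis
            points.append((x, y))
    else:
        # Case 1 in the text: follow the Y axis, occasionally step X.
        cycle = dy // 2
        for _ in range(dy):
            cycle += dx
            if cycle >= dy:
                cycle -= dy
                x += step_x
            y += step_y
            points.append((x, y))
    return points
```

With x1 = y1 = 0, the call bresenham_line(0, 0, 9, 4) visits the same X_Plot and Y_Plot values as the rows of the table.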

Note: I intend this article to be a growing text to help decipher the cryptic world of 3d; as I learn more I will add to it. I also know that I, like most people of the younger generation, am prone to errors. Tell me about them! I want to learn more about this, but can't find or afford most of the books on this subject, so anything you can tell me will be appreciated, and included if it makes any sort of sense!

-Trip V. Reece

Any comments, suggestions, or errata reports, please email me.
e-mail: [email protected]