I recently posted an article on gamedev.net
describing a nice fast
method to quantize unit vectors.
A few weeks later I received an email from
Oleg D. with
a really excellent improvement to my quantization technique. The
new method is only slightly slower and gives a much nicer quantization.
In my original method the resulting sphere had a slight crease along the
equator where the quantization error was worst. In the new method there is
less error overall, and more importantly the quantization error is
spread more evenly, eliminating the equatorial crease.
In the first image is a sphere lit using full 12 byte normals. The second image
is the same sphere lit using the improved quantization method. You can see
that both of them look perfectly spherical. There is no visible loss of
accuracy from the new quantization method. By comparison the old quantization
method shows a crease along the equator. I have included code for the new
quantization method at the bottom of this article. Here is a quick explanation of
how this improved quantization method works, borrowing extensively from Oleg's
own explanation of the algorithm.
|
In my old quantization method I was basically projecting a sphere onto the XY
plane. Therefore the quantization method was pretty good in places where the
sphere surface is not perpendicular to the plane, but gets progressively worse
as the surface becomes perpendicular to the projection plane. You can see in the diagram
that normals in the red area map to a much smaller part of the plane than normals
in the green area. More specificaly adding 2 bits to the sample size generally
increases the accuracy by a factor 4 but near the edges the accuracy increases
only twice.
One way to avoid this effect is to use another type of projection. By selecting
a radial projection on the plane that goes through points X0=(1,0,0),Y0={0,1,0)
and Z0=(0,0,1). That is, one casts a ray from the origin to the given point on
the sphere and finds the intersection between the ray and the plane. To simplify
things even further the new method uses a non-orthogonal coordinate system on the
plane such that X0->(0,0), Y0->(1,0), Z0->(0,1)
Since we only consider the octant where x,y,z > 0, the sphere is nowhere
perpendicular to the plane. In fact the maximum angle between the sphere and
the plane is around 30 degrees. In this case the projection is everywhere nice
and regular and each bit of the sample always corresponds to a 2-fold increase
in accuracy. Calculations show that the spread between the maximum accuracy and
the minimal accuracy is only about 5 times.
The formula for the projective transform is:
zbits = 1/3 * (1 - (x+y-2*z)/(x+y+z))
ybits = 1/3 * (1 - (x-2*y+z)/(x+y+z))
Which we can reduce to:
zbits = z / ( x + y + z )
ybits = y / ( x + y + z )
The ybits and zbits are now between 0 and 1, and so is ( ybits + zbits ).
Thus we can apply a similar bit-packing method as in the original algorithm.
In the original method since the diagonal of the table was all zero, we could
quantize two values, x and y, from 0-127, since the diagonal could be shared in
the two encodings. Since in this mapping the diagonal varies, we have to
quantize two values ybits and zbits from 0-126 so that we code nothing on the
diagonal.
To unpack we use the reverse transform and then normalize the vector so it
again lies on the sphere. Since normalization is just a scaling by some number,
we can pre-tabulate the amount of required scaling for each u and v and thus
avoid costly divisions and square roots. Logically what we do is:
scaling = scalingTable[zbits][ybits]
v.z = zbits * scaling
v.y = ybits * scaling
v.x = (1 - zbits - ybits ) * scaling
There is another way to understand this new projection. We know the length of
unit vectors is always 1, so we don't have to store that information. We can store any
vector that is colinear with the normal as the quantized vector and then just
normalize the vector once we unpack it ( by multiplying it with a stored
normalization constant ). So all we need to do is store the ratio of say
x/(x+y+z) and y/(x+y+z) as our two quantized values. Any ratios that allow us
to resconstruct the proportions of x,y,z will do. Since we are quantizing the
values in the range 0-126 just do the following:
w = 126 / ( x + y + z )
xbits = x * w;
ybits = y * w;
Now xbits is the ratio of x/(x+y+z) quantized in the range of 0-126, and ybits
is the ratio y/(x+y+z). Given these two ratios we can unpack a vector colinear
with the original vector like so:
x = xbits
y = ybits
z = 126 - x - y
Finally to get the original normal back, we just multiply the unpacked vector
by the aproporiate scaling constant:
x *= scaling;
y *= scaling;
z *= scaling;
A quick review should convince you that both methods are in fact identical.
This new method has a slightly higher cost. To pack a vector from 12 bytes to
2 bytes the old method used 4 multiplies and 2 adds, this new method requires
2 multiplies, 2 adds and a divide, which on modern processors may be almost the
same speed. The unpacking method costs 3 extra multiplies and two adds. Also
no big deal. And look at that near perfect accuracy, at least for lighting.
This method is being used to store compressed normals in
Valve Software's Half-Life 2 and their
Half-Life2 SDK.
Below is a reimplementation of the original quantized vector class with the
improved quantization method.
#ifndef _3D_UNITVEC_H
#define _3D_UNITVEC_H
#include
#include
#include "3dmath/vector.h"
#define UNITVEC_DECLARE_STATICS \
float cUnitVector::mUVAdjustment[0x2000]; \
c3dVector cUnitVector::mTmpVec;
// upper 3 bits
#define SIGN_MASK 0xe000
#define XSIGN_MASK 0x8000
#define YSIGN_MASK 0x4000
#define ZSIGN_MASK 0x2000
// middle 6 bits - xbits
#define TOP_MASK 0x1f80
// lower 7 bits - ybits
#define BOTTOM_MASK 0x007f
// unitcomp.cpp : A Unit Vector to 16-bit word conversion
// algorithm based on work of Rafael Baptista (rafael@oroboro.com)
// Accuracy improved by O.D. (punkfloyd@rocketmail.com)
// a compressed unit vector. reasonable fidelty for unit
// vectors in a 16 bit package. Good enough for surface normals
// we hope.
class cUnitVector : public c3dMathObject
{
public:
cUnitVector() { mVec = 0; }
cUnitVector( const c3dVector& vec )
{
packVector( vec );
}
cUnitVector( unsigned short val ) { mVec = val; }
cUnitVector& operator=( const c3dVector& vec )
{ packVector( vec ); return *this; }
operator c3dVector()
{
unpackVector( mTmpVec );
return mTmpVec;
}
void packVector( const c3dVector& vec )
{
// convert from c3dVector to cUnitVector
assert( vec.isValid());
c3dVector tmp = vec;
// input vector does not have to be unit length
// assert( tmp.length() <= 1.001f );
mVec = 0;
if ( tmp.x < 0 ) { mVec |= XSIGN_MASK; tmp.x = -tmp.x; }
if ( tmp.y < 0 ) { mVec |= YSIGN_MASK; tmp.y = -tmp.y; }
if ( tmp.z < 0 ) { mVec |= ZSIGN_MASK; tmp.z = -tmp.z; }
// project the normal onto the plane that goes through
// X0=(1,0,0),Y0=(0,1,0),Z0=(0,0,1).
// on that plane we choose an (projective!) coordinate system
// such that X0->(0,0), Y0->(126,0), Z0->(0,126),(0,0,0)->Infinity
// a little slower... old pack was 4 multiplies and 2 adds.
// This is 2 multiplies, 2 adds, and a divide....
float w = 126.0f / ( tmp.x + tmp.y + tmp.z );
long xbits = (long)( tmp.x * w );
long ybits = (long)( tmp.y * w );
assert( xbits < 127 );
assert( xbits >= 0 );
assert( ybits < 127 );
assert( ybits >= 0 );
// Now we can be sure that 0<=xp<=126, 0<=yp<=126, 0<=xp+yp<=126
// however for the sampling we want to transform this triangle
// into a rectangle.
if ( xbits >= 64 )
{
xbits = 127 - xbits;
ybits = 127 - ybits;
}
// now we that have xp in the range (0,127) and yp in
// the range (0,63), we can pack all the bits together
mVec |= ( xbits << 7 );
mVec |= ybits;
}
void unpackVector( c3dVector& vec )
{
// if we do a straightforward backward transform
// we will get points on the plane X0,Y0,Z0
// however we need points on a sphere that goes through
// these points. Therefore we need to adjust x,y,z so
// that x^2+y^2+z^2=1 by normalizing the vector. We have
// already precalculated the amount by which we need to
// scale, so all we do is a table lookup and a
// multiplication
// get the x and y bits
long xbits = (( mVec & TOP_MASK ) >> 7 );
long ybits = ( mVec & BOTTOM_MASK );
// map the numbers back to the triangle (0,0)-(0,126)-(126,0)
if (( xbits + ybits ) >= 127 )
{
xbits = 127 - xbits;
ybits = 127 - ybits;
}
// do the inverse transform and normalization
// costs 3 extra multiplies and 2 subtracts. No big deal.
float uvadj = mUVAdjustment[mVec & ~SIGN_MASK];
vec.x = uvadj * (float) xbits;
vec.y = uvadj * (float) ybits;
vec.z = uvadj * (float)( 126 - xbits - ybits );
// set all the sign bits
if ( mVec & XSIGN_MASK ) vec.x = -vec.x;
if ( mVec & YSIGN_MASK ) vec.y = -vec.y;
if ( mVec & ZSIGN_MASK ) vec.z = -vec.z;
assert( vec.isValid());
}
static void initializeStatics()
{
for ( int idx = 0; idx < 0x2000; idx++ )
{
long xbits = idx >> 7;
long ybits = idx & BOTTOM_MASK;
// map the numbers back to the triangle (0,0)-(0,127)-(127,0)
if (( xbits + ybits ) >= 127 )
{
xbits = 127 - xbits;
ybits = 127 - ybits;
}
// convert to 3D vectors
float x = (float)xbits;
float y = (float)ybits;
float z = (float)( 126 - xbits - ybits );
// calculate the amount of normalization required
mUVAdjustment[idx] = 1.0f / sqrtf( y*y + z*z + x*x );
assert( _finite( mUVAdjustment[idx]));
//cerr << mUVAdjustment[idx] << "\t";
//if ( xbits == 0 ) cerr << "\n";
}
}
void test()
{
#define TEST_RANGE 4
#define TEST_RANDOM 100
#define TEST_ANGERROR 1.0
float maxError = 0;
float avgError = 0;
int numVecs = 0;
{for ( int x = -TEST_RANGE; x < TEST_RANGE; x++ )
{
for ( int y = -TEST_RANGE; y < TEST_RANGE; y++ )
{
for ( int z = -TEST_RANGE; z < TEST_RANGE; z++ )
{
if (( x + y + z ) == 0 ) continue;
c3dVector vec( (float)x, (float)y, (float)z );
c3dVector vec2;
vec.normalize();
packVector( vec );
unpackVector( vec2 );
float ang = vec.dot( vec2 );
ang = (( fabs( ang ) > 0.99999f ) ? 0 : (float)acos(ang));
if (( ang > TEST_ANGERROR ) | ( !_finite( ang )))
{
cerr << "error: " << ang << endl;
cerr << "orig vec: " << vec.x << ",\t"
<< vec.y << ",\t" << vec.z << "\tmVec: "
<< mVec << endl;
cerr << "quantized vec2: " << vec2.x
<< ",\t" << vec2.y << ",\t"
<< vec2.z << endl << endl;
}
avgError += ang;
numVecs++;
if ( maxError < ang ) maxError = ang;
}
}
}}
for ( int w = 0; w < TEST_RANDOM; w++ )
{
c3dVector vec( genRandom(), genRandom(), genRandom());
c3dVector vec2;
vec.normalize();
packVector( vec );
unpackVector( vec2 );
float ang =vec.dot( vec2 );
ang = (( ang > 0.999f ) ? 0 : (float)acos(ang));
if (( ang > TEST_ANGERROR ) | ( !_finite( ang )))
{
cerr << "error: " << ang << endl;
cerr << "orig vec: " << vec.x << ",\t"
<< vec.y << ",\t" << vec.z << "\tmVec: "
<< mVec << endl;
cerr << "quantized vec2: " << vec2.x << ",\t"
<< vec2.y << ",\t"
<< vec2.z << endl << endl;
}
avgError += ang;
numVecs++;
if ( maxError < ang ) maxError = ang;
}
{ for ( int x = 0; x < 50; x++ )
{
c3dVector vec( (float)x, 25.0f, 0.0f );
c3dVector vec2;
vec.normalize();
packVector( vec );
unpackVector( vec2 );
float ang = vec.dot( vec2 );
ang = (( fabs( ang ) > 0.999f ) ? 0 : (float)acos(ang));
if (( ang > TEST_ANGERROR ) | ( !_finite( ang )))
{
cerr << "error: " << ang << endl;
cerr << "orig vec: " << vec.x << ",\t"
<< vec.y << ",\t" << vec.z << "\tmVec: "
<< mVec << endl;
cerr << " quantized vec2: " << vec2.x << ",\t"
<< vec2.y << ",\t" << vec2.z << endl << endl;
}
avgError += ang;
numVecs++;
if ( maxError < ang ) maxError = ang;
}}
cerr << "max angle error: " << maxError
<< ", average error: " << avgError / numVecs
<< ", num tested vecs: " << numVecs << endl;
}
friend ostream& operator<< ( ostream& os, const cUnitVector& vec )
{ os << vec.mVec; return os; }
protected:
unsigned short mVec;
static float mUVAdjustment[0x2000];
static c3dVector mTmpVec;
};
#endif // _3D_VECTOR_H