I would like to use an array index operator in a 3D vector class, as an alternative to accessing its x, y, z members directly, as I dont like the function syntax as used in std::complex(e.g complex.real() , vect.x() etc).
Disregrading the fact that the array index operator in the following vect class might technically use undefined behaviour, nevertheless there isnt a way it can fail (assuming a valid index) , or is there?
regards Andy Little
#include <stdexcept> #include <cassert>
template <typename T> struct vect { T x,y,z;
T & operator[](int n) { if((n < 0)|| (n > 2)){ throw std::logic_error("array subscript out of range"); } if( sizeof(vect) == 3 * sizeof(T)){ T * p = & x; return p[n]; } else{ switch (n){ case 0: return x; case 1 : return y; default: return z; } } } T const & operator[](int n)const { if( (n < 0) || (n > 2)){ throw std::logic_error("array subscript out of range"); } if( sizeof(vect) == 3 * sizeof(T)){ T const * p = & x; return p[n]; } else{ switch (n){ case 0: return x; case 1 : return y; default: return z; } } } vect(): x(0),y(0),z(0){} /* Other operations omitted for clarity
...
*/ };
int main() { vect<double> v; v[0] = 2; double n = v.x; assert(v[0] == n);
kwikius wrote: > I would like to use an array index operator in a 3D vector class, as > an alternative to accessing its x, y, z members directly, as I dont > like the function syntax as used in std::complex(e.g complex.real() , > vect.x() etc).
> Disregrading the fact that the array index operator in the following > vect class might technically use undefined behaviour, nevertheless > there isnt a way it can fail (assuming a valid index) , or is there?
> T & operator[](int n) > { > if((n < 0)|| (n > 2)){ > throw std::logic_error("array subscript out of > range"); } > if( sizeof(vect) == 3 * sizeof(T)){
This is suspect. 'sizeof(vect)' is not necessarily 3 * sizeof(T) even though the only three members are the three T's in it. Generally speaking, there can be padding of indeterminate size.
> T * p = & x; > return p[n]; > } > else{
[..]
V -- Please remove capital As from my address when replying by mail
On 2006-05-18, kwikius <a...@servocomm.freeserve.co.uk> wrote:
> Disregrading the fact that the array index operator in the following > vect class might technically use undefined behaviour, nevertheless > there isnt a way it can fail (assuming a valid index) , or is there?
I'm not sure, but I suspect trouble looms ahead. See below.
I don't see why you have the sizeof-if at all since the switch-case does the trick in a safe fashion and efficiently. The compiler ought to figure out how to make the switch-case structure efficient here. Assuming you still want that sizeof-if, I'm not convinced of it's safety.
AFAIK it is guaranteed that things are allocated in the order of declaration inside struct so that in a type struct bof { int a; int b; }; a is always allocated before b in memory. I'm not sure, however, if this is the case for struct bof { int a,b; }; If not, then T* p = &x; will obviously be a problem.
Also, is the compiler allowed to allocate extra space between the members of a struct? IMHO it is legal and may happen. If so, you have problems once again.
Then there's the additional issue that someone might pass your template a suprising type but I think we may leave this possibility out of the discussion.
kwikius <a...@servocomm.freeserve.co.uk> wrote: > I would like to use an array index operator in a 3D vector class, as an > alternative to accessing its x, y, z members directly, as I dont like > the function syntax as used in std::complex(e.g complex.real() , > vect.x() etc).
> Disregrading the fact that the array index operator in the following > vect class might technically use undefined behaviour, nevertheless > there isnt a way it can fail (assuming a valid index) , or is there?
[snip] why not just T &operator [](int n) { switch(n) { case 0: return x; case 1: return y; case 2: return z; default:throw std::logic_error(...); } } similiar for const version.
Some time ago I've seen an interesting technique to implement indexing for vector-like classes with fixed elements count. It simplifies code (both implementation and usage), but can slow down things if your compiler is not intelligent enough.
#include <stdexcept>
template <typename T> struct vect { struct { // to init by zeroes automatically (saves you from writing of initialization list for default constructor) union { T data_[3]; struct { T x,y,z; // saves you from vec.x() syntax }; }; };
T& operator[](int n) { if((n < 0)|| (n > 2)){ throw std::logic_error("array subscript out of range"); } return data_[n]; } // NOTE: return by value for simple built-in types is better T operator[](int n)const { if( (n < 0) || (n > 2)){ throw std::logic_error("array subscript out of range"); } return data_[n]; } vect() {} /* Other operations omitted for clarity
Carl Barron wrote: > In article <1147911942.723852.258...@y43g2000cwc.googlegroups.com>, > kwikius <a...@servocomm.freeserve.co.uk> wrote: > > I would like to use an array index operator in a 3D vector > > class, as an alternative to accessing its x, y, z members > > directly, as I dont like the function syntax as used in > > std::complex(e.g complex.real() , vect.x() etc). > > Disregrading the fact that the array index operator in the > > following vect class might technically use undefined > > behaviour, nevertheless there isnt a way it can fail > > (assuming a valid index) , or is there? > > template <typename T> > > struct vect > > { > > T x,y,z; > > T & operator[](int n) > > { > > if((n < 0)|| (n > 2)){ > > throw std::logic_error("array subscript out of range"); > > } > > if( sizeof(vect) == 3 * sizeof(T)){ > > T * p = & x; > > return p[n]; > > } > > else{ > > switch (n){ > > case 0: > > return x; > > case 1 : > > return y; > > default: > > return z; > > } > > } > > } > [snip] > why not just > T &operator [](int n) > { > switch(n) > { > case 0: return x; > case 1: return y; > case 2: return z; > default:throw std::logic_error(...); > } > } > similiar for const version.
I think that's the real question. I don't think his version will fail in any reasonable implementation, but I don't see the point of it. Your solution is far easier to read and understand, will result in a smaller program, and will probably be just as fast.
-- James Kanze GABI Software Conseils en informatique orientée objet/ Beratung in objektorientierter Datenverarbeitung 9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Victor Bazarov wrote: > kwikius wrote: > > I would like to use an array index operator in a 3D vector class, as > > an alternative to accessing its x, y, z members directly, as I dont > > like the function syntax as used in std::complex(e.g complex.real() , > > vect.x() etc).
> > Disregrading the fact that the array index operator in the following > > vect class might technically use undefined behaviour, nevertheless > > there isnt a way it can fail (assuming a valid index) , or is there?
> This is suspect. 'sizeof(vect)' is not necessarily 3 * sizeof(T) even > though the only three members are the three T's in it. Generally speaking, > there can be padding of indeterminate size.
OK, but if 'sizeof(vect)' is not 3 * sizeof(T), then the offending code is unreachable, so it isnt a problem AFAICS.
> template <typename T> > struct vect > { > struct { // to init by zeroes automatically (saves you from writing > of initialization list for default constructor) > union { > T data_[3]; > struct { > T x,y,z; // saves you from vec.x() syntax > }; > }; > };
Right but this looks to me like it puts you further in the area of undefined behaviour, but this time with no checks that the array elements are aliases for x,y,z.
> I think that's the real question. I don't think his version > will fail in any reasonable implementation, but I don't see the > point of it. Your solution is far easier to read and > understand, will result in a smaller program, and will probably > be just as fast.
Right. There are 2 questions. (1) Is the code correct? (2) Why do this?
In this type of expression, losing one execution cycle is worth spending some time on. The class is a template and therefore T could be anything form a bool to a long double and beyond in size. I am making 3 assumptions:
1) The code is (in practise) correct for various T's, compilers and OSes.
2) it is trivial for the compiler to deduce that the condition of the if statement is a compile time constant, so for an optimising compiler the code is equivalent to:
#if ( VECT_SIZE == THREE_SIZEOF_T) T * p = & x; return p[n]; #else switch (n){ case 0: return x; case 1 : return y; default: return z; } #endif
3) That the array access version of the expression is always at least as fast as and maybe faster than the switch statement.
Assuming those 3 assumptions hold I may gain something , but I appear to lose nothing though after looking at Carl Barrons version, I see the scheme may be compromised by the array bounds checking! Carls code looks cleaner, however if the first ( correctness) assumption is wrong it isnt worth considering alternative expressions to the switch AFAICS.
> Oleg wrote: > > union { > > T data_[3]; > > struct { > > T x,y,z; // saves you from vec.x() syntax > > }; >> };
> Right but this looks to me like it puts you further in the area of > undefined behaviour, but this time with no checks that the array > elements are aliases for x,y,z.
my fault. This code perfectly works under VC up to v7.1 at least, but even non-compiles with Comeau C/C++ 4.3.3 (online version) The reason is simple: anonymous unions can not contain type declarations.
looks like Standard is too restrictive, isn't it? ;-) If such a tricks were allowed life of those who do use C++ for numerics/scientific computing would become much easier. Does anybody knows if such a proposal already exists?
kwikius wrote: > kanze wrote: > [...] > > I think that's the real question. I don't think his version > > will fail in any reasonable implementation, but I don't see > > the point of it. Your solution is far easier to read and > > understand, will result in a smaller program, and will > > probably be just as fast. > Right. There are 2 questions. (1) Is the code correct? (2) Why > do this?
The answer to 2 is simply no. But I think you know that; what you mean to ask is will it work anyway?
> In this type of expression, losing one execution cycle is > worth spending some time on. The class is a template and > therefore T could be anything form a bool to a long double and > beyond in size. I am making 3 assumptions: > 1) The code is (in practise) correct for various T's, > compilers and OSes.
You mean that it will probably work on most implementations.
> 2) it is trivial for the compiler to deduce that the > condition of the if statement is a compile time constant, so > for an optimising compiler the code is equivalent to: > #if ( VECT_SIZE == THREE_SIZEOF_T) > T * p = & x; > return p[n]; > #else > switch (n){ > case 0: > return x; > case 1 : > return y; > default: > return z; > } > #endif
That's probable. Again, however, the question is why? If using an array is the appropriate solution, then just use an array. If not, then use the switch.
> 3) That the array access version of the expression is always > at least as fast as and maybe faster than the switch > statement.
I'd run some benchmarks before I assumed such. About the only required extra cost you might have in the switch is bounds checking (which is required by the standard in a switch). The array access is probably faster on most modern general purpose machines, but I'd be sceptical in other cases.
Note too that as you originally wrote it, a clever compiler could easily detect that p always pointed to a single element, which means undefined behavior if n is not 0, which means that the optimizer can suppose that n always is 0 (in the array branch, at least), and replace the "return p[n]" with simply "return x" (and of course, elimate all of the other code in the function).
> Assuming those 3 assumptions hold I may gain something , but I > appear to lose nothing though after looking at Carl Barrons > version, I see the scheme may be compromised by the array > bounds checking! Carls code looks cleaner, however if the > first ( correctness) assumption is wrong it isnt worth > considering alternative expressions to the switch AFAICS.
Well, if performance really is that critical, it's probably worth implementing both versions and benchmarking them on various target machines. However, I would use two versions of the data declaration as well -- array access with the data declared as an array, and struct element access with the data declared as a struct with three elements. Mixing them will certainly confuse the reader, and in some cases, may confuse the compiler as well.
-- James Kanze GABI Software Conseils en informatique orientée objet/ Beratung in objektorientierter Datenverarbeitung 9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
kanze wrote: > Well, if performance really is that critical, it's probably > worth implementing both versions and benchmarking them on > various target machines.
These vects are intended to represent points in a scene which typically comprises structures and arrays of large numbers of points. Performance is important as the vect class is little more than a convenient point container. Its main role is to give up it members.
> However, I would use two versions of > the data declaration as well -- array access with the data > declared as an array, and struct element access with the data > declared as a struct with three elements. Mixing them will > certainly confuse the reader, and in some cases, may confuse the > compiler as well.
Having used a 3D vect both as an array and with x,y,z members I disagree. ( I dont think having 2 different classes for the same entity is a good solution at all ! ). I prefer to use the x,y,z notation when its convenient and the array notation when that is more convenient.( The array notation also relieves writing boiler plate code in functions and numerous functions as shown in the example at the end --->
IIRC the same issue (allowing use of array access operations) came up in relation to std::complex either on this list or comp.std.c++, though I cant find the references at the moment.
As an example of the superiority of the array access approach in some situations, imagine a function to detect whether two points are touching each other within a tolerance vector. Further assume that you have two implement this for a 2Dand a 3D space. The array notation is much more convenient. (I will leave the member version to do):
// type_trait to return the number of elements in a vect template<typename T> struct extent;
// Does vect a, touch vect b within the given tolerance? // Vect may be a 2d or 3d vector template <typename Vect> bool touches( Vect const & a, Vect const & b, Vect const & tolerance) { using boost::extent; for (int i = 0; i < extent<Vect>::value; ++i){ using std::abs; if ( abs( a[i] - b[i]) > abs(tolerance[i]) ){ return false; } } return true;
I don't like the union. It blocks custom scalar class to be used as T. I'd just bite it and use array member:
T v[3];
It's not that bad to use [] to access the vector member data. It's safe, portable, efficient.. the only downside is that cannot access the members easily by name using familiar .x, .y, .z syntax (common with legacy 3d graphics code).
In real world applications written for D3D or OpenGL the array access is not much of a problem, typical usage patterns are of form:
object.bla = float3(x,y,z); glVertex3f(foo.bar);
Et cetera ad nauseum.. The switch() is just not acceptable for code that isn't desirable to be mediocre.
The array memory layout also gives opportunities to use loop unrolling with templates (template meta programming techniques).
It's a step forward if the vector template is more along the lines of:
template <typename scalar, typename size> class vector { ... };
If the size is compile-time constant, it is highly inefficient to use for() or other loop construct, not to mention switch to generate the code. So template loop unrolling is a nice helper, and that benefits from array memory layout in practise (*this means with compilers that are actually commonly used, ofcourse that is OFF TOPIC in clc++).
Example of loop unrolling:
template <typename scalar, int size> inline vector<scalar,size> operator + (const vector<scalar,size>& a, const vector<scalar,size>& b) { vector<scalar,size> v; repeat< size,eval_add<scalar> >::exec(v,a,b); return v;
}
Alternative implementation, this is slightly more complicated as there are many specialized versions of the operator +, here is the trivial one:
template <typename scalar, int size> inline expr< vector<scalar,size>, scalar,size, eval_add<scalar>, vector<scalar,size>, vector<scalar,size> > operator + (const vector<scalar,size>& a, const vector<scalar,size>& b) { return expr< vector<scalar,size>,scalar,size,eval_add<scalar>, vector<scalar,size>,vector<scalar,size> >(a,b);
}
A little bit more complicated version, when adding two *expressions*, which are evaluated at compilation time:
Whee. The result of both implementations is *expression* type, which can be converted back to vector<scalar,size> object when assigned to instance of that class:
Same thing with copy constructor and similiar accessories. FWIW, the first trivial implementation using only loop unrolling is in practise good enough. It's a lot less code and easier to decipher what's going on without extensive documentation. But the bottom line is that the API is easy to follow:
foo::float3 v = a + b * 1.2f - foo::float3(x,y,z);
The array access is the way to go if performance is the goal. The union should be ditched if want to use something like foo::xfloat(64,1024), where the foo::xfloat is extended-precision floating point class.
>> template <typename T> >> struct vect >> { >> struct { // to init by zeroes automatically (saves you from writing >> of initialization list for default constructor) >> union { >> T data_[3]; >> struct { >> T x,y,z; // saves you from vec.x() syntax >> }; >> }; >> };
> Right but this looks to me like it puts you further in the > area of undefined behaviour, but this time with no checks that > the array elements are aliases for x,y,z.
Undefined behavior either is or isn't -- there is no "further". Technically, all of the solutions have undefined behavior. I think that at least g++ sort of guarantees this one, however, as an extension. Other compilers may do the same. Still other compilers may guarantee playing around with pointers, but not this. At there has been at least one compiler (CenterLine) where I think that they would all fail -- it used fat pointers and bounds checking, so the pointer solution would certainly fail, and it wouldn't supprise me if they also maintained information concerning the last member assigned to in a union, and aborted the program is any other member was accessed.
-- James Kanze kanze.ja...@neuf.fr Conseils en informatique orientée objet/ Beratung in objektorientierter Datenverarbeitung 9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34
kwikius wrote: > I would like to use an array index operator in a 3D vector class, as an > alternative to accessing its x, y, z members directly, as I dont like > the function syntax as used in std::complex(e.g complex.real() , > vect.x() etc).
> code snip
Someone posted this on a game development forum quite a few months ago. It was shown that for VS 7.1, the calls were inlined, and the size of the vector class was unaffected (i.e. it still remained sizeof(T) * 4).
template< typename T > struct Vector4 {
typedef size_t size_type;
private:
typedef T Vector4<T>::* const vec[4];
static const vec v;
public:
T x, y, z, w;
Vector4(T _x = 0, T _y = 0, T _z = 0, T _w = 0): x(_x), y(_y), z(_z), w(_w) {}
1. The OpenGL API call would be ::glVertex3fv(...) 2. The operator = was incorrect as example, that was a plain scalar constructor (setting all components of the vector to the specified value), the correct operator = is as follows:
The loop unrolling in this case does iterate over all indices and for each scalar evaluates the expression tree that has been stored in the type F.
The type F is: expr< vector<scalar,size>, scalar, size, E0, E1, E2>
It is more readable to express like this:
for ( int i=0; i<size; ++i ) v[i] = tree.evaluate(i);
Put differently, x, y, z, w (or beyond those) are evaluated independently. There is a benefit to this arrangement, typically when we have something along the lines of:
a + b + c + ...;
Each + operator creates a temporary object, which is passed to the next + operator and so on. This inefficiency is eliminated when the created object is a type, and that the resulting type is evaluated at assignment to the real scalar object. Hence, the arrangement of code above.
This works great for vectors. For something like 4x4 matrix multiplication it would be a gross negligence write code like this, because matrix multiplication relies heavily on the intermediate result from previous multiplication being cached (in the temporary object that is created, so in some cases it works in our advantage).
This thread also got me thinking. I wanted to specialize the constructor for different dimensions, my solution to this problem was as follows:
We assume that "tuple" provides [] access, operator scalar* and the storage for the member data, scalar v[size]; ..
Assuming we have a typedefs for this, we can write:
float4 a(0,0,0,1); float3 b(0,0,1);
But this will give compile time error (just what we want):
float4 v(0,0,1); // problem: incorrect number of initializers
Prior to the "thinking" provoked by this thread, I used to just specialize the whole class! The implementation was included with preprocessor from separate header file. :()
Yikes.. the only difference between specializations for 2,3 and 4 were the constructors being different. I like the new arrangement better, obviously. =)
I was also thinking that the pointer would expose aliasing problems (performance wise) for the optimizer but some compilers I tried with seem to do good enough job with the code for that not to be of concern. I am just speculating but it could be because the source for the pointer is fixed-size array nicely visible from the scope the code is invoked from.. could be dead wrong and would prefer to be corrected if someone has solid facts).
James Kanze wrote: > kwikius wrote: > > Oleg wrote: > > [...] > >> template <typename T> > >> struct vect > >> { > >> struct { // to init by zeroes automatically (saves you from writing > >> of initialization list for default constructor) > >> union { > >> T data_[3]; > >> struct { > >> T x,y,z; // saves you from vec.x() syntax > >> }; > >> }; > >> };
> > Right but this looks to me like it puts you further in the > > area of undefined behaviour, but this time with no checks that > > the array elements are aliases for x,y,z.
> Undefined behavior either is or isn't -- there is no "further". > Technically, all of the solutions have undefined behavior.
True, but how about:
template <typename T> struct vect { T data_[3]; T &x() { return data_[0]; } T &y() { return data_[1]; } T &z() { return data_[2]; } T x() const { return data_[0]; } T y() const { return data_[1]; } T z() const { return data_[2]; } };
// Repeating the original test program, // with the EXTREMELY MINOR syntax change int main() { vect<double> v; v[0] = 2; double n = v.x(); // Need parens after x assert(v[0] == n); }
Not the letter of the original request, but certainly the spirit.
Note that there are some compilers (i.e. Microsoft) that support a language extension called Properties, which would even allow you to use the original main() unchanged... but that wouldn't be standard C++.