Let's assume that a and b have the same basic data type (e.g. int, long, float, double ...).
On many platforms it is possible to replace the comparisons a==b and a!=b by memcmp(&a,&b,sizeof(a))==0 and memcmp(&a,&b,sizeof(a))!=0.
Under what conditions will this not yield the desired results?
Can someone give (an) implementation example(s), where the reasons for this behaviour can be seen?
The background is the wish for fast and easy comparison of multicomponent structures (within a perfect hash system):

    typedef struct test {
        char ...
        int ...
        float ...
        ...
    } TEST;

    TEST sa,sb;
    memset(&sa,'\0',sizeof(TEST)); /* or TestClear(&sa); */
    memset(&sb,'\0',sizeof(TEST)); /* or TestClear(&sb); */
    ...
    any standard component assignment. /* or TestSetComponent(&sa,comp) */
    ...
    memcmp(&sa,&sb,sizeof(TEST));
Is it possible to construct such a general system to work in a portable way on all platforms?
Would it be possible to construct an #if-expression
On Mon, 28 Jun 1999 10:56:20 +0200, Helmut Leitner
<leit...@hls.via.at> wrote: >Let's assume that a and b have the same basic data type >(e.g. int, long, float, double ...).
>On many platforms it is possible to replace the >comparisions > a==b > a!=b >by > memcmp(&a,&b,sizeof(a))==0 > memcmp(&a,&b,sizeof(a))
If a and b are of the same basic data type then you wouldn't do this. It's ugly and probably slower. Besides: what if a and/or b have the storage class specifier "register"? Then it is illegal to use the address-of operator on them.
>What are the conditions that this will not yield the >desired results?
If they are not of the same basic type for example. Unlike using the equality operators, the memcmp version will not convert them. This might result in inequality while the values of a and b are the same.
It might look as if the memcmp version can do "clever" comparisons, since it would also accept derived types like structures. But even if a and b are of the same derived type, you would still run the risk of inequality when the padding (if any) holds unequal values. Thus, inequality might even appear for structures that are of the same type and have the same values for their members.
>The background is the wish of fast and easy comparison >of multicomponent structures (within a perfect hash system):
> typedef struct test { > char ... > int ... > float ... > ... > } TEST;
> TEST sa,sb; > memset(&sa,'\0',sizeof(TEST)); /* or TestClear(&sa); */ > memset(&sb,'\0',sizeof(TEST)); /* or TestClear(&sb); */ > ... > any standard component assignment. /* or TestSetComponent(&sa,comp) */ > ... > memcmp(&sa,&sb,sizeof(TEST));
Yes, you first set all char's of the structures to zero. This solves the problem of the "uninitialized padding".
>Is it possible to construct such a general system to work >in a portable way on all platforms?
In this particular case I don't see why not.
>Would it be possible to construct an #if-expression
I don't see what the expression should look like. Your solution works as long as you take care to initialize _all_ chars of the structures, to avoid the "unequal padding value" problem, and make sure that your a and b are of the same structure type.
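A minimal sketch of the clear-then-assign scheme discussed above. The members tag, count and weight are hypothetical stand-ins for the elided members of TEST, and TestSameBytes is a name invented for this illustration. It relies on the implementation leaving padding alone when a member is stored, which common compilers do but which the language does not promise.

```c
#include <string.h>

/* Hypothetical stand-in for TEST; the real members are elided in the post. */
typedef struct test {
    char  tag;
    int   count;
    float weight;
} TEST;

/* Clear every byte of the object, padding included. */
void TestClear(TEST *t)
{
    memset(t, 0, sizeof *t);
}

/* Returns 1 when both objects have identical bytes, 0 otherwise. */
int TestSameBytes(const TEST *a, const TEST *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}

/* Builds two objects the same way and compares them byte for byte. */
int demo_cleared_compare(void)
{
    TEST sa, sb;

    TestClear(&sa);
    TestClear(&sb);
    sa.tag = 'x'; sa.count = 42; sa.weight = 1.5f;
    sb.tag = 'x'; sb.count = 42; sb.weight = 1.5f;
    return TestSameBytes(&sa, &sb);
}
```

Calling demo_cleared_compare() is expected to return 1 on typical implementations, because both objects started from all-zero bytes and received the same member values.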
In article <377738B4.97C7...@hls.via.at>, Helmut Leitner
<leit...@hls.via.at> wrote: > Let's assume that a and b have the same basic data type > (e.g. int, long, float, double ...).
> On many platforms it is possible to replace the > comparisions > a==b > a!=b > by > memcmp(&a,&b,sizeof(a))==0 > memcmp(&a,&b,sizeof(a))
Quite often it does not work with IEEE standard floating point numbers. Especially with +0.0 and -0.0 (they compare equal, but have different bit patterns) and NaN's (they compare different, even if they are identical).
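Both cases can be seen in a short sketch, assuming an IEEE-754 implementation that supports quiet NaNs (so that the NAN macro from <math.h> is available). The function names are made up for this illustration.

```c
#include <math.h>    /* NAN */
#include <string.h>  /* memcmp */

/* 1 if the two doubles have identical object representations. */
static int same_double_bytes(double x, double y)
{
    return memcmp(&x, &y, sizeof x) == 0;
}

/* +0.0 and -0.0 compare equal with == ... */
int zeros_equal_by_value(void)
{
    double pos = 0.0, neg = -0.0;
    return pos == neg;
}

/* ... but differ in the sign bit, so memcmp sees a difference. */
int zeros_equal_by_bytes(void)
{
    double pos = 0.0, neg = -0.0;
    return same_double_bytes(pos, neg);
}

/* A NaN compares unequal to everything, including a copy of itself ... */
int nan_equal_by_value(void)
{
    double q = NAN;
    double r = q;
    return q == r;
}

/* ... although the copy has the same bytes. */
int nan_equal_by_bytes(void)
{
    double q = NAN;
    double r = q;
    return same_double_bytes(q, r);
}
```

Under those assumptions the two zero functions disagree (value says equal, bytes say different), and the two NaN functions disagree the other way round.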
There have been machines with long double = 80 bit + 16 unused bits, for example the Motorola 68040 and 68020/30 with coprocessor. On these machines you cannot rely on memcmp at all.
And for structs and unions you will always have a problem because of padding.
Apart from this, it will work on most machines :-(
: Let's assume that a and b have the same basic data type : (e.g. int, long, float, double ...).
: On many platforms it is possible to replace the : comparisions : a==b : a!=b : by : memcmp(&a,&b,sizeof(a))==0 : memcmp(&a,&b,sizeof(a))
: What are the conditions that this will not yield the : desired results?
: Can someone give (an) implementation example(s), where : the reasons for this behaviour can be seen?
Sure. Recall the strange segment:offset scheme of addressing on the 80x86 real mode, where the actual address is computed as (segment * 16 + offset). Thus, many segment:offset pairs actually compute to the same address, and the equality operator for pointers must handle these. Typically, they normalize the pointer before comparison.
Using memcmp() will likely produce false negatives.
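The effect can be modelled on any machine with a small sketch. The struct farptr below is only a model of a real-mode segment:offset pair, not an actual compiler pointer type, and all names are hypothetical.

```c
#include <string.h>

/* Model of a real-mode far pointer (illustration only). */
struct farptr {
    unsigned short offset;
    unsigned short segment;
};

/* Linear address the pair refers to: segment * 16 + offset. */
unsigned long far_linear(struct farptr p)
{
    return (unsigned long)p.segment * 16UL + p.offset;
}

/* What a normalizing == would do: compare the addresses. */
int far_same_address(struct farptr a, struct farptr b)
{
    return far_linear(a) == far_linear(b);
}

/* What memcmp does: compare the stored bytes. */
int far_same_bytes(struct farptr a, struct farptr b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* 1234:0010 and 1235:0000 both name linear address 0x12350. */
int demo_far_same_address(void)
{
    struct farptr a = { 0x0010, 0x1234 };
    struct farptr b = { 0x0000, 0x1235 };
    return far_same_address(a, b);
}

int demo_far_same_bytes(void)
{
    struct farptr a = { 0x0010, 0x1234 };
    struct farptr b = { 0x0000, 0x1235 };
    return far_same_bytes(a, b);
}
```

The two pairs refer to the same address, yet their bytes differ, which is the false negative described above.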
: The background is the wish of fast and easy comparison : of multicomponent structures (within a perfect hash system): [...] : Is it possible to construct such a general system to work : in a portable way on all platforms?
In general, whenever more than one representation is possible for a single value, your proposal will fail. If you use a structure, you should also worry about the value of padding bytes, which probably means a mandatory memset() after any malloc().
: Would it be possible to construct an #if-expression
<< Is it possible to construct such a general system to work in a portable way on all platforms? >>
No, this is not possible. There are any number of reasons an entire structure's components will not compare successfully, even though each of the data components are comparable. Some have to do with differences between platforms, some have to do with differences between structures on a single platform.
Finally, you will try to compare a double and a float, arguing that they are really the same number. If you compare them directly, you will get the expected results. If you use memcmp(), you will not.
Helmut Leitner wrote in message <377738B4.97C7...@hls.via.at>... >Let's assume that a and b have the same basic data type >(e.g. int, long, float, double ...).
>On many platforms it is possible to replace the >comparisions > a==b > a!=b >by > memcmp(&a,&b,sizeof(a))==0 > memcmp(&a,&b,sizeof(a))
>What are the conditions that this will not yield the >desired results?
>Can someone give (an) implementation example(s), where >the reasons for this behaviour can be seen?
>The background is the wish of fast and easy comparison >of multicomponent structures (within a perfect hash system):
> typedef struct test { > char ... > int ... > float ... > ... > } TEST;
> TEST sa,sb; > memset(&sa,'\0',sizeof(TEST)); /* or TestClear(&sa); */ > memset(&sb,'\0',sizeof(TEST)); /* or TestClear(&sb); */ > ... > any standard component assignment. /* or TestSetComponent(&sa,comp) */ > ... > memcmp(&sa,&sb,sizeof(TEST));
>Is it possible to construct such a general system to work >in a portable way on all platforms?
>Would it be possible to construct an #if-expression
> Finally, you will try to compare a double and a float, arguing that they are > really the same number. If you compare them directly, you will get the > expected results. If you use memcmp(), you will not.
It's not clear to me what "compare them directly" means. Are you saying you can compare a float to a double using '=='? You can't, it won't work as expected.
> > Finally, you will try to compare a double and a float, arguing that they are > > really the same number. If you compare them directly, you will get the > > expected results. If you use memcmp(), you will not.
> It's not clear to me what "compare them directly" means. Are you saying > you can compare a float to a double using '=='? You can't, it won't work > as expected.
You should not even compare a double to a double using ==. The C FAQ addresses this:
14.5: What's a good way to check for "close enough" floating-point equality?
A: Since the absolute accuracy of floating point values varies, by definition, with their magnitude, the best way of comparing two floating point values is to use an accuracy threshold which is relative to the magnitude of the numbers being compared. Rather than
    double a, b;
    ...
    if(a == b) /* WRONG */
use something like
#include <math.h>
if(fabs(a - b) <= epsilon * fabs(a))
for some suitably-chosen degree of closeness epsilon (as long as a is nonzero!).
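Wrapped in a function, the FAQ expression might look like the sketch below. The name close_enough and the idea of passing epsilon as a parameter are assumptions made for this illustration; the FAQ only gives the expression itself. As the FAQ notes, the test is only meaningful when a is nonzero.

```c
#include <math.h>  /* fabs */

/* Relative comparison using the FAQ expression.
   eps is a closeness threshold chosen by the caller. */
int close_enough(double a, double b, double eps)
{
    return fabs(a - b) <= eps * fabs(a);
}
```

For example, close_enough(0.3, 0.1 + 0.2, 1e-9) accepts the usual rounding difference, while close_enough(1.0, 1.1, 1e-9) rejects values that really differ.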
Of course you can. You may not get what you expect, for an unrelated reason having to do with the binary storage type, but the compiler will promote the float to a double before making the comparison. This is not true for memcmp(), my original point.
FigBug wrote in message ... >> Finally, you will try to compare a double and a float, arguing that they are >> really the same number. If you compare them directly, you will get the >> expected results. If you use memcmp(), you will not.
>It's not clear to me what "compare them directly" means. Are you saying >you can compare a float to a double using '=='? You can't, it won't work >as expected.
In article <Pine.SUN.3.96.990628105138.16344A-100000@valdes>, FigBug <rrab...@csc.UVic.CA> wrote:
> > Finally, you will try to compare a double and a float, arguing that they are > > really the same number. If you compare them directly, you will get the > > expected results. If you use memcmp(), you will not.
> It's not clear to me what "compare them directly" means. Are you saying > you can compare a float to a double using '=='? You can't, it won't work > as expected.
It depends on how it's expected to work. You can compare a float and a double, and of course then the float gets converted to a double. The values are compared, and == must take into account the possibility of different representations for the same value. For memcmp, objects holding the two values probably wouldn't even have the same size.
(Round off errors could also cause unexpected results, but this is true of comparisons of two doubles as well and memcmp is not going to help the situation at all. You could consider an int and a long instead without changing the point.)
-- MJSR
> It depends on how it's expected to work. You can compare a > float and a double, and of course then the float gets converted > to a double. The values are compared, and == must take into > account the possibility of different representations for the same > value. For memcmp, objects holding the two values probably > wouldn't even have the same size.
> (Round off errors could also cause unexpected results, > but this is true of comparisons of two doubles as well and > memcmp is not going to help the situation at all. You could > consider an int and a long instead without changing the point.)
The point I was attempting to make was that the following is useful:

    int i;
    long int li;

    i = 1;
    li = 1;

    if (i == li) blaa();

and the following won't do anything useful:

    float f;
    double d;

    f = 1.1;
    d = 1.1;

    if (f == d) blaa();

Paul's post was unclear; it seemed to claim (to me anyway) that doing the above was useful.
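The difference between the two cases can be sketched as follows, assuming the usual binary floating-point formats in which 1.1 is not exactly representable but 0.5 is. The function names are invented for this illustration.

```c
/* i is converted to long before the comparison; 1 survives exactly. */
int int_long_equal(void)
{
    int i = 1;
    long li = 1;
    return i == li;
}

/* f is converted to double, but it already lost precision when 1.1
   was rounded to float, so it no longer matches the double 1.1. */
int float_double_equal_1_1(void)
{
    float f = 1.1f;
    double d = 1.1;
    return f == d;
}

/* 0.5 is exact in both types, so the comparison succeeds. */
int float_double_equal_0_5(void)
{
    float f = 0.5f;
    double d = 0.5;
    return f == d;
}
```

So the mixed comparison is well defined in both cases; whether it is useful depends on whether the value survives the narrower type unchanged.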
In article <377738B4.97C7...@hls.via.at> leit...@hls.via.at "Helmut Leitner" writes:
>Let's assume that a and b have the same basic data type >(e.g. int, long, float, double ...).
>On many platforms it is possible to replace the >comparisions > a==b > a!=b >by > memcmp(&a,&b,sizeof(a))==0 > memcmp(&a,&b,sizeof(a))
>What are the conditions that this will not yield the >desired results?
When different bit patterns represent the same value or at least values that compare equal. There are various reasons why this might happen.
Firstly any basic type except unsigned char can have "holes" or unused bits that don't contribute to the value. So these bits can differ (hence memcmp() will return non-zero) for values that compare equal.
In 1's complement and sign-magnitude format integers 0 and -0 have different bit representations but compare equal.
In floating point formats such as IEEE-754 there can also be different representations for 0 and -0. I believe in some floating point formats there are a large number of representations for zero. Some floating point formats such as IEEE-754 also create the opposite problem: NaNs compare unequal to anything, even other NaNs, so it is possible for values with identical bit patterns to compare unequal.
Pointers can have different representations for the same address (e.g. in segmented architectures like the 8086). They can also contain other information such as boundary information (these are sometimes called fat pointers). So pointers with different bit patterns can compare equal.
Structures and unions can include padding bytes. Since padding does not contribute to the structure or union's value I don't see anything that prohibits this padding from changing arbitrarily. There can also be padding bits where bit-fields are involved.
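Whether a given structure has padding on a given implementation can be checked with sizeof and offsetof. A small sketch, with a made-up struct sample:

```c
#include <stddef.h>  /* size_t, offsetof */

/* Hypothetical structure used only for this illustration. */
struct sample {
    char c;
    int  i;
};

/* Total bytes in the object that belong to no member. */
size_t sample_padding_bytes(void)
{
    return sizeof(struct sample) - (sizeof(char) + sizeof(int));
}

/* Bytes between the end of c and the start of i. */
size_t sample_gap_after_c(void)
{
    return offsetof(struct sample, i)
         - (offsetof(struct sample, c) + sizeof(char));
}
```

On many implementations both functions return a nonzero value for this layout; those are exactly the bytes that memcmp would compare even though no member owns them.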
>Can someone give (an) implementation example(s), where >the reasons for this behaviour can be seen?
>The background is the wish of fast and easy comparison >of multicomponent structures (within a perfect hash system):
> typedef struct test { > char ... > int ... > float ... > ... > } TEST;
> TEST sa,sb; > memset(&sa,'\0',sizeof(TEST)); /* or TestClear(&sa); */ > memset(&sb,'\0',sizeof(TEST)); /* or TestClear(&sb); */ > ... > any standard component assignment. /* or TestSetComponent(&sa,comp) */ > ... > memcmp(&sa,&sb,sizeof(TEST));
>Is it possible to construct such a general system to work >in a portable way on all platforms?
No.
>Would it be possible to construct an #if-expression
Probably the simplest thing to do would be to verify the platforms you are interested in yourself and define a macro you can test for those platforms which fit the bill. However, there are lots of different ways to fail.
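One way such a macro might be used is sketched below. The macro name TEST_BYTES_COMPARABLE, the function TestEqual and the structure members are all hypothetical; the macro would be defined by hand only for platforms that have been checked.

```c
#include <string.h>

/* Hypothetical structure standing in for TEST. */
typedef struct test {
    char tag;
    int  count;
} TEST;

#ifdef TEST_BYTES_COMPARABLE
/* Verified platform: byte comparison, assuming both objects were
   cleared before their members were assigned. */
int TestEqual(const TEST *a, const TEST *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
#else
/* Everywhere else: compare member by member. */
int TestEqual(const TEST *a, const TEST *b)
{
    return a->tag == b->tag && a->count == b->count;
}
#endif

/* Usage: returns 1 for equal member values, 0 otherwise. */
int demo_test_equal(int count_a, int count_b)
{
    TEST a, b;

    memset(&a, 0, sizeof a);
    memset(&b, 0, sizeof b);
    a.tag = 't'; a.count = count_a;
    b.tag = 't'; b.count = count_b;
    return TestEqual(&a, &b);
}
```

The member-by-member branch is the portable default; the memcmp branch is an optimization that is switched on per platform.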
-- ----------------------------------------- Lawrence Kirby | f...@genesis.demon.co.uk Wilts, England | 70734....@compuserve.com -----------------------------------------
>On Mon, 28 Jun 1999 10:56:20 +0200, Helmut Leitner ><leit...@hls.via.at> wrote:
>>Let's assume that a and b have the same basic data type >>(e.g. int, long, float, double ...).
>>On many platforms it is possible to replace the >>comparisions >> a==b >> a!=b >>by >> memcmp(&a,&b,sizeof(a))==0 >> memcmp(&a,&b,sizeof(a))
>If a and b are of the same basic data type then you wouldn't do this. >It's ugly and probably slower. Besides: what if a and/or b have the >storage class specifier "register"? Then it is illegal to use the >address-of operator on them.
The simple fix for that is to remove the register specifier. If you are writing the function containing the memcmp call you should have control of the local variables in the function.
-- ----------------------------------------- Lawrence Kirby | f...@genesis.demon.co.uk Wilts, England | 70734....@compuserve.com -----------------------------------------
<< [about comparing a float and a double] Paul's post was unclear, it seemed to claim (to me anyway) that doing the above was usefull. >>
It *is* useful. Typically the programmer will know the float will be converted to a double, then compared to a double. Acknowledging the various possible errors not related to the problem at hand, in many cases this is the same as comparing two doubles.
And, granted the limitations of direct comparison of floating-point data types, it cannot meaningfully be replaced with the memcmp() approach.
FigBug wrote in message ... >On Mon, 28 Jun 1999 rawc...@my-deja.com wrote:
>> It depends on how it's expected to work. You can compare a >> float and a double, and of course then the float gets converted >> to a double. The values are compared, and == must take into >> account the possibility of different representations for the same >> value. For memcmp, objects holding the two values probably >> wouldn't even have the same size.
>> (Round off errors could also cause unexpected results, >> but this is true of comparisons of two doubles as well and >> memcmp is not going to help the situation at all. You could >> consider an int and a long instead without changing the point.)
>The point I was attemping to make was, that the following is useful:
>int i; >long int li;
>i = 1; >li = 1;
>if (i == li) blaa();
>and the following won't do anything usefull:
>float f; >double d;
>f = 1.1; >d = 1.1;
>if (f == d) blaa();
>Paul's post was unclear, it seemed to claim (to me anyway) that doing the >above was usefull.
> << [about comparing a float and a double] Paul's post was unclear, it seemed > to claim (to me anyway) that doing the above was usefull. >>
> It *is* useful. Typically the programmer will know the float will be > converted to a double, then compared to a double. Acknowledging the various > possible errors not related to the problem at hand, in many cases this is > the same as comparing two doubles.
Quite right, a horrible gaffe (comparing doubles for equality), and indicates that the programmer is probably incompetent for numerical work.
> And, granted the limitations of direct comparison of floating-point data > types, it cannot meaningfully be replaced with the memcmp() approach.
In article <EhNd3.96085$_m4.1006...@news2.giganews.com> nos...@nosite.com "Paul Lutus" writes:
...
>Finally, you will try to compare a double and a float, arguing that they are >really the same number. If you compare them directly, you will get the >expected results. If you use memcmp(), you will not.
This isn't a problem since...
>Helmut Leitner wrote in message <377738B4.97C7...@hls.via.at>... >>Let's assume that a and b have the same basic data type
-- ----------------------------------------- Lawrence Kirby | f...@genesis.demon.co.uk Wilts, England | 70734....@compuserve.com -----------------------------------------
> << Is it possible to construct such a general system to work in a portable > way on all platforms? >>
> No, this is not possible. > There are any number of reasons an entire structure's > components will not compare successfully > even though each of the data components are comparable.
Even if the whole data structure is "initialized" in the best possible way (to be determined)?
> Some have to do with differences between platforms,
This is a bit cryptic to me. If there is no general, portable way to make memcmp() work, then of course there must be differences between platforms. But your statement doesn't tell anything...
> some have to do with differences between structures on a single platform.
If we define
    struct test sa,sb;
and initialize them properly (to be determined)
    TestClear(&sa);
    TestClear(&sb);
what can be the differences between these structures?
> Finally, you will try to compare a double and a float
Look at my first sentence (the basic assumption): "Let's assume that a and b have the same basic data type" This clearly excludes comparing pears and apples.
<< Even if the whole data structure is "initialized" in the best possible way (to be determined)? >>
Yes. Consider otherwise acceptable data types for C comparisons, such as a short and a long, or a float and a double (floating-point errors aside). This is apart from any platform-related issues.
<< Look at my first sentence (the basic assumption): "Let's assume that a and b have the same basic data type" This clearly excludes comparing pears and apples. >>
Not at all, and complying C compilers have no objection. Example:
    #include <stdio.h>

    struct testA {
        short orange;
        long apple;
    };

    struct testB {
        long orange;
        short apple;
    };

    struct testC {
        float orange;
        double apple;
    };

    struct testD {
        double orange;
        float apple;
    };

    void testPrint(char *prompt, int v)
    {
        printf("%s: %s\n",prompt,(v)?"True":"False");
    }

    int main()
    {
        struct testA w = {333,333};
        struct testB x = {333,333};
        struct testC y = {333.125,333.125};
        struct testD z = {333.125,333.125};
Helmut Leitner wrote in message <37786F93.D74F...@hls.via.at>...
>Paul Lutus wrote:
>> << Is it possible to construct such a general system to work in a portable >> way on all platforms? >>
>> No, this is not possible.
>> There are any number of reasons an entire structure's >> components will not compare successfully >> even though each of the data components are comparable.
>Even if the whole data structure is "initialized" in >the best possible way (to be determined)?
>> Some have to do with differences between platforms,
>This is a bit cryptic to me. If there is no general, portable >way to make memcmp() work, then of course there must be differences >between platforms. But your statement doesn't tell anything...
>> some have to do with differences between structures on a single platform.
>If we define > struct test sa,sb; >and initialize them properly (to be determined) > TestClear(&sa); > TestClear(&sb); >what can be the differences between these structures.
>> Finally, you will try to compare a double and a float
>Look at my first sentence (the basic assumption): > "Let's assume that a and b have the same basic data type" >This clearly excludes comparing pears and apples.
> << Look at my first sentence (the basic assumption): "Let's assume that a > and b have the same basic data type" This clearly excludes comparing pears > and apples. >>
> Not at all, and complying C compilers have no objection. Example:
> #include <stdio.h>
> struct testA { > short orange; > long apple; > };
> struct testB { > long orange; > short apple; > };
> Q.E.D. -- memcmp() is asking for trouble, even on the same platform.
Paul - you are twisting the sense of the thread right out of shape.
The original question was:
"Let's assume that a and b have the same basic data type (e.g. int, long, float, double ...). On many platforms it is possible to replace the comparisions a==b, a!=b by memcmp(&a,&b,sizeof(a))==0, memcmp(&a,&b,sizeof(a)) - What are the conditions that this will not yield the desired results?"
Now, there is no way that you can equate y.orange == z.orange with y == z without twisting reality somewhat. (Comparison of structs using == is of course not legal C.) And since by your definition y.orange is a float and z.orange is a double, you are off-beam there too, because a float is not a double.
You have set up a straw man and you are now giving him a good pasting. But this does not contribute in any way to answering the original question.
It pains me to see a usually logically-minded person "losing it" in this fashion. /Please/ read before you think before you post.
(This whole reply would have gone to email if only I had an email address to send it to. I'm tempted to set one up for you myself.)
>>On Mon, 28 Jun 1999 10:56:20 +0200, Helmut Leitner >><leit...@hls.via.at> wrote:
>>>Let's assume that a and b have the same basic data type >>>(e.g. int, long, float, double ...).
>>>On many platforms it is possible to replace the >>>comparisions >>> a==b >>> a!=b >>>by >>> memcmp(&a,&b,sizeof(a))==0 >>> memcmp(&a,&b,sizeof(a))
>>If a and b are of the same basic data type then you wouldn't do this. >>It's ugly and probably slower. Besides: what if a and/or b have the >>storage class specifier "register"? Then it is illegal to use the >>address-of operator on them.
>The simple fix for that is to remove the register specifier. If you >are writing the function containing the memcmp call you should have >control of the local variables in the function.
If you would want to do that, sure. But the question was about how equal the memcmp version was to the equality operator version with a and b being of the same basic data type.
I've seen interesting incompatibilities posted which I didn't think of myself (how a pointer is stored, the floats (I never do floats) and particular bit representation schemes for negative numbers) but mine is the only one that results in an error :-)
> In article <377738B4.97C7...@hls.via.at> > leit...@hls.via.at "Helmut Leitner" writes:
> >Let's assume that a and b have the same basic data type > >(e.g. int, long, float, double ...).
> >On many platforms it is possible to replace the > >comparisions > > a==b > > a!=b > >by > > memcmp(&a,&b,sizeof(a))==0 > > memcmp(&a,&b,sizeof(a))
> >What are the conditions that this will not yield the > >desired results?
> When different bit patterns represent the same value or at least > values that compare equal. There are various reasons why this might > happen.
> Firstly any basic type except unsigned char can have "holes" or unused
> bits that don't contribute to the value.
That's what I am trying to *really* understand. At the moment (with your help and the others' contributions) I see five different problems.
A. Clearing the structures including all bits and bytes that may not be directly used for component content.
Solving this problem would result in a StructClear(void *p,size_t size); My naive implementation for this was memset(p,'\0',size);
Might it be possible that this does not reach all bits in memory? Could a fp-only implementation (say 96 bit cpu) use 64 mantissa-bits for char/int/long/... ? So that the memset above would not delete all bits used for fp data representation? Might it be possible in such cases to switch to a different basic data type for the clearing process? Can such a data type be selected in a portable way?
B. Assignment to structure components without using ambiguous bit representations.
Solving this problem would result in a set of assignment-functions:
    #include <math.h>    /* fpclassify, INFINITY, NAN */
    #include <string.h>  /* memset, strcpy */

    void IntSetVal(int *pi,int val)
    {
        if(val==0) {
            *pi=0; /* to avoid the +0/-0 problem */
        } else {
            *pi=val;
        }
    }

    void StrSizeSetVal(char *d,size_t size,char *s)
    /* usable only for "flat" string components */
    {
        memset(d,'\0',size); /* clear the whole component first */
        strcpy(d,s); /* no safety intended */
    }

    void DoubleSetVal(double *pd,double val)
    {
        switch(fpclassify(val)) {
        case FP_INFINITE:
            *pd=(val<0) ? -INFINITY : INFINITY; /* keep the sign */
            break;
        case FP_NAN:
            *pd=NAN;
            break;
        case FP_ZERO:
            *pd=0.0;
            break;
        case FP_SUBNORMAL:
            *pd=val; /* no clue what to do here */
            break;
        case FP_NORMAL:
        default:
            *pd=val;
            break;
        }
    }
and similar functions: void LongSetVal(...) void FloatSetVal(...) ....
This leaves the problem of ambiguous pointer representation open. Although for some cases (DOS segmented pointer) it might be easy to normalize the pointers, I don't know what problems could arise on other platforms.
C. Comparing the data structures sa and sb without leaving out bits that are used by some component data representation.
Solving this problem would result in a reliable struct-memory-compare functions.
    int StructCmp(void *psa,void *psb,size_t size)
    {
        char *pa=psa; /* char pointers: arithmetic on void * is not standard C */
        char *pb=psb;
        int ret=memcmp(pa,pb,size);
        if(ret==0) {
            /* what code needed here ??? */
            /* absurd, only to fill the void: */
            while(size>sizeof(double) && ret==0) {
                ret= (*(double *)pa != *(double *)pb);
                size-=sizeof(double);
                pa+=sizeof(double);
                pb+=sizeof(double);
            }
        }
        return(ret);
    }
My naive implementation for this was memcmp(psa,psb,sizeof(*psa));
D. Similar to C, the problem of getting at all bits that carry component information, e.g. for calculating a hash value. Will be solved when C is solved.
E. May the system touch/change unused padding bytes within a structure at will?
> So these bits can differ (hence > memcmp() will return non-zero) for values that compare equal.
> In 1's complement and sign-mangitde format integers 0 and -0 have > different bit representations but compare equal.
I tried to account for this in the assignment function.
> In floating point formats such as IEEE-754 there can also be different > representations for 0 and -0. I believe in some floating point > formats there are a large number of representations for zero. Some > floating point formats such as IEEE-754 also create the opposite > problem, NaNs compare unquual to anything even other NaNs so it is possible > for values with identical bit patterns to compare unequal.
I tried to simplify the problem by reducing it. In a real system I would not allow NaNs to creep in, so this problem would not give me bad dreams.
> Pointers can have different representations for the same address (e.g. > in segnemted archatectures like the 8086). They can also contain other > information such as boundary information (these are sometimes called fat > pointers). So pointers with different bit patterns can compare equal.
Has this to do with access rights?
Would stripping these "fat pointer bits" be possible?
Could we assume that after
    char PiStr[]="3.14159";
    ...
    char *p=PiStr;
    ... /* maybe different module */
    char *q=PiStr;
the pointers p and q have the same bit representation?
> Structures and unions can include padding bytes. Since padding does not > contribute to the structure or union's value I don't see anything that > prohibits this padding from changing arbitrarily. There can also be > padding bits where bit-fields are involved.
I try to clear all these bits and bytes and hope/expect that "the system" will not quietly change them. Is this unreasonable?
> Lawrence Kirby wrote: > > >What are the conditions that this will not yield the > > >desired results?
> > When different bit patterns represent the same value or at least > > values that compare equal. There are various reasons why this might > > happen.
> > Firstly any basic type except unsigned char can have "holes" or unused > > bits that don't contribute to the value.
> That's what I try to *really* understand. > At the moment (with your help and the other's contributions) > I see five different problems.
> A. Clearing the structures including all bits and bytes that > may not be directly used for component content.
> Solving this problem would result in a > StructClear(void *p,size_t size); > My naive implementation for this was > memset(p,'\0',size);
> Might it be possible that this does not reach all bits > in memory?
No, memset looks at an object as an array of unsigned chars; since there are no holes or paddings in unsigned char, memset reaches all bits.
> B. Assignment to structure components without using > ambiguous bit representations.
> Solving this problem would result in a set of > assignment-functions:
> void IntSetVal(int *pi,int val) > { > if(val==0) { > *pi=0; /* to avoid the +0/-0 problem */ > } else { > *pi=val; > } > }
> void StrSizeSetVal(char *d,size_t size,char *s) > /* usable only for "flat" string components */ > { > memset(d,'0',size); > strcpy(d,s); /* no safety intended */ > }
> void DoubleSetVal(double *pd,double val) > { > switch(fpclassify(val)) { > case FP_INFINITE: > *pd=INFINITE; > break; > case FP_NAN: > *pd=NAN; > break; > case FP_ZERO: > *pd=0.0; > break; > case FP_SUBNORMAL: > *pd=val; /* no clue, what to do here > break; > case FP_NORMAL: default: > *pd=val; > break; > } > }
> and similar functions: > void LongSetVal(...) > void FloatSetVal(...) > ....
> This leaves the problem of ambiguous pointer representation > open. Although for some cases (DOS segmented pointer) it might > be easy to normalize the pointers, I don't know what problems > could arise on other platforms.
The problem is that no data type except unsigned char is guaranteed to be free of holes in its binary representation. So, for example, after
    int a,b;
    a = 1;
    b = 1;
memcmp(&a,&b,sizeof(int)) is not guaranteed to compare equal.
> C. Comparing the data structures sa and sb without leaving > out bits that are used by some component data representation.
> Solving this problem would result in a reliable > struct-memory-compare functions.
> int StructCmp(void *psa,void *psb,size_t size) > { > int ret=memcmp(psa,psb,size); > if(ret==0) { > /* what code needed here ??? */ > /* absurd, only to fill the void: */ > while(size>sizeof(double) && ret==0) { > ret= (*(double *)psa != *(double *)psb) > size-=sizeof(double); > psa+=sizeof(double); > psb+=sizeof(double); > } > } > return(ret); > }
> My naive implementation for this was > memcmp(psa,psb,sizeof(*psa));
If memcmp returns 0, the structures *are* equal; the problem is when it returns non-zero: they may still be equal apart from the padding bits.
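The non-zero case can be provoked deliberately, as in the sketch below. The struct pair and all function names are made up for this illustration, and the byte comparison only differs on implementations where the structure actually contains padding and member stores leave it untouched.

```c
#include <string.h>

/* Hypothetical structure; many implementations pad after c. */
struct pair {
    char c;
    int  i;
};

/* Fill the whole object with a byte pattern, then store the members. */
static void pair_init(struct pair *p, int fill, char c, int i)
{
    memset(p, fill, sizeof *p);
    p->c = c;
    p->i = i;
}

/* Same member values, different leftover bytes in the padding. */
int demo_members_equal(void)
{
    struct pair a, b;

    pair_init(&a, 0x00, 'k', 5);
    pair_init(&b, 0xFF, 'k', 5);
    return a.c == b.c && a.i == b.i;
}

int demo_bytes_equal(void)
{
    struct pair a, b;

    pair_init(&a, 0x00, 'k', 5);
    pair_init(&b, 0xFF, 'k', 5);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Where padding exists, demo_members_equal() reports equality while demo_bytes_equal() typically does not, because the padding still holds the two different fill patterns.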
> D. Similar to C, the problem of getting at all bits that > carry component information e.g. for calulating a hash > value. Will be solved when C ist solved.
> E. May the system touch/change unused padding bytes within > a structure at will?
> > So these bits can differ (hence > > memcmp() will return non-zero) for values that compare equal.
> > In 1's complement and sign-magnitude format integers 0 and -0 have > > different bit representations but compare equal.
> I tried to account for this in the assignment function.
> > In floating point formats such as IEEE-754 there can also be different > > representations for 0 and -0. I believe in some floating point > > formats there are a large number of representations for zero. Some > > floating point formats such as IEEE-754 also create the opposite > > problem, NaNs compare unequal to anything even other NaNs so it is possible > > for values with identical bit patterns to compare unequal.
> I tried to simplify the problem by reducing it. > In a real system I would not allow NaNs to creep in, so > this problem would not give me bad dreams.
> > Pointers can have different representations for the same address (e.g. > > in segmented architectures like the 8086). They can also contain other > > information such as boundary information (these are sometimes called fat > > pointers). So pointers with different bit patterns can compare equal.
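The two floating point cases mentioned above can be checked with a small sketch. It assumes an IEEE-754 platform (on other formats the results may differ), and the function names are made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: 1 if the object representations of the two
   doubles are byte-for-byte identical, as memcmp() sees them. */
int double_same_bits(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Hypothetical helper: 1 if memcmp() and == give different answers
   for this pair of values. */
int double_cmp_disagrees(double a, double b)
{
    int same_bits  = double_same_bits(a, b);
    int same_value = (a == b);
    return same_bits != same_value;
}
```

Under the IEEE-754 assumption, double_cmp_disagrees(0.0, -0.0) should report a disagreement (equal values, different bits), and so should passing the same NaN twice (identical bits, unequal values). For ordinary values such as 1.5 the two tests agree.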
> Has this to do with access rights?
> Would stripping these "fat pointer bits" be possible?
> Could we assume, that after > char PiStr[]="3.14159"; > ... > char *p=PiStr; > ... /* maybe different module */ > char *q=PiStr; > the pointers p and q have the same bit representation?
No.
> > Probably the simplest thing to do would be to verify the platforms > > you are interested in yourself
> No problem here. It is a question of portability.
> > and define a macro you can test for > > those platforms which fit the bill. However there are lots of > > different ways to fail.
> I'll try to understand as many of them as possible.
The worst possible problem is padding bits in basic types; bitfields also don't fit your approach. You may zero all bits of your structure/object prior to assignment, but that doesn't buy you much: if the type representation has holes, you may (and probably will) receive the padding from the rvalue.
-- Regards, Alex Krol Disclaimer: I'm not speaking for Scitex Corporation Ltd
Sent via Deja.com http://www.deja.com/ Share what you know. Learn what you don't.
> The point I was attempting to make was, that the following is useful:
> int i; > long int li;
> i = 1; > li = 1;
> if (i == li) blaa();
> and the following won't do anything useful:
> float f; > double d;
> f = 1.1; > d = 1.1;
> if (f == d) blaa();
> Paul's post was unclear, it seemed to claim (to me anyway) that doing the > above was useful.
That certainly wasn't clear from your original post; it would have helped to mention long and int. Despite taking part in a long thread on == with floating point, I still don't even know for sure whether double a,b; a=1.1; b=a; implies that a==b afterward, let alone declaring a as float and leaving b as double. But I am certain that memcmp could only be worse.
-- MJSR
[...] : At the moment (with your help and the other's contributions) : I see five different problems. [...]
Wouldn't the solution to these five problems outweigh the cost of just writing a function to compare the two structures? If I was maintaining this program, I'd probably find the mechanism surprising.
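For comparison, a hand-written comparison function is short. The following is only a sketch: struct record and its member names are invented here, loosely modelled on the TEST structure from the original post.

```c
#include <string.h>

/* Hypothetical record; the member names are assumptions. */
struct record {
    char  tag;      /* padding bytes commonly follow this member */
    int   count;
    float weight;
    char  name[8];  /* NUL-terminated string */
};

/* Compare by value, member by member. Padding bytes and the bytes
   after the terminating NUL in name are never inspected.
   Returns 1 if all members are equal, 0 otherwise. */
int record_equal(const struct record *a, const struct record *b)
{
    return a->tag == b->tag
        && a->count == b->count
        && a->weight == b->weight
        && strcmp(a->name, b->name) == 0;
}
```

Because only the members are looked at, the result does not depend on whether the structures were cleared with memset first, or on what the compiler leaves in the padding. The float member still follows the usual == rules, so a NaN never compares equal and 0 and -0 do.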
hu...@mnsinc.com (Szu-Wen Huang) writes: > Sure. Recall the strange segment:offset scheme of addressing > on the 80x86 real mode, where the actual address is computed > as (segment * 16 + offset). Thus, many segment:offset pairs > actually compute to the same address, and the equality operator > for pointers must handle these. Typically, they normalize the > pointer before comparison.
Not necessarily. Under 8086, all "chunks" of memory have to sit within a 64k segment.
Imagine an arbitrary array of 10 ints, at 0x1234:0x0004. When moving a pointer about this array, only the offset needs changing. a[8] can be found at 0x1234:0x0014 (assuming 2 byte ints). It's convenient to calculate the offset part and leave the segment well alone.
Later on, somehow the pointer 0x1233:0x0024 came along, and the code compared this pointer with &a[8] using ==. Using a simple bit compare, a mismatch would be reported, even though it points to the same address.
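To make the arithmetic concrete, here is a small model of the two pointers. It is only a sketch: struct far_ptr and the function names are invented for illustration and are not any compiler's actual pointer layout.

```c
#include <stdint.h>

/* Hypothetical model of an 8086 real-mode segment:offset pointer. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Linear address as described above: segment * 16 + offset.
   (Wrap-around at the 1 MB boundary is ignored in this sketch.) */
uint32_t far_linear(struct far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* "Simple bit compare": both halves must match exactly. */
int far_same_bits(struct far_ptr a, struct far_ptr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}

/* Normalising compare: equal if both refer to the same linear address. */
int far_same_address(struct far_ptr a, struct far_ptr b)
{
    return far_linear(a) == far_linear(b);
}
```

With 0x1234:0x0014 and 0x1233:0x0024, far_linear gives 0x12354 for both, so far_same_address reports a match while far_same_bits does not.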
But wait! How did we get a pointer with value 0x1233:0x0024 in the first place?
Underflowing or overflowing an array subscript? Bzzz. Undefined behaviour. Pointer arithmetic outside an object's bounds? Bzzz. Undefined behaviour. (int*)(0x12330024) ? Implementation defined.
Point is, a compiler targeting 8086 can get away with not normalising a pointer, so long as it does not do any normalising anywhere else. (Documenting the effect of comparing casted pointers notwithstanding.) If just one pointer gets normalised, then all pointers have to be normalised.
Anyway, I may have missed a legal way to get two representations of a single pointer under my scheme, so look for any follow-ups before responding.