Google Mail Calendar Documents Reader Web more »
Recently Visited Groups | Help | Sign in
Google Groups Home
Switch statement code generation
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  17 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Follow-up To:
Add Cc | Add Follow-up to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers that you hear
 
Joshua Cranmer  
View profile   Translate to Translated (View Original)
 More options 3 Nov, 13:01
Newsgroups: comp.compilers
From: Joshua Cranmer <Pidgeo...@verizon.invalid>
Date: Tue, 03 Nov 2009 08:01:56 -0500
Local: Tues 3 Nov 2009 13:01
Subject: Switch statement code generation
I'm trying to put together a list of various methods to do code generate
for switch statements.

This is what I have so far:
* Successive if/else branches
* Jump tables
* branch tables with linear and binary scans
* Clustering tables (e.g., if you have cases 0-100 and 300-400).

Are there any other methods worth mentioning?

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
[I think I've seen hashing into branch tables. -John]


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
glen herrmannsfeldt  
View profile   Translate to Translated (View Original)
 More options 4 Nov, 05:27
Newsgroups: comp.compilers
From: glen herrmannsfeldt <g...@ugcs.caltech.edu>
Date: Wed, 4 Nov 2009 05:27:45 +0000 (UTC)
Local: Wed 4 Nov 2009 05:27
Subject: Re: Switch statement code generation

Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:
> I'm trying to put together a list of various methods to do code generate
> for switch statements.
> This is what I have so far:
> * Successive if/else branches
> * Jump tables
> * branch tables with linear and binary scans
> * Clustering tables (e.g., if you have cases 0-100 and 300-400).

The two I have seen done on S/370 are an indexed branch to a table of
branch instructions, and indexed load of an address to a register, and
then BR to that address.  I presume those are two of the above.  (IBM
programs commonly have return codes that are multiples of four,
presumably to make the indexing easier.)

> Are there any other methods worth mentioning?

VAX has the CASE instruction, most likely a microcode implementation
of one of the above.  When VAX was new, there were stories that many
of the special instructions (such as POLY and INDEX) were slower than
doing the same operation using separate instructions.

I believe that JVM also has a special instruction for this
operation, as the above likely don't work there.

> [I think I've seen hashing into branch tables. -John]

Sounds better than branching into hash tables.

-- glen
[On machines with condition codes, I've seen binary searches expanded
into code with compare and branch instructions; after the comparison
you can do a three way branch on less, equal, or greater. -John]


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Kaz Kylheku  
View profile   Translate to Translated (View Original)
 More options 4 Nov, 06:08
Newsgroups: comp.compilers
From: Kaz Kylheku <kkylh...@gmail.com>
Date: Wed, 4 Nov 2009 06:08:48 +0000 (UTC)
Local: Wed 4 Nov 2009 06:08
Subject: Re: Switch statement code generation
On 2009-11-03, Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:

> I'm trying to put together a list of various methods to do code generate
> for switch statements.

> This is what I have so far:
> * Successive if/else branches
> * Jump tables
> * branch tables with linear and binary scans
> * Clustering tables (e.g., if you have cases 0-100 and 300-400).

> Are there any other methods worth mentioning?

if/else arranged in a binary search tree, rather than linear succession.
Analogous to your branch table with binary scan, but using code
rather than a table to reduce the comparions for N labels to log N.

E.g. cases are:

10 50 90 1000 1200 3000 4000 7500 12345

  if (x < 3000) {
     if (x < 90) {
       if (x == 10) {
          ...
       } else if (x == 50) {
          ...
       }
     } else {
       if (x == 90) {
         ...
       } else if (x == 1000) {
         ...
       } else if (x == 1200) {
         ...
       }
     }
  } else {
    /* similarly for 3000-12345 range */
  }

This can combine with clustering.  Binary search to classify an input
value into cluster ranges, then check the range and use the table for that
range, or default out.


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
BGB / cr88192  
View profile   Translate to Translated (View Original)
 More options 4 Nov, 07:13
Newsgroups: comp.compilers
From: "BGB / cr88192" <cr88...@hotmail.com>
Date: Wed, 4 Nov 2009 00:13:39 -0700
Local: Wed 4 Nov 2009 07:13
Subject: Re: Switch statement code generation

"glen herrmannsfeldt" <g...@ugcs.caltech.edu> wrote in message
> Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:
>> I'm trying to put together a list of various methods to do code generate
>> for switch statements.

<snip>

> I believe that JVM also has a special instruction for this
> operation, as the above likely don't work there.

I think both the JVM and MSIL/CIL have special instructions for jump
tables...

>> [I think I've seen hashing into branch tables. -John]

> Sounds better than branching into hash tables.

> -- glen
> [On machines with condition codes, I've seen binary searches expanded
> into code with compare and branch instructions; after the comparison
> you can do a three way branch on less, equal, or greater. -John]

yeah...

in my compiler (on x86 and x64) I ended up using condition codes to
implement a trinary jump of this sort.

I had intended this for implementing slightly faster switches (errm...
because my compiler does not have jump tables... lame, I know, just never
got around to it...).

from what I remember, I do a binary lookup to handle switches.

then again, I am still not using my compiler as a "general purpose" C
compiler, and so there are still a few deficiencies I have never really
gotten around to fixing.

so many other things to work on, such as the x86 interpreter (now works ok,
and is ~170x slower than native, but still lacks a good deal in terms of API
bindings, me left thinking I may need some sort of IDL-like technology for
this, ...), ...

maybe just a whole lot of busywork, I don't know...


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ian Lance Taylor  
View profile   Translate to Translated (View Original)
 More options 4 Nov, 07:50
Newsgroups: comp.compilers
From: Ian Lance Taylor <i...@airs.com>
Date: Tue, 03 Nov 2009 23:50:53 -0800
Local: Wed 4 Nov 2009 07:50
Subject: Re: Switch statement code generation

Joshua Cranmer <Pidgeo...@verizon.invalid> writes:
> I'm trying to put together a list of various methods to do code generate
> for switch statements....

> Are there any other methods worth mentioning?

There are a number of different methods listed in
http://ols.fedoraproject.org/GCC/Reprints-2008/sayle-reprint.pdf .

Ian


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
glen herrmannsfeldt  
View profile   Translate to Translated (View Original)
 More options 4 Nov, 08:51
Newsgroups: comp.compilers
From: glen herrmannsfeldt <g...@ugcs.caltech.edu>
Date: Wed, 4 Nov 2009 08:51:49 +0000 (UTC)
Local: Wed 4 Nov 2009 08:51
Subject: Re: Switch statement code generation
glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:

(snip, someone wrote)

> [On machines with condition codes, I've seen binary searches expanded
> into code with compare and branch instructions; after the comparison
> you can do a three way branch on less, equal, or greater. -John]

I once saw in a Fortran book the suggestion that one use arithmetic
IF in place of computed GOTO with three labels.

      IF(N-2) 1,2,3

instead of

      GOTO (1,2,3),N

For modern processors depending on branch prediction logic,
either a branch table or address table will likely foil any
prediction system.  Complicated conditional branching is likely
faster in that case.

-- glen
[Hey, it worked great on the 709. -John]


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Anton Ertl  
View profile   Translate to Translated (View Original)
 More options 4 Nov, 10:51
Newsgroups: comp.compilers
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Wed, 04 Nov 2009 10:51:34 GMT
Local: Wed 4 Nov 2009 10:51
Subject: Re: Switch statement code generation

Joshua Cranmer <Pidgeo...@verizon.invalid> writes:
>I'm trying to put together a list of various methods to do code generate
>for switch statements.
...
>Are there any other methods worth mentioning?

Hash-table-based jump tables.

IIRC the following paper discusses these methods:

@Article{journals/spe/KannanP94,
  title =       "Short Communication: Correction to 'Producing Good
                 Code for the case Statement'",
  author =      "Sampath Kannan and Todd A. Proebsting",
  journal =     "Softw, Pract. Exper",
  year =        "1994",
  number =      "2",
  volume =      "24",
  bibdate =     "2003-11-25",
  bibsource =   "DBLP,
                 http://dblp.uni-trier.de/db/journals/spe/spe24.html#KannanP94",
  pages =       "233",

}

and this refers to:

@Article{bernstein:85a,
  author =      "R. L. Bernstein",
  title =       "Producing good code for the case statement",
  journal =     "Software -- Practice \& Experience",
  volume =      "15",
  number =      "10",
  pages =       "1021--1024",
  month =       oct,
  year =        "1985",

}

These days, with the varying amounts of branch prediction hardware,
the best way probably depends on the hardware (which branch predictors
are implemented in the hardware?) and on the run-time characteristics
of the program (how effective are the branch predictors on this
particular usage pattern).

- anton
--
M. Anton Ertl
an...@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Joshua Cranmer  
View profile   Translate to Translated (View Original)
 More options 4 Nov, 12:38
Newsgroups: comp.compilers
From: Joshua Cranmer <Pidgeo...@verizon.invalid>
Date: Wed, 04 Nov 2009 07:38:43 -0500
Local: Wed 4 Nov 2009 12:38
Subject: Re: Switch statement code generation
On 11/04/2009 12:27 AM, glen herrmannsfeldt wrote:

> I believe that JVM also has a special instruction for this
> operation, as the above likely don't work there.

Actually two special bytecodes, lookupswitch and tableswitch. These are
the linear scan and the jump table methods, respectively. The upcoming
string switch appears to use a hash routine.

> [On machines with condition codes, I've seen binary searches expanded
> into code with compare and branch instructions; after the comparison
> you can do a three way branch on less, equal, or greater. -John]

It seems both gcc and llvm emit code like this; I've added it to my list
already.

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Walter Banks  
View profile   Translate to Translated (View Original)
 More options 5 Nov, 20:12
Newsgroups: comp.compilers
From: Walter Banks <wal...@bytecraft.com>
Date: Thu, 05 Nov 2009 15:12:30 -0500
Local: Thurs 5 Nov 2009 20:12
Subject: Re: Switch statement code generation

Joshua Cranmer wrote:
> I'm trying to put together a list of various methods to do code generate
> for switch statements.

> This is what I have so far:
> * Successive if/else branches

I have seen / have implemented  if/else as either buried as part of the
switch body or as a separate if/else code block that jumps into the
switch body. The determining factor is the distance a conditional
branch can jump.

Regards,

Walter..
--
Walter Banks
Byte Craft Limited
Tel. (519) 888-6911
http://www.bytecraft.com


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
bartc  
View profile   Translate to Translated (View Original)
 More options 5 Nov, 22:11
Newsgroups: comp.compilers
From: "bartc" <ba...@freeuk.com>
Date: Thu, 05 Nov 2009 22:11:44 GMT
Local: Thurs 5 Nov 2009 22:11
Subject: Re: Switch statement code generation
"Joshua Cranmer" <Pidgeo...@verizon.invalid> wrote in message

news:09-11-009@comp.compilers...

> On 11/04/2009 12:27 AM, glen herrmannsfeldt wrote:
>> I believe that JVM also has a special instruction for this
>> operation, as the above likely don't work there.

> Actually two special bytecodes, lookupswitch and tableswitch. These are
> the linear scan and the jump table methods, respectively. The upcoming
> string switch appears to use a hash routine.

Hey, that's what I use.. invented independently of course.

'switch' for an indexed jump table, for case values in the range of 500 or
1000 (between minimum and maximum values).
'cswitch' for other case values, which just does a linear search.

Since these implement 'Switch' for an interpreted language, with the cswitch
bytecode implemented in tight assembler, opting for a linear search doesn't
have as much impact as it would do in a compiled language, especially if the
most common cases appear first.

However, with a big enough table size then I guess a proper lookup is called
for.

--
Bartc


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
nathan.mccauley@gmail.com  
View profile   Translate to Translated (View Original)
 More options 8 Nov, 04:52
Newsgroups: comp.compilers
From: "nathan.mccau...@gmail.com" <nathan.mccau...@gmail.com>
Date: Sat, 7 Nov 2009 20:52:09 -0800 (PST)
Local: Sun 8 Nov 2009 04:52
Subject: Re: Switch statement code generation
On Nov 5, 12:12 pm, Walter Banks <wal...@bytecraft.com> wrote:

> Joshua Cranmer wrote:
> > I'm trying to put together a list of various methods to do code generate
> > for switch statements. ...

Has anyone seen tries used for the string-based switch statements?

    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hans-Peter Diettrich  
View profile   Translate to Translated (View Original)
 More options 8 Nov, 19:09
Newsgroups: comp.compilers
From: Hans-Peter Diettrich <DrDiettri...@aol.com>
Date: Sun, 08 Nov 2009 20:09:24 +0100
Local: Sun 8 Nov 2009 19:09
Subject: Re: Switch statement code generation
nathan.mccau...@gmail.com schrieb:

>>> I'm trying to put together a list of various methods to do code generate
>>> for switch statements. ...

> Has anyone seen tries used for the string-based switch statements?

What dou you expect?

Visual Basic had such a feature, implemented as a series of string
compares and jumps.

The typical solution is a lookup in a (sorted, hashed...) string list,
followed by an "ordinary" switch with the index of the found string. The
target address could be attached to every string in that list, so that a
second switch is not required.

A string tree might give the best lookup performance.

DoDi

[If there's a lot of strings, a patricia trie might be faster than
hashing or binary search since it doesn't require repeated scans of
the string during the match process. -John]


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Ray  
View profile   Translate to Translated (View Original)
 More options 9 Nov, 16:04
Newsgroups: comp.compilers
From: Ray <b...@sonic.net>
Date: Mon, 09 Nov 2009 08:04:39 -0800
Local: Mon 9 Nov 2009 16:04
Subject: Re: Switch statement code generation

nathan.mccau...@gmail.com wrote:
> On Nov 5, 12:12 pm, Walter Banks <wal...@bytecraft.com> wrote:
>> Joshua Cranmer wrote:
>> > I'm trying to put together a list of various methods to do code
>> > generate for switch statements. ...

> Has anyone seen tries used for the string-based switch statements?

No... Not tries exactly.  Since compilation happens on static data, it's
easy to build a simple balanced binary tree.  The extra complication of
tries when you're not going to be doing insertions and deletions isn't
worth it.

I think the winner for string switches is a string representation with
a precomputed hash code, used in combination with a hashtable of
jump addresses.

Because the elements are known at compilation time, you can know in
advance the number of collisions in the longest bucket when you're
allocating the table, which means you can just allocate a table of
NxM entities where N is the number of buckets (a power of 2) and M
is the longest length needed (rounded up to a power of 2).

So when you get the string's hash code, you do a bitmask to get the
index into the hash table, and then check to see if you have a matching
hash code(success) or a zero hashcode (failure) there - if not,
increment the index by one until you do (a degenerate case of
"rehashing" that you can use here because you don't have to deal
with the possibility of making space for arbitrary insertions) and
then you have the jump address.

                                Bear


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Hans-Peter Diettrich  
View profile   Translate to Translated (View Original)
 More options 11 Nov, 07:52
Newsgroups: comp.compilers
From: Hans-Peter Diettrich <DrDiettri...@aol.com>
Date: Wed, 11 Nov 2009 08:52:58 +0100
Local: Wed 11 Nov 2009 07:52
Subject: Re: Switch statement code generation
Hans-Peter Diettrich schrieb:

> A string tree might give the best lookup performance.
> [If there's a lot of strings, a patricia trie might be faster than
> hashing or binary search since it doesn't require repeated scans of
> the string during the match process. -John]

Thanks for the "patricia trie" term :-)
Wikipedia redirects it to "Radix tree".

This is what I meant with "string tree".

DoDi


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Derek M. Jones  
View profile   Translate to Translated (View Original)
 More options 11 Nov, 12:39
Newsgroups: comp.compilers
From: "Derek M. Jones" <de...@knosof.co.uk>
Date: Wed, 11 Nov 2009 12:39:12 +0000
Local: Wed 11 Nov 2009 12:39
Subject: Re: Switch statement code generation
All,

Working out the best set of algorithms to use for mapping
switch statements to machine code obviously requires
information on switch statement usage in real code.

The latest issue of CVu, the magazine of he ACCU
www.accu.org, has an article by yours truely comparing
the usage of switch statements and nested if-else-if sequences
in C code.

You need to be an ACCU member to access the article (yearly
membership #25 or #35 if you want both magazines).  I will
make a pdf freely available in a month or so.

While I'm talking about stuff I have written, switch statement code
generation gets a mention in this:
shape-of-code.coding-guidelines.com/2009/09/register-vs-stack-based-vms/


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Anton Ertl  
View profile   Translate to Translated (View Original)
 More options 11 Nov, 19:42
Newsgroups: comp.compilers
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Wed, 11 Nov 2009 19:42:15 GMT
Local: Wed 11 Nov 2009 19:42
Subject: Re: Switch statement code generation

Hans-Peter Diettrich <DrDiettri...@aol.com> writes:
>nathan.mccau...@gmail.com schrieb:

>>>> I'm trying to put together a list of various methods to do code generate
>>>> for switch statements. ...

>> Has anyone seen tries used for the string-based switch statements?
...
>[If there's a lot of strings, a patricia trie might be faster than
>hashing or binary search since it doesn't require repeated scans of
>the string during the match process. -John]

It's very doubtful that it's faster than hashing.  In hashing you
traverse the tree once for computing the hash function and once for
the final match.  All other match attempts (and there are typically <1
of those) will typically mismatch at the first character, so no
complete scan is necessary.  You typically have to fetch one cache
line per match attempt (i.e., typically 1-2; if you use external
chaining, add another cache line fetch for the initial table lookup).

In contrast, in the trie method you have to chase a pointer for each
substring you look at; if there are a lot of string, it's probably at
every character for the first few characters.  And for each pointer
you chase, you will have to fetch a cache line.

Ok, if there is little competition for the D-cache, the first few
levels of the trie might end up in the D-cache.  OTOH, if a few words
are looked up often, all the lines necessary for their lookup will be
in the D-cache with a hash table, too.

Overall, I guess you can find cases where the patricia trie is faster
(an application that does little but looking up lots of different,
relatively short strings) but in most cases I think that hashing is
faster, assuming good implementations of each data structure.

For our switch statement, we have to consider another issue: the data
structures can be hard-coded; then at each node of the trie there will
be a conditional or an indirect branch; for the hash table we have an
indirect jump after computing the hash function, and then one or two
conditional branches afterward.  Depending on how repetetive and
predictable the strings coming in are, the branch predictors may
predict very well or very badly (i.e., 50% mispredicts for conditional
branches, 100% for indirect branches).

Anyway, the more branches there are, the worse the effect of bad
predictability will be (hitting the trie harder).  It may be better
not to hard-code the data structure to avoid some of the
mispredictions; in the end, though, you probably will perform one
indirect jump if you don't hard-code the data structure; I think that
hard-coding the hash table will not be worse than not hard-coding it,
but hard-coding the trie can be worse (depending on circumstances).

- anton
--
M. Anton Ertl
an...@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Chris F Clark  
View profile   Translate to Translated (View Original)
 More options 16 Nov, 04:11
Newsgroups: comp.compilers
From: Chris F Clark <c...@shell01.TheWorld.com>
Date: Sun, 15 Nov 2009 23:11:00 -0500
Subject: Re: Switch statement code generation
Hashes versus Tries:

We've done a fair amount of experimentation on hashing verus trie
building in our latest hardware design.  As about 4 characters, hashes
start to win. And, in our case, we can fit either in 1st level cache,
that we can lock against spills, so cache misses aren't a penalty,
which tend to make hashing an even better choice. Now, actual hardware
and instruction set choices move the line a little, but for long
enough strings hashing always wins.

Think of hashing as a way of rearranging the values, so that you can
do the computed goto jump and make 1 n-way decision.  The right hash
minimizes collisions withou introducing too many empty slots.

BTW,long ago, when working at Pr1me computer, the switch statement
optimization we did essentially reinvented a subset of B-trees without
knowing what they were.  We could mix tests and jump tables at each
level as we descended what was essentially a trie of the numeric
values.  We didn't have the vocabulary to describe it so nicely at the
time.

Hope this helps,
-Chris

*************************************************************************** ***
Chris Clark                  email: christopher.f.cl...@compiler-resources.com
Compiler Resources, Inc.  Web Site: http://world.std.com/~compres
23 Bailey Rd                 voice: (508) 435-5016
Berlin, MA  01503 USA      twitter: @intel_chris


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message, you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2009 Google