I've just done a little benchmarking of the ray tracer written in Scheme and compiled using Stalin. Here are the results. On x86 (900MHz Athlon T-bird):
$ g++ -O3 -march=athlon-tbird -ffast-math ray.cpp -o ray
$ time ./ray 6 160 >image.pgm
real 0m2.152s

$ mlton ray.sml
$ time ./ray 6 160 >image.pgm
real 0m2.435s

$ ocamlopt -inline 100 -ffast-math ray.ml -o ray
$ time ./ray 6 160 >image.pgm
real 0m3.255s

$ stalin -d0 -d1 -d5 -d6 -On -q -d -architecture IA32-align-double -no-clone-size-limit -split-even-if-no-widening -copt -O2 -copt -fomit-frame-pointer -copt -malign-double ray
$ time ./ray 6 160 >image.pgm
real 0m3.712s
On AMD64 (1.8GHz Athlon64):
$ g++ -O3 -march=athlon-tbird -ffast-math ray.cpp -o ray
$ time ./ray 6 160 >image.pgm
real 0m0.987s

$ mlton ray.sml
$ time ./ray 6 160 >image.pgm
real 0m1.056s

$ ocamlopt -inline 100 -ffast-math ray.ml -o ray
$ time ./ray 6 160 >image.pgm
real 0m1.037s

$ stalin -d0 -d1 -d5 -d6 -On -q -d -architecture IA32-align-double -no-clone-size-limit -split-even-if-no-widening -copt -O2 -copt -fomit-frame-pointer -copt -malign-double ray
$ time ./ray 6 160 >image.pgm
real 0m1.773s
I hadn't expected a simple Lisp or Scheme implementation to be able to compete in terms of performance but only 70% slower on 32-bit AMD64 when C++ and OCaml are fully 64-bit is very impressive, IMHO.
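To put the "70% slower" figure in context, here is a minimal sketch that turns the wall-clock times above into relative slowdowns for Stalin. The timings are copied from the "real" lines in this post; the dictionary layout and the helper name slowdown_percent are only illustrative choices, not part of any of the benchmarked programs.

```python
# Wall-clock times in seconds, copied from the "real" lines quoted above.
TIMES = {
    "x86 (900MHz Athlon T-bird)": {
        "g++": 2.152, "mlton": 2.435, "ocamlopt": 3.255, "stalin": 3.712,
    },
    "AMD64 (1.8GHz Athlon64)": {
        "g++": 0.987, "mlton": 1.056, "ocamlopt": 1.037, "stalin": 1.773,
    },
}


def slowdown_percent(t, baseline):
    """Return how much slower t is than baseline, as a percentage."""
    return (t / baseline - 1.0) * 100.0


def report(times):
    """Build one line per (platform, compiler) comparing Stalin against it."""
    lines = []
    for platform, results in times.items():
        for compiler in ("g++", "mlton", "ocamlopt"):
            pct = slowdown_percent(results["stalin"], results[compiler])
            lines.append(
                f"{platform}: Stalin is {pct:.0f}% slower than {compiler}"
            )
    return lines


if __name__ == "__main__":
    print("\n".join(report(TIMES)))
```

On these numbers the AMD64 gap comes out at very roughly 70 to 80 percent depending on which compiler is taken as the baseline, while on the x86 machine Stalin is much closer to ocamlopt than to g++. These are single runs, so small differences should not be read too closely.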
Jon Harrop wrote:
> I hadn't expected a simple Lisp or Scheme implementation to be able to
> compete in terms of performance but only 70% slower on 32-bit AMD64 when
> C++ and OCaml are fully 64-bit is very impressive, IMHO.
There are things I don't like about Stalin (related mostly to its "static" nature and skimpy error handling), but raw speed is one thing it's terribly good at. A *LOT* of work has gone into the Stalin compiler to make it compile code that runs fast. It is not an example of a "simple" implementation.
Ray Dillinger wrote:
> There are things I don't like about Stalin (related mostly to its
> "static" nature and skimpy error handling), but raw speed is one
> thing it's terribly good at.
As a Lisp/Scheme virgin, I'm finding Stalin much easier going than the other compilers I've tried (Bigloo, SBCL, CMUCL). Bigloo in particular has the worst error reporting I've ever seen...
> A *LOT* of work has gone into the
> Stalin compiler to make it compile code that runs fast. It is not
> an example of a "simple" implementation.

> I hadn't expected a simple Lisp or Scheme implementation to be able to
> compete in terms of performance but only 70% slower on 32-bit AMD64 when
> C++ and OCaml are fully 64-bit is very impressive, IMHO.
Stalin is a very sophisticated implementation that uses whole-program compilation algorithms. It's very much like MLton. Chicken Scheme is what I'd call a "simple" implementation.
> I hadn't expected a simple Lisp or Scheme implementation to be able to
> compete in terms of performance but only 70% slower on 32-bit AMD64 when
> C++ and OCaml are fully 64-bit is very impressive, IMHO.
It's a shame there's so much difference. It would be interesting to see what results CMUCL or SBCL give. Have you posted a request to comp.lang.lisp?
Anyway, perhaps you could try the following options for stalin? I got better results with these:
Jon Harrop wrote:
> I hadn't expected a simple Lisp or Scheme implementation to be able to
> compete in terms of performance but only 70% slower on 32-bit AMD64 when
> C++ and OCaml are fully 64-bit is very impressive, IMHO.
It is also worth noting the Stalin code has no type declarations.
Jens Axel Søgaard wrote:
> Jon Harrop wrote:
>> I hadn't expected a simple Lisp or Scheme implementation to be able to
>> compete in terms of performance but only 70% slower on 32-bit AMD64 when
>> C++ and OCaml are fully 64-bit is very impressive, IMHO.
> It is also worth noting the Stalin code has no type declarations.
Well, it uses "vec" and "list" but I'm not sure what you'd call a type declaration.
Daniel C. Wang wrote:
> {stuff deleted}
>> I hadn't expected a simple Lisp or Scheme implementation to be able to
>> compete in terms of performance but only 70% slower on 32-bit AMD64 when
>> C++ and OCaml are fully 64-bit is very impressive, IMHO.

> Stalin is a very sophisticated implementation that uses whole program
> compilation algorithms. It's very much like MLton. Chicken Scheme is
> what I'd call a "simple" implementation.
I just remembered what I meant when I wrote that: "I hadn't expected a simple implementation of the ray tracer written in Lisp or Scheme to be able to compete in terms of performance...".
In other words, I was expecting fast Lisp/Scheme implementations of the ray tracer to be obfuscated compared to those in the other languages, but that doesn't seem to be the case.
Jon Harrop wrote:
> I've just done a little benchmarking of the ray tracer written in Scheme and
> compiled using Stalin. Here are the results. On x86 (900MHz Athlon T-bird):
Which Scheme version did you use? Is it the same as the one you used under Bigloo (except for the +fl, etc. operators)?
> > I hadn't expected a simple Lisp or Scheme implementation to be able to
> > compete in terms of performance but only 70% slower on 32-bit AMD64 when
> > C++ and OCaml are fully 64-bit is very impressive, IMHO.
> It is also worth noting the Stalin code has no type declarations.
There is one issue which hit me: Stalin takes a rather long time to compile. Maybe this has changed in the meantime. I once used Stalin on my old laptop under SuSE Linux 8.0, and every program took at least 33 seconds to compile. That was rather tedious, at least for smaller programs.