
Fastcode Computed Targets


2006-01-25 03:31:18 PM
delphi26
Hi
We currently have 3 computed targets: Blended, RTL Replacement and Pascal.
I would like to make two polls like the ones for CPU targets.
I see these candidates:
Pascal RTL Replacement - Pascal - size penalty
Pascal - Pascal = Current Pascal target
RTL Replacement - IA32 - size penalty = Current RTL Replacement target
Blended IA32 - IA32
Blended IA32ext - IA32, IA32ext - size penalty
Blended IA32ext - IA32, IA32ext
Blended MMX - IA32, IA32ext, MMX - size penalty
Blended MMX - IA32, IA32ext, MMX = Current Blended target
Blended SSE - IA32, IA32ext, MMX, SSE - size penalty
Blended SSE - IA32, IA32ext, MMX, SSE
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - size penalty
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2
More suggestions?
Best regards
Dennis Kjaer Christensen
 
 

Re:Fastcode Computed Targets

Dennis writes:
Quote
Blended IA32ext - IA32, IA32ext
Hi Dennis,
a newbie question: what is the difference between IA32 and IA32ext?
Jouni
--
The Fastcode Project: www.fastcodeproject.org/
 

Re:Fastcode Computed Targets

Hi
Quote
a newbie question: what is the difference between IA32 and IA32ext?
Conditional moves etc. All instructions added to IA32 in Pentium and Pentium
Pro.
I think there is a list on our site somewhere.
Regards
Dennis
 

Re:Fastcode Computed Targets

Dennis writes:
Quote
>a newbie question: what is the difference between IA32 and IA32ext?

Conditional moves etc. All instructions added to IA32 in Pentium and Pentium
Pro.

I think there is a list on our site somewhere.

Hi Dennis,
Thanks. A Google search for IA32ext pointed to an old page that doesn't
exist on the new site: dennishomepage.gugs-cats.dk/Conventions.htm
Regards,
Jouni
 

Re:Fastcode Computed Targets

Hi
Quote
Thanks. A Google search for IA32ext pointed to an old page that doesn't
exist on the new site: dennishomepage.gugs-cats.dk/Conventions.htm
That is unfortunate.
Somebody should communicate with Dennis L about all the problems with the
site.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi
Just started a poll about the number of computed targets for 2006.
Later I will make a poll about which targets to use.
1) Pascal RTL Replacement - Pascal - size penalty
2) Pascal - Pascal = Current Pascal target
3) RTL Replacement - IA32 - size penalty = Current RTL Replacement target
4) Blended IA32 - IA32
5) Blended IA32ext - IA32, IA32ext - size penalty
6) Blended IA32ext - IA32, IA32ext
7) Blended MMX - IA32, IA32ext, MMX - size penalty
8) Blended MMX - IA32, IA32ext, MMX = Current Blended target
9) Blended SSE - IA32, IA32ext, MMX, SSE - size penalty
10) Blended SSE - IA32, IA32ext, MMX, SSE
11) Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - size penalty
12) Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2
13) Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3 - size penalty
14) Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi All
Having many targets here will add very little work.
Make one spreadsheet with blended results, include all functions and sort
them. This is Blended as it is now. Then the fastest Pascal function will
win Pascal target, the fastest IA32 function will win IA32 etc.
The same goes for targets with size penalty. Here the additional work
includes finding the size of all functions - not only the IA32 functions.
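The sort-and-pick procedure described above could be sketched roughly as follows. This is only an illustration: the function names and scores are made up, the allowed instruction sets are copied from the candidate list, and it assumes that a lower blended score means a faster function. Size penalties are not modelled, since no penalty constants are given.

```python
# Hypothetical benchmark rows: (function name, instruction set, blended score).
# Assumption: a lower blended score means a faster function.
RESULTS = [
    ("Func_A", "SSE2", 812),
    ("Func_B", "IA32", 845),
    ("Func_C", "Pascal", 990),
    ("Func_D", "MMX", 830),
    ("Func_E", "Pascal", 1010),
]

# Instruction sets each computed target allows, as given in the candidate list.
TARGETS = {
    "Pascal": {"Pascal"},
    "Blended IA32": {"IA32"},
    "Blended MMX": {"IA32", "IA32ext", "MMX"},
    "Blended SSE2": {"IA32", "IA32ext", "MMX", "SSE", "SSE2"},
}


def pick_winners(results, targets):
    """Sort all functions once, then take the fastest eligible one per target."""
    ranked = sorted(results, key=lambda row: row[2])
    winners = {}
    for target, allowed in targets.items():
        # First row in the sorted sheet whose instruction set the target allows.
        winners[target] = next(
            (name for name, iset, _ in ranked if iset in allowed), None
        )
    return winners


WINNERS = pick_winners(RESULTS, TARGETS)
for target, name in WINNERS.items():
    print(f"{target}: {name}")
```

Because the sheet is sorted only once, adding another target costs one more entry in TARGETS, which is the point about many targets adding very little work.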
I voted for 8 computed targets.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi Community
8 computed targets is in the lead, but far too few votes are in.
A new poll has been created about which computed targets should be used.
Please vote for how many computed targets you want and which ones you want.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi
Feel free to tell us why you voted as you did.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Dennis writes:
Quote

Feel free to tell us why you voted as you did.

Hi,
I voted for 6 and chose the current targets + Blended SSE +
Blended SSE2. If you look at the current (top) functions,
SSE3 isn't used much. SSE and SSE2 are quite common in current
hardware, so why not add those to the new computed targets?
Regards,
Jouni
 

Re:Fastcode Computed Targets

Hi
Good.
What is the general opinion about size penalties?
I think it is OK to have some targets that also optimize for size, but we
have a problem getting the actual penalties defined. In old challenges I
begged for people's opinions about the size penalties for the RTL Replacement
target, and nobody helped me select proper constants.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi Community
Status
No of computed targets = 8.
Included
RTL Replacement - IA32 - size penalty = Current RTL Replacement target - 4 votes
Pascal RTL Replacement - Pascal - size penalty - 3 votes
Pascal - Pascal = Current Pascal target - 3 votes
Blended MMX - IA32, IA32ext, MMX = Current Blended target - 3 votes
Blended SSE - IA32, IA32ext, MMX, SSE - size penalty - 3 votes
Blended SSE - IA32, IA32ext, MMX, SSE - 3 votes
Undecided
Blended IA32 - IA32 - 2 votes
Blended MMX - IA32, IA32ext, MMX - size penalty - 2 votes
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - size penalty - 2 votes
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - 2 votes
Excluded
Blended IA32ext - IA32, IA32ext - size penalty - 1 vote
Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3 - size penalty - 1 vote
Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3 - 1 vote
Blended IA32ext - IA32, IA32ext - 0 votes
Many more votes needed.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi
We have only 30 votes in from 4 people. Each person has 8 votes, so somebody
forgot to use 2 of them.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi Community
Status
No of computed targets = 8.
Included
RTL Replacement - IA32 - size penalty = Current RTL Replacement target - 5 votes
Pascal RTL Replacement - Pascal - size penalty - 5 votes
Pascal - Pascal = Current Pascal target - 4 votes
Blended MMX - IA32, IA32ext, MMX = Current Blended target - 4 votes
Blended SSE - IA32, IA32ext, MMX, SSE - 4 votes
Blended SSE - IA32, IA32ext, MMX, SSE - size penalty - 3 votes
Blended IA32 - IA32 - 3 votes
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - 3 votes
Excluded
Blended MMX - IA32, IA32ext, MMX - size penalty - 2 votes
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - size penalty - 2 votes
Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3 - 2 votes
Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3 - size penalty - 1 vote
Blended IA32ext - IA32, IA32ext - 0 votes
Blended IA32ext - IA32, IA32ext - size penalty - 0 votes
Many more votes needed.
Remember that each person has 8 votes.
Best regards
Dennis Kjaer Christensen
 

Re:Fastcode Computed Targets

Hi Community
Status
No of computed targets = 8.
Included
RTL Replacement - IA32 - size penalty = Current RTL Replacement target - 7 votes
Pascal RTL Replacement - Pascal - size penalty - 7 votes
Pascal - Pascal = Current Pascal target - 6 votes
Blended MMX - IA32, IA32ext, MMX = Current Blended target - 6 votes
Blended SSE - IA32, IA32ext, MMX, SSE - 6 votes
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - 5 votes
Blended IA32 - IA32 - 5 votes
Undecided
Blended SSE2 - IA32, IA32ext, MMX, SSE, SSE2 - size penalty - 4 votes
Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3 - 4 votes
Blended SSE - IA32, IA32ext, MMX, SSE - size penalty - 4 votes
Excluded
Blended MMX - IA32, IA32ext, MMX - size penalty - 3 votes
Blended SSE3 - IA32, IA32ext, MMX, SSE, SSE2, SSE3 - size penalty - 2 votes
Blended IA32ext - IA32, IA32ext - 1 vote
Blended IA32ext - IA32, IA32ext - size penalty - 1 vote
More votes needed.
Best regards
Dennis Kjaer Christensen
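
The Included / Undecided / Excluded split in the status posts looks like a top-8 cut in which a tie across the cut-off leaves the tied candidates undecided. That rule is inferred from the three status lists and is not stated anywhere in the thread. A small sketch under that assumption, using the vote counts from the latest status (the shortened target names and the `classify` helper are hypothetical):

```python
# Vote counts copied from the latest status post; names are shortened.
VOTES = {
    "RTL Replacement (size penalty)": 7,
    "Pascal RTL Replacement (size penalty)": 7,
    "Pascal": 6,
    "Blended MMX": 6,
    "Blended SSE": 6,
    "Blended SSE2": 5,
    "Blended IA32": 5,
    "Blended SSE2 (size penalty)": 4,
    "Blended SSE3": 4,
    "Blended SSE (size penalty)": 4,
    "Blended MMX (size penalty)": 3,
    "Blended SSE3 (size penalty)": 2,
    "Blended IA32ext": 1,
    "Blended IA32ext (size penalty)": 1,
}


def classify(votes, slots):
    """Split candidates into included / undecided / excluded for `slots` places.

    Assumed rule: candidates tied on votes across the cut-off stay undecided.
    """
    ranked = sorted(votes.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) <= slots:
        return [name for name, _ in ranked], [], []
    cutoff = ranked[slots - 1][1]  # votes of the last candidate inside the cut
    tie = cutoff == ranked[slots][1]  # first candidate outside has the same count
    included, undecided, excluded = [], [], []
    for name, count in ranked:
        if count > cutoff or (count == cutoff and not tie):
            included.append(name)
        elif count == cutoff:
            undecided.append(name)
        else:
            excluded.append(name)
    return included, undecided, excluded


INCLUDED, UNDECIDED, EXCLUDED = classify(VOTES, slots=8)
print("Included:", INCLUDED)
print("Undecided:", UNDECIDED)
print("Excluded:", EXCLUDED)
```

With these counts, seven targets are clearly in and the three 4-vote targets compete for the last place, which matches the status list above.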