View previous topic :: View next topic |
Author |
Message |
evsource
Joined: 21 Nov 2006 Posts: 129
|
Calling all efficient coders! Fast divide equation needed! |
Posted: Tue Mar 05, 2013 6:40 pm |
|
|
Hi all,
I have an equation that I'm trying to execute in under 50us. I'm beginning to doubt it's possible. If I can't get in under that, I can still benefit from the fastest implementation possible. Hopefully a discussion about this will help others in the future to see ways they can solve their speed problems. I've already tried to employ several things discussed in other posts on this forum, e.g. its all fixed point math.
The chip being used is the 18F2321. In the actual program this code will run in, it's currently at 17% RAM and 22% ROM as reported by PCH 4.119.
So here's the equation:
current = (((636 * ADC_reading) - 325632) / drive ) + 512
The best I've been able to do do is about 126us as reported by the MPLAB stopwatch. Here's how I did it:
Code: | #include <18f2321.h>
#device ADC=10 // use 10 bit ADC reading
#fuses H4,WDT256,NOWDT,NOMCLR,NOBROWNOUT,NOLVP,NOPUT
#use delay(clock=40000000)
void main() {
int16 drive; // ranges between 0 and 636
int16 ADC_reading; // ranges between 127 and 512
signed int16 current; // ranges between 127 and 512, stored as signed because used later with the sign required
signed int32 temp_current;
drive = 318; // example value used
ADC_reading = 474; // example value used
temp_current = (signed int32)ADC_reading * 636; // int32 multiply, 22us
temp_current = temp_current - 325632;
current = (signed int16)(temp_current / drive); // int32 divide, 106us
current = current + 512; // should result in 436 with the example values used for drive and ADC_reading
} |
I realize it's the int32 multiply and divide that are the big offenders. I can't see many ways around them. One trick I thought of that could get it down by the 10's of microseconds is a lookup table for the int32 multiply (first line of the computations). It would require an array of 385 int32's which won't fit in RAM (maybe in ROM though?). I'd rather leave as much free space as possible for future features (why I gave that much headroom to begin with). But as long as it will fit on the chip given the current utilization, that's fine. I can eventually move to a chip with more memory if needed. But for this discussion, it needs to be the current chip.
[EDIT] Some might have seen the "contest" I was trying to make of this. I see now that's discouraged. Original offer is still on to anyone that wants that saw it, but to keep peace with the moderator, I pulled it. I hope you will all still help me here!
Last edited by evsource on Tue Mar 05, 2013 10:35 pm; edited 1 time in total |
|
|
evsource
Joined: 21 Nov 2006 Posts: 129
|
Re: Calling all efficient coders! Contest for speedy equatio |
Posted: Tue Mar 05, 2013 6:56 pm |
|
|
evsource wrote: | I'm thinking it's quite likely I'm missing something that could substantially reduce the time. |
Well, I just found a way to knock off 13us. So I'm feeling a little more confident that there are other ways to get this time down! Good luck everyone! (I'll share what I did later if someone else doesn't mention it).
++++++++++++++++++
Ask for help. Do not offer payment in forum.
Forum rule #8
8. No spamming or solicitation
- Forum Moderator
++++++++++++++++++ |
|
|
newguy
Joined: 24 Jun 2004 Posts: 1909
|
|
Posted: Tue Mar 05, 2013 9:37 pm |
|
|
First: I don't want any $.
One thing I did on a commercial project (very time sensitive A/D and computation loop) was to break a divide down into a sum of divide-by-2's.
For example, /3 can be approximated by: numerator divided by 4 (>> 2), plus numerator divided by 16 (>> 4), plus (3 * numerator) divided by 128 (>> 7). This gives an overall factor of 0.3359375 (close to 0.33333).
Play your cards right and you can set things up to take advantage of simple right shifts and sums. Very speedy. |
|
|
evsource
Joined: 21 Nov 2006 Posts: 129
|
Re: Calling all efficient coders! Contest for speedy equatio |
Posted: Tue Mar 05, 2013 10:44 pm |
|
|
evsource wrote: | evsource wrote: | I'm thinking it's quite likely I'm missing something that could substantially reduce the time. |
Well, I just found a way to knock off 13us.
|
How I reduced the time by 13us was with the multiply operation:
Code: | //temp_current = (signed int32)ADC_reading * 636; // int32 multiply, 22us
temp_current = ((signed int32)ADC_reading << 9) + ((signed int32) ADC_reading << 7) - 1908; |
Quote: |
++++++++++++++++++
Ask for help. Do not offer payment in forum.
Forum rule #8
8. No spamming or solicitation
- Forum Moderator
++++++++++++++++++ |
You might elaborate on "solicitation" in the forum rules as you did here. I did actually look through the rules to make sure I wasn't doing something inappropriate. Solicitation and offering rewards don't seem to fall so obviously into the same definition. |
|
|
evsource
Joined: 21 Nov 2006 Posts: 129
|
|
Posted: Tue Mar 05, 2013 10:57 pm |
|
|
newguy wrote: |
One thing I did on a commercial project (very time sensitive A/D and computation loop) was to break a divide down into a sum of divide-by-2's.
|
Great suggestion, thank you.
What I'm failing to see is how this can be applied when the denominator isn't a known, fixed value (as it is in my particular case). |
|
|
newguy
Joined: 24 Jun 2004 Posts: 1909
|
|
Posted: Wed Mar 06, 2013 12:29 am |
|
|
Whoops, sorry. Skimmed your post and I mentally swapped the position of "drive" and the constant. |
|
|
twidget
Joined: 21 Feb 2013 Posts: 32 Location: Orange County, California
|
|
Posted: Wed Mar 06, 2013 2:57 am |
|
|
Can I ask a silly question? Why is it important to get your calculations down under 50us? Its not very often that such a fast speed has any real effect on a system, so I am curious why this is a problem.
also, wouldn't programming the calculations in assembly ensure your code is as fast as possible? |
|
|
RF_Developer
Joined: 07 Feb 2011 Posts: 839
|
Re: Calling all efficient coders! Fast divide equation neede |
Posted: Wed Mar 06, 2013 3:14 am |
|
|
evsource wrote: |
current = (((636 * ADC_reading) - 325632) / drive ) + 512
|
As always, optimisation is better attempted at algorithm level. So I looked at the equation to see where optimisations could be made. The equation simplifies to:
current = (636/drive) * (ADC_reading - 512) + 512
As drive varies from 0 to 636 then (636/drive) varies from 1 to infinity (it actually jumps from 636 with drive = 1 to infinity at drive = 0). Infinities worry me - a lot - and they don't compute well to say the least. It suggests something is wrong. Maybe its just the range of drive.
At this point I though it best to get back to you and check what you are doing is right.
RF Developer |
|
|
asmboy
Joined: 20 Nov 2007 Posts: 2128 Location: albany ny
|
|
Posted: Thu Mar 07, 2013 9:35 am |
|
|
is this a joke ???
1) first the OBVIOUS risk of divide by zero in drive's lower bound.
2) this term:
(((636 * ADC_reading) - 325632)
which by your own admission, if the limits of ADC_reading are true,
makes no sense.
by itself it still only commutes/evaluates to --> (adc_reading-512)*636
and what the reason is for that 636 integer is not clear at all.
I can see what YOU want to do by way of a signed pivot point offset ,
BUT
the ULTIMATE resolution of the result is still a meager 1 part in 385 and can't be improved by any math you throw at it. and with typical +/-1 bit error this is barely a 1% accuracy/confidence reading
scaling up into 32bits does not improve the low information resolution of the adc reading itself. |
|
|
|