viki2000
Joined: 08 May 2013 Posts: 233
|
Operations efficiency – Memory vs. CPU speed |
Posted: Sun Jun 04, 2017 5:04 pm |
|
|
I use a PIC24HJ64GP202, or more generally a 16-bit PIC24 MCU.
This is my working code:
Code: |
void main()
{
   float x;
   signed int16 y;
   while(TRUE)
   {
      for(x=0; x<2*PI; x+=PI/32768){
         y = 32767*sin(x);
         putc(make8(y,1)); //MSB
         putc(make8(y,0)); //LSB
      }
   }
} |
Inside the for loop I have some arithmetic operations.
“PI” is a constant (3.14...) known to the CCS C compiler, but I do not know with how many decimals CCS specifies it.
Here are my questions:
1) Can someone tell me?
2) Every time the “for loop” is passed, does the CPU perform the multiplication “2*PI” and the division “PI/32768”?
3) Is there a better way to do it?
4) If I allocate space in memory for constants with fixed values, such as “CONST1=2*PI” and “CONST2=PI/32768”, is the execution of the program then faster?
Code: |
const float CONST1 = 2*PI;     //upper limit of the sweep
const float CONST2 = PI/32768; //step size
void main()
{
   float x;
   signed int16 y;
   while(TRUE)
   {
      for(x=0; x<CONST1; x+=CONST2){
         y = 32767*sin(x);
         putc(make8(y,1)); //MSB
         putc(make8(y,0)); //LSB
      }
   }
} |
The difference between allocating constants and doing the operations in the “for loop” is, I guess, the following:
- Using constants needs more fixed space in ROM, but the CPU executes the “for loop” faster.
- Not using constants saves ROM, but the CPU is slower, because it has to execute the arithmetic operations every time the loop is passed.
5) Is the above a good understanding of what happens?
6) Do the arithmetic operations in the “for loop” use temporary variables? In RAM?
7) What is the best way to write the code? What can be considered more efficient? |
|
|
PCM programmer
Joined: 06 Sep 2003 Posts: 21708
|
|
Posted: Sun Jun 04, 2017 5:43 pm |
|
|
1. It's in math.h. It's right there:
Code: | #define PI 3.1415926535897932 |
Last edited by PCM programmer on Sun Jun 04, 2017 5:49 pm; edited 1 time in total |
|
|
viki2000
Joined: 08 May 2013 Posts: 233
|
|
Posted: Sun Jun 04, 2017 5:48 pm |
|
|
Thank you. I did not check that.
What about the other questions? Any suggestions? |
|
|
PCM programmer
Joined: 06 Sep 2003 Posts: 21708
|
|
Posted: Sun Jun 04, 2017 7:01 pm |
|
|
First make a short section of code so you can see what the pre-calculated
values are:
Code: | .................... x = PI;
00032: MOVLW DB
00034: MOVWF x+3
00036: MOVLW 0F
00038: MOVWF x+2
0003A: MOVLW 49
0003C: MOVWF x+1
0003E: MOVLW 80
00040: MOVWF x
.................... x = 2*PI;
00042: MOVLW DB
00044: MOVWF x+3
00046: MOVLW 0F
00048: MOVWF x+2
0004A: MOVLW 49
0004C: MOVWF x+1
0004E: MOVLW 81
00050: MOVWF x
.................... x = PI/32768;
00052: MOVLW DB
00054: MOVWF x+3
00056: MOVLW 0F
00058: MOVWF x+2
0005A: MOVLW 49
0005C: MOVWF x+1
0005E: MOVLW 71
00060: MOVWF x |
Then look in the for() loop code.
The pre-calculated 2*PI value is used in the for() loop below:
Quote: |
.................... for(x=0; x<2*PI; x+=PI/32768){
00826: CLRF x+3
00828: CLRF x+2
0082A: CLRF x+1
0082C: CLRF x
0082E: MOVFF x+3,??65535+3
00832: MOVFF x+2,??65535+2
00836: MOVFF x+1,??65535+1
0083A: MOVFF x,??65535
0083E: MOVLW DB
00840: MOVWF @FLT.P1+3
00842: MOVLW 0F
00844: MOVWF @FLT.P1+2
00846: MOVLW 49
00848: MOVWF @FLT.P1+1
0084A: MOVLW 81
0084C: MOVWF @FLT.P1
0084E: MOVLB 0
00850: CALL @FLT
00854: BNC 08EC |
The pre-calculated value of PI/32768 is added to x, as shown
in the code at the end of the for() loop:
Quote: |
008C4: MOVLW DB
008C6: MOVWF @ADDFF.P1+3
008C8: MOVLW 0F
008CA: MOVWF @ADDFF.P1+2
008CC: MOVLW 49
008CE: MOVWF @ADDFF.P1+1
008D0: MOVLW 71
008D2: MOVWF @ADDFF.P1
008D4: CALL @ADDFF
008D8: MOVFF 03,x+3
008DC: MOVFF 02,x+2
008E0: MOVFF 01,x+1
008E4: MOVFF 00,x
008E8: MOVLB F
008EA: BRA 082E |
So, yes it is using numbers that are calculated at compile-time, and
not at run-time. |
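The byte values in the listing above can be checked by hand. The sketch below is plain C for a PC, not CCS code. It assumes the four bytes follow the 32-bit Microchip float layout (an exponent byte with bias 127 at the lowest address, then the sign bit and 23 mantissa bits with an implicit leading 1). That layout and the helper name `decode_mchp_float` are assumptions for illustration and are not taken from the thread.

```c
#include <stdint.h>

/* Decode four bytes stored at x, x+1, x+2, x+3 (assumed layout):
 *   b0 = exponent, bias 127
 *   b1 = sign bit (MSB) + top 7 mantissa bits
 *   b2, b3 = remaining 16 mantissa bits */
double decode_mchp_float(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    int sign, e;
    uint32_t mant;
    double v;

    if (b0 == 0)
        return 0.0;                      /* zero exponent encodes 0.0 */

    sign = (b1 & 0x80) ? -1 : 1;
    mant = ((uint32_t)(b1 & 0x7F) << 16) | ((uint32_t)b2 << 8) | b3;
    v = 1.0 + (double)mant / 8388608.0;  /* implicit leading 1, 2^23 */

    /* apply 2^(exponent - 127) without needing the math library */
    e = (int)b0 - 127;
    while (e > 0) { v *= 2.0; e--; }
    while (e < 0) { v /= 2.0; e++; }
    return sign * v;
}
```

With the bytes from the listing, `decode_mchp_float(0x80, 0x49, 0x0F, 0xDB)` gives about 3.14159. Changing only the first byte to 0x81 or 0x71 gives about 6.28319 (2*PI) and about 9.5874e-5 (PI/32768). Only the exponent byte differs because both operations scale PI by a power of two.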
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Mon Jun 05, 2017 12:46 am |
|
|
Several 'missing points' here:
First, you worry about the accuracy of PI, while you are doing the maths in float (not double).
Then your output values are only 15 bits (ignoring the sign), so the accuracy of PI doesn't matter at all...
Then why waste time calculating for all four quadrants? Remember that you only need the sin values for one quadrant.
Just use a simple integer CORDIC implementation:
<http://www.dcs.gla.ac.uk/~jhw/cordic/cordic-16bit.h>
It is worth understanding 'why' they use 16384, rather than 32767, as the 'unit' value for the integer maths. It makes a lot of other things much simpler, so it should be considered. However, the basic algorithm can easily be ported to your 32767 value if required. |
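The one-quadrant point can be sketched with a very small lookup table. The code below is an illustration in standard C, not code from the thread: the 9-entry table, the 32 steps per cycle and the name `sine_from_quarter` are all assumptions chosen to keep the example short. A real application would use a larger table or the CORDIC routine for the first quadrant. Values are scaled so that 16384 represents 1.0.

```c
#include <stdint.h>

/* sin() for 0..90 degrees in 8 equal steps, scaled so 16384 = 1.0 */
static const int16_t quarter_sine[9] = {
    0, 3196, 6270, 9102, 11585, 13623, 15137, 16069, 16384
};

/* phase: 0..31 covers one full cycle (8 steps per quadrant) */
int16_t sine_from_quarter(uint8_t phase)
{
    uint8_t quadrant = (phase >> 3) & 3;
    uint8_t idx = phase & 7;
    int16_t v;

    /* quadrants 1 and 3 read the table backwards */
    if (quadrant & 1)
        v = quarter_sine[8 - idx];
    else
        v = quarter_sine[idx];

    /* quadrants 2 and 3 are the negative half of the wave */
    return (quadrant & 2) ? (int16_t)-v : v;
}
```

The extra work per sample is a shift, two masks and at most one negation, all on integers, so the fold costs far less than evaluating sin() again for the other three quadrants.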
|
|
viki2000
Joined: 08 May 2013 Posts: 233
|
|
Posted: Mon Jun 05, 2017 1:58 am |
|
|
I was wondering about the accuracy of PI only because I did not think to look for its definition as a constant in “math.h”, but I see now that it has 17 significant digits. The question came to mind because on page 191 of the PCD manual there is an example where PI is written directly, not as a constant, as 3.141596 only:
https://www.ccsinfo.com/downloads/PCDReferenceManual.pdf
Now, understanding that the code uses numbers calculated at compile time and not at run time, it makes no difference to the MCU whether I define these numbers as constants at the beginning. It probably only makes the code easier to follow, with no other improvement.
I used the sine calculation for all quadrants because it seems much easier to sweep the range 0 to 2*PI in the for() loop than to use only one quadrant, from 0 to PI/2, and then add extra code with other arithmetic and decisions.
In what way do you think using only one quadrant will improve the code? I do not think it will be shorter, but would it be faster?
Thank you for the 16-bit CORDIC algorithm. I will try to study it and then implement it in a fixed-point math.h library that will be included in the main project. |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Mon Jun 05, 2017 3:43 am |
|
|
It already is a fixed point library....
It's just they use 16384, as 1.00
There are significant reasons for this, and you should consider these for your application. |
|
|
viki2000
Joined: 08 May 2013 Posts: 233
|
|
Posted: Tue Jun 06, 2017 1:04 am |
|
|
What is your experience with "Fixed point decimal" mentioned here?
https://www.ccsinfo.com/content.php?page=compiler-features#fixed
It says
Quote: | "Fixed point decimal gives you decimal representation, but at integer speed. This gives you a phenomenal speed boost over using float" |
Does it make sense to try to include it in the trials above? |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Tue Jun 06, 2017 1:47 am |
|
|
Fixed point decimal is what the %w printf format displays in CCS.
You treat an integer as if (for example) it is integer 'thousandths'. So 1000 becomes '1.000', etc.
The big advantages are for things like simple arithmetic for currency, where it is unacceptable that (for instance) 200000.04 + 200000.03 = 400000.06, which is what floating point will give. If instead you use int32 and work directly in 'integer cents', then the standard +, - etc. will all work as normal, with the speed advantages of integer arithmetic, and the %w specifier will allow you to print the values as if they were 'floating'.
However, the downsides of this type of format arrive when you are not working for 'human' consumption. If (for instance) you multiply two values together, since each is 100* the actual value, the result has to be divided by 100 after the multiplication. Since 100 is not an easy binary division, this brings a time cost (not as much as float, but significant). This is why _binary_ fixed point is far preferred for anything involving multiplication or division. Here you code the numbers so that there are a number of binary digits in front of and behind the binary point. This is why the int16 sine example I pointed you to uses 16384 as '1'. They are using 14 binary digits after the binary point, and one in front, plus the sign. This means the divisions and multiplications needed to keep the number correctly scaled can be done by simple rotations (shifts), which are far easier for the processor to do. |
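The difference between the two rescaling steps can be shown side by side. This is a minimal sketch in standard C, assuming 14 fractional bits for the binary format and hundredths for the decimal format. The names `q14_mul` and `cents_mul` are hypothetical. The shift of a negative product relies on the compiler using an arithmetic right shift, which is common but implementation-defined in C.

```c
#include <stdint.h>

#define Q14_ONE 16384   /* 1.0 with 14 fractional bits */

/* Binary fixed point: the product carries 28 fractional bits,
 * so a shift by 14 brings it back to the working scale. */
int16_t q14_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (int16_t)(p >> 14);
}

/* Decimal fixed point in hundredths (100 = 1.00): the product is
 * scaled by 100*100, so a true division by 100 is needed.
 * A 64-bit intermediate is used here only to avoid overflow. */
int32_t cents_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) / 100);
}
```

For example, `q14_mul(8192, 8192)` (0.5 * 0.5) returns 4096, which is 0.25, and `cents_mul(150, 250)` (1.50 * 2.50) returns 375, which is 3.75. Both give the right answer; the difference is that the first needs only a shift while the second needs a division.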
|
|
viki2000
Joined: 08 May 2013 Posts: 233
|
|
Posted: Fri Jun 09, 2017 5:46 am |
|
|
In the given test example from here:
http://www.dcs.gla.ac.uk/~jhw/cordic/cordic-test.c
Code: |
#include "cordic-32bit.h"
#include <math.h> // for testing only!
#include <stdio.h> // for printf
//Print out sin(x) vs fp CORDIC sin(x)
int main(int argc, char **argv)
{
   double p;
   int s,c;
   int i;
   for(i=0;i<50;i++)
   {
      p = (i/50.0)*M_PI/2;
      //use 32 iterations
      cordic((p*MUL), &s, &c, 32);
      //these values should be nearly equal
      printf("%f : %f\n", s/MUL, sin(p));
   }
} |
What is M_PI?
As far as I understand, in CORDIC the value of the angle is specified in degrees, not radians.
The example above has 50 steps in the for() loop for "i" and refers to PI/2, probably to test one quadrant.
Is M_PI/2 then 90°, or just the number 90, or what exactly?
Where does the M in front of PI come from? |
|
|
temtronic
Joined: 01 Jul 2010 Posts: 9269 Location: Greensville,Ontario
|
|
Posted: Fri Jun 09, 2017 6:34 am |
|
|
My 'guess' is that M_PI is a variable that is 'created' in the
#include "cordic-32bit.h"
file that is at the top of the program.
OR
it could be in the math.h file....
Either way, if the program compiles it HAS to be in one or the other!
Jay
edit> I had a quick F3 of math.h and didn't find M_PI, so it has to be in cordic..... |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Fri Jun 09, 2017 7:17 am |
|
|
M_PI is their define for PI.
#define M_PI 3.1415926535897932384626
#define K1 0.6072529350088812561694
Honestly, unless you have some need for great accuracy, keep to the 16-bit implementation. 32-bit is not merely twice as slow as 16-bit: an int32 multiply takes more than 10* the time needed for an int16 multiply, while a division takes more than 20* the time.
For any simple thing like sinusoidal synthesis, 16-bit is more than accurate enough. The odds are that any measurement based upon, for instance, an ADC will only have perhaps 3 digits of actual resolution, and wasting time doing maths much beyond this accuracy is fundamentally pointless. |
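For orientation, here is what a 16-bit CORDIC rotation looks like. This is an independent sketch in standard C, not the code from the linked header. The table holds atan(2^-k) in radians times 16384, rounded, and the names `cordic_sincos`, `CORDIC_GAIN` and `cordic_atan` are hypothetical. As in the test program quoted earlier, where the argument is `p*MUL` with `p` in radians, the angle here is in radians scaled by the unit value, so PI/2 is passed as about 25736. The sketch assumes an arithmetic right shift for negative values.

```c
#include <stdint.h>

#define CORDIC_ONE   16384   /* 1.0 in this fixed-point format */
#define CORDIC_GAIN  9949    /* 0.60725 * 16384, cancels the CORDIC growth */
#define CORDIC_STEPS 14

/* atan(2^-k) in radians, times 16384, rounded to nearest */
static const int16_t cordic_atan[CORDIC_STEPS] = {
    12868, 7596, 4014, 2037, 1023, 512, 256, 128, 64, 32, 16, 8, 4, 2
};

/* theta: angle in radians * 16384, intended for -pi/2..+pi/2.
 * Results are sin and cos scaled so that 16384 = 1.0. */
void cordic_sincos(int16_t theta, int16_t *s, int16_t *c)
{
    int16_t x = CORDIC_GAIN, y = 0, z = theta;
    int k;

    for (k = 0; k < CORDIC_STEPS; k++) {
        /* both deltas use the values from before this step */
        int16_t dx = (int16_t)(y >> k);
        int16_t dy = (int16_t)(x >> k);
        if (z >= 0) { x -= dx; y += dy; z -= cordic_atan[k]; }
        else        { x += dx; y -= dy; z += cordic_atan[k]; }
    }
    *c = x;
    *s = y;
}
```

Every step uses only shifts, additions and a table read, which is why the 16-bit form suits a PIC24. Expect the results to be off by a few counts in the last bits because each shift truncates.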
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Fri Jun 09, 2017 2:32 pm |
|
|
As a further comment to this, you might be interested in the following:
Code: |
#include <math.h> //for PI, fabs() and sin()
//A fast float approximation to sin over the range +/-PI
#define PI2 (PI*PI)
#define INV4 (4/PI)
#define INV4SQ (-4/(PI2))
#define P 0.225
float fast_sin(float x)
{
   float y;
   y = (INV4 * x) + (INV4SQ * x * fabs(x));
   return P * (y * fabs(y) - y) + y;
}
void main(void)
{
   //Test the fast sin algorithm for angles from 0 to PI in steps of PI/128
   float an, res, sres;
   for (an=0.0;an<(PI);an+=(PI/128))
   {
      //res1=fast_sin1(an);
      res=fast_sin(an);
      sres=sin(an);
      printf ("AN=%5.3f sinfast=%5.3f sin=%5.3f\n",an,res,sres);
   }
   while(TRUE)
      delay_cycles(1);
}
|
I have in the past published a fast approximation to atan2 in the code library.
This one is only in error on the third decimal:
Code: |
AN=0.000 sinfast=0.000 sin=0.000
AN=0.024 sinfast=0.024 sin=0.024
AN=0.049 sinfast=0.048 sin=0.049
AN=0.073 sinfast=0.072 sin=0.073
AN=0.098 sinfast=0.097 sin=0.098
AN=0.122 sinfast=0.121 sin=0.122
AN=0.147 sinfast=0.145 sin=0.146
AN=0.171 sinfast=0.169 sin=0.170
AN=0.196 sinfast=0.194 sin=0.195
AN=0.220 sinfast=0.218 sin=0.219
AN=0.245 sinfast=0.241 sin=0.242
AN=0.269 sinfast=0.265 sin=0.266
AN=0.294 sinfast=0.289 sin=0.290
AN=0.319 sinfast=0.312 sin=0.313
AN=0.343 sinfast=0.336 sin=0.336
AN=0.368 sinfast=0.359 sin=0.359
AN=0.392 sinfast=0.382 sin=0.382
AN=0.417 sinfast=0.404 sin=0.405
AN=0.441 sinfast=0.427 sin=0.427
AN=0.466 sinfast=0.449 sin=0.449
AN=0.490 sinfast=0.471 sin=0.471
AN=0.515 sinfast=0.492 sin=0.492
AN=0.539 sinfast=0.514 sin=0.514
AN=0.564 sinfast=0.535 sin=0.534
AN=0.589 sinfast=0.555 sin=0.555
AN=0.613 sinfast=0.576 sin=0.575
AN=0.638 sinfast=0.596 sin=0.595
AN=0.662 sinfast=0.615 sin=0.615
AN=0.687 sinfast=0.634 sin=0.634
AN=0.711 sinfast=0.653 sin=0.653
AN=0.736 sinfast=0.672 sin=0.671
AN=0.760 sinfast=0.690 sin=0.689
AN=0.785 sinfast=0.707 sin=0.707
AN=0.809 sinfast=0.724 sin=0.724
AN=0.834 sinfast=0.741 sin=0.740
AN=0.859 sinfast=0.757 sin=0.757
AN=0.883 sinfast=0.773 sin=0.773
AN=0.908 sinfast=0.789 sin=0.788
AN=0.932 sinfast=0.803 sin=0.803
AN=0.957 sinfast=0.818 sin=0.817
AN=0.981 sinfast=0.832 sin=0.831
AN=1.006 sinfast=0.845 sin=0.844
AN=1.030 sinfast=0.858 sin=0.857
AN=1.055 sinfast=0.870 sin=0.870
AN=1.079 sinfast=0.882 sin=0.881
AN=1.104 sinfast=0.893 sin=0.893
AN=1.129 sinfast=0.904 sin=0.903
AN=1.153 sinfast=0.914 sin=0.914
AN=1.178 sinfast=0.924 sin=0.923
AN=1.202 sinfast=0.933 sin=0.932
AN=1.227 sinfast=0.941 sin=0.941
AN=1.251 sinfast=0.949 sin=0.949
AN=1.276 sinfast=0.957 sin=0.956
AN=1.300 sinfast=0.964 sin=0.963
AN=1.325 sinfast=0.970 sin=0.970
AN=1.349 sinfast=0.975 sin=0.975
AN=1.374 sinfast=0.980 sin=0.980
AN=1.398 sinfast=0.985 sin=0.985
AN=1.423 sinfast=0.989 sin=0.989
AN=1.448 sinfast=0.992 sin=0.992
AN=1.472 sinfast=0.995 sin=0.995
AN=1.497 sinfast=0.997 sin=0.997
AN=1.521 sinfast=0.998 sin=0.998
AN=1.546 sinfast=0.999 sin=0.999
AN=1.570 sinfast=1.000 sin=1.000
AN=1.595 sinfast=0.999 sin=0.999
AN=1.619 sinfast=0.998 sin=0.998
AN=1.644 sinfast=0.997 sin=0.997
AN=1.668 sinfast=0.995 sin=0.995
AN=1.693 sinfast=0.992 sin=0.992
AN=1.718 sinfast=0.989 sin=0.989
AN=1.742 sinfast=0.985 sin=0.985
AN=1.767 sinfast=0.980 sin=0.980
AN=1.791 sinfast=0.975 sin=0.975
AN=1.816 sinfast=0.970 sin=0.970
AN=1.840 sinfast=0.964 sin=0.963
AN=1.865 sinfast=0.957 sin=0.956
AN=1.889 sinfast=0.949 sin=0.949
AN=1.914 sinfast=0.941 sin=0.941
AN=1.938 sinfast=0.933 sin=0.932
AN=1.963 sinfast=0.924 sin=0.923
AN=1.988 sinfast=0.914 sin=0.914
AN=2.012 sinfast=0.904 sin=0.903
AN=2.037 sinfast=0.893 sin=0.893
AN=2.061 sinfast=0.882 sin=0.881
AN=2.086 sinfast=0.870 sin=0.870
AN=2.110 sinfast=0.858 sin=0.857
AN=2.135 sinfast=0.845 sin=0.844
AN=2.159 sinfast=0.832 sin=0.831
AN=2.184 sinfast=0.818 sin=0.817
AN=2.208 sinfast=0.803 sin=0.803
AN=2.233 sinfast=0.789 sin=0.788
AN=2.258 sinfast=0.773 sin=0.773
AN=2.282 sinfast=0.757 sin=0.757
AN=2.307 sinfast=0.741 sin=0.740
AN=2.331 sinfast=0.724 sin=0.724
AN=2.356 sinfast=0.707 sin=0.707
AN=2.380 sinfast=0.690 sin=0.689
AN=2.405 sinfast=0.672 sin=0.671
AN=2.429 sinfast=0.653 sin=0.653
AN=2.454 sinfast=0.634 sin=0.634
AN=2.478 sinfast=0.615 sin=0.615
AN=2.503 sinfast=0.596 sin=0.595
AN=2.527 sinfast=0.576 sin=0.575
AN=2.552 sinfast=0.555 sin=0.555
AN=2.577 sinfast=0.535 sin=0.535
AN=2.601 sinfast=0.514 sin=0.514
AN=2.626 sinfast=0.492 sin=0.492
AN=2.650 sinfast=0.471 sin=0.471
AN=2.675 sinfast=0.449 sin=0.449
AN=2.699 sinfast=0.427 sin=0.427
AN=2.724 sinfast=0.404 sin=0.405
AN=2.748 sinfast=0.382 sin=0.382
AN=2.773 sinfast=0.359 sin=0.359
AN=2.797 sinfast=0.336 sin=0.336
AN=2.822 sinfast=0.312 sin=0.313
AN=2.847 sinfast=0.289 sin=0.290
AN=2.871 sinfast=0.265 sin=0.266
AN=2.896 sinfast=0.241 sin=0.242
AN=2.920 sinfast=0.218 sin=0.219
AN=2.945 sinfast=0.194 sin=0.195
AN=2.969 sinfast=0.169 sin=0.170
AN=2.994 sinfast=0.145 sin=0.146
AN=3.018 sinfast=0.121 sin=0.122
AN=3.043 sinfast=0.097 sin=0.098
AN=3.067 sinfast=0.072 sin=0.073
AN=3.092 sinfast=0.048 sin=0.049
AN=3.117 sinfast=0.024 sin=0.024
AN=3.141 sinfast=0.000 sin=0.000
|
and executes on a PIC24 in 421 instructions, versus 3097 for the standard sin code.
It's based on fitting a second order polynomial to the sin curve, then allowing this to work for both + and - numbers (the fabs handles this), and then adding a third order polynomial correction. Because all the main terms are pre-solved, it is quite efficient. |
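The two stages described above can be separated out and desk-checked on a PC. The version below is a restatement in standard C with no library dependencies. The names `fast_sin_portable`, `FAST_PI`, `FAST_P` and `absf` are hypothetical, and the instruction counts quoted above do not apply to it.

```c
#define FAST_PI 3.14159265f
#define FAST_P  0.225f

/* local absolute value so the sketch needs no math library */
static float absf(float v) { return v < 0.0f ? -v : v; }

/* x in radians, intended for -pi..+pi */
float fast_sin_portable(float x)
{
    /* Stage 1: parabola through (0,0), (pi/2,1) and (pi,0).
     * Using x*|x| instead of x*x mirrors it for negative x. */
    float y = (4.0f / FAST_PI) * x
            - (4.0f / (FAST_PI * FAST_PI)) * x * absf(x);

    /* Stage 2: blend y with y*|y| to pull the parabola towards sin(x) */
    return FAST_P * (y * absf(y) - y) + y;
}
```

A worked value: at x = PI/6 the first stage gives y = 4/6 - 4/36 = 5/9, and the second stage gives 0.225 * (25/81 - 5/9) + 5/9 = 0.5, which matches sin(PI/6). At x = PI/2 the first stage already gives 1, so the correction term is zero.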
|
|