Optimizing For Loop

 
curt2go



Joined: 21 Nov 2003
Posts: 200

Posted: Mon Apr 23, 2018 2:18 pm

I have a loop doing some addition. I am already using pointers to speed things up, but I am wondering if there is a faster way to do this. This is on a PIC24EP256GP206 running at 140 MHz.
Code:

unsigned int16 buffer_x[256];
unsigned int16 buffer_y[256];
int a;

unsigned int16 *bufferPointer_x;
unsigned int16 *bufferPointer_y;

bufferPointer_x = buffer_x;   // an array name already gives the address of element 0
bufferPointer_y = buffer_y;

for(a=0;a<256;a++){
    *bufferPointer_x = *bufferPointer_x + *bufferPointer_y;
    bufferPointer_x++;   // increment the pointer (no dereference needed)
    bufferPointer_y++;
}
bufferPointer_x = buffer_x;   // start them pointing back to the proper place again
bufferPointer_y = buffer_y;
PCM programmer



Joined: 06 Sep 2003
Posts: 21708

Posted: Mon Apr 23, 2018 4:31 pm

You're adding two arrays, element by element.

You can always trade code space for speed by unrolling the loop and
using fixed indexes. This gets rid of the loop overhead and the indirect
indexing and will be a lot faster:
Code:

buffer_x[0] += buffer_y[0];
buffer_x[1] += buffer_y[1];
buffer_x[2] += buffer_y[2];
buffer_x[3] += buffer_y[3];
.
.
.
.
buffer_x[255] += buffer_y[255];
curt2go



Joined: 21 Nov 2003
Posts: 200

Posted: Mon Apr 23, 2018 4:34 pm

Yeah, I have done that before. I have the space, but I was thinking there might be a better way. :)

Edit: I'm not sure I have an option on the indexing. I fill and add to whichever buffer I'm not currently sending out over SPI.

The above example was only an example. I am actually using a pointer because I am switching the location of the buffer I am currently doing the adding into.

But it still might be faster to set a bit to select the buffer and still use the bunch of lines.

Thanks for the input. I will do a whole bunch of copy and pastes.
curt2go



Joined: 21 Nov 2003
Posts: 200

Posted: Mon Apr 23, 2018 5:19 pm

In this case it takes 1% more ROM and is 5 times as fast, judging by the .lst file. :)
soonc



Joined: 03 Dec 2013
Posts: 215

Posted: Mon May 07, 2018 9:57 pm

In your code example, int a; in the for loop will never reach 256, so the loop will run forever... use int16 a;

Try this:


Code:

.................... void test()
.................... {
....................    int16 bx[256];
....................    int16 by[256];
....................    int16 i;
....................    for(i=0;i<256; i++)
*
0072A:  MOVLB  A
0072C:  CLRF   x71
0072E:  CLRF   x70
00730:  MOVF   x71,W
00732:  SUBLW  00
00734:  BNC   07B6
....................    {
....................       bx[i] += by[i];
00736:  BCF    3FD8.0
00738:  RLCF   x70,W
0073A:  MOVWF  02
0073C:  RLCF   x71,W
0073E:  MOVWF  03
00740:  MOVF   02,W
00742:  ADDLW  48
00744:  MOVWF  01
00746:  MOVLW  0A
00748:  ADDWFC 03,F
0074A:  MOVFF  01,A72
0074E:  MOVFF  03,A73
00752:  MOVFFL 03,3FEA
00758:  MOVFFL 01,3FE9
0075E:  MOVFFL 3FEC,A75
00764:  MOVF   3FED,F
00766:  MOVFFL 3FEF,A74
0076C:  BCF    3FD8.0
0076E:  RLCF   x70,W
00770:  MOVWF  02
00772:  RLCF   x71,W
00774:  MOVWF  03
00776:  MOVF   02,W
00778:  ADDLW  5C
0077A:  MOVWF  3FE9
0077C:  MOVLW  0A
0077E:  ADDWFC 03,W
00780:  MOVWF  3FEA
00782:  MOVFFL 3FEC,03
00788:  MOVF   3FED,F
0078A:  MOVF   3FEF,W
0078C:  ADDWF  x74,W
0078E:  MOVWF  01
00790:  MOVF   x75,W
00792:  ADDWFC 03,F
00794:  MOVFFL A73,3FEA
0079A:  MOVFFL A72,3FE9
007A0:  MOVFFL 03,3FEC
007A6:  MOVF   3FED,F
007A8:  MOVFFL 01,3FEF
007AE:  INCF   x70,F
007B0:  BTFSC  3FD8.2
007B2:  INCF   x71,F
007B4:  BRA    0730
....................    }
007B6:  MOVLB  0
007B8:  GOTO   63E6 (RETURN)
....................     
....................     
.................... }
Ttelmah



Joined: 11 Mar 2010
Posts: 19568

Posted: Mon May 07, 2018 11:23 pm

Actually it will, Soonc.

This is a PIC24, which uses the same PCD compiler as the dsPICs. On these the default 'int' is a signed int16, so it will work fine.
This, though, is why you should always use explicit sizes.
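To make the point about explicit sizes concrete, here is a minimal sketch in portable C. It uses the standard `stdint.h` types rather than CCS's `int8`/`int16`, and the function names are hypothetical. It shows that a 16-bit counter reaches 256 and ends the loop, while an 8-bit counter wraps from 255 back to 0, so a test such as `a < 256` could never become false.

```c
#include <stdint.h>

/* Counts loop passes with an explicitly 16-bit counter.
   The counter can hold 256, so the loop ends after 256 passes. */
uint16_t count_with_16bit(void)
{
    uint16_t passes = 0;
    uint16_t a;

    for (a = 0; a < 256; a++) {
        passes++;
    }
    return passes;
}

/* What "a++" does to an 8-bit unsigned counter: 255 + 1 wraps to 0,
   so an 8-bit 'a' would stay below 256 forever. */
uint8_t next_8bit(uint8_t a)
{
    return (uint8_t)(a + 1u);
}
```

Writing the width into the declaration means the loop behaves the same whichever compiler default for `int` is in force.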
pmuldoon



Joined: 26 Sep 2003
Posts: 218
Location: Northern Indiana

Posted: Tue May 08, 2018 5:26 am

What about a compromise?
Unroll the loop to do 8 updates per iteration and increment by 8, then write it as two loops with absolute addressing and decide which loop to run when the function is called.

That should be faster, and the code would still be easy to read and follow.

Just thinking...
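A rough sketch of that compromise, in portable C with `stdint.h` types. The names `buffer_a`, `buffer_b`, `ADD8` and `add_into_selected` are hypothetical, standing in for the two destination buffers that take turns and for the flag that picks one. Each loop names its array directly, so the compiler sees fixed base addresses, and eight additions share one loop test and branch.

```c
#include <stdint.h>

#define BUF_LEN 256

/* Hypothetical buffers: two destinations that alternate, one source. */
static uint16_t buffer_a[BUF_LEN];
static uint16_t buffer_b[BUF_LEN];
static uint16_t buffer_y[BUF_LEN];

/* Eight additions into a named array, starting at index i. */
#define ADD8(dst, i)                        \
    do {                                    \
        dst[(i) + 0] += buffer_y[(i) + 0];  \
        dst[(i) + 1] += buffer_y[(i) + 1];  \
        dst[(i) + 2] += buffer_y[(i) + 2];  \
        dst[(i) + 3] += buffer_y[(i) + 3];  \
        dst[(i) + 4] += buffer_y[(i) + 4];  \
        dst[(i) + 5] += buffer_y[(i) + 5];  \
        dst[(i) + 6] += buffer_y[(i) + 6];  \
        dst[(i) + 7] += buffer_y[(i) + 7];  \
    } while (0)

/* Chooses once, on entry, which fixed-address loop to run. */
void add_into_selected(uint8_t use_b)
{
    uint16_t i;

    if (use_b) {
        for (i = 0; i < BUF_LEN; i += 8) {
            ADD8(buffer_b, i);
        }
    } else {
        for (i = 0; i < BUF_LEN; i += 8) {
            ADD8(buffer_a, i);
        }
    }
}
```

Whether this beats the pointer version on a given part is something to confirm in the .lst file, since the index `i` is still a variable inside each loop.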
pmuldoon



Joined: 26 Sep 2003
Posts: 218
Location: Northern Indiana

Posted: Tue May 08, 2018 5:43 am

Couldn't you take advantage of the #INLINE directive and write a function called in a loop and let the compiler do the tedious work of unwrapping it?

And here's a tricky one. Is there a way to tell the compiler that the address you're referencing is really fixed (constant), even though you're incrementing to derive it in the pre-compile stage?

I've never taken advantage of many of the things the compiler can do for me, but this problem has gotten me thinking...
temtronic



Joined: 01 Jul 2010
Posts: 9255
Location: Greensville,Ontario

Posted: Tue May 08, 2018 6:07 am

Faster? Yeesh, poor little PICs running at 140 meg! I remember Z80s doing TWO meg and thought wow... sigh, guess I'm old...
It would be interesting to hear how long (the actual time) this loop takes, though.
Silly Q: can you overclock the PIC? It's a 'cheat' but might work.
Jay
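On the question of actual time, a back-of-envelope estimate is iterations × cycles per iteration ÷ instruction clock. The helper below is a hypothetical sketch of that arithmetic only. It assumes the quoted 140 MHz is the oscillator frequency, which on a PIC24EP gives an instruction clock of Fosc/2 = 70 MHz. The cycles-per-iteration figure is a placeholder that has to be counted from the .lst file; it is not a measurement from this thread.

```c
#include <stdint.h>

/* Estimated run time in microseconds for a loop of 'iterations' passes,
   each costing 'cycles_per_iter' instruction cycles, at an instruction
   clock of 'fcy_hz' (assumed here to be Fosc / 2). */
double loop_time_us(uint32_t iterations, uint32_t cycles_per_iter, double fcy_hz)
{
    return ((double)iterations * (double)cycles_per_iter) / fcy_hz * 1e6;
}
```

For example, `loop_time_us(256, 10, 70e6)` gives roughly 36.6 microseconds, where the 10 cycles per pass is purely illustrative.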
Ttelmah



Joined: 11 Mar 2010
Posts: 19568

Posted: Tue May 08, 2018 8:40 am

The problem is that if a variable is involved, the compiler has to assume it is 'variable'; it can only assume constants when everything is constant.
Probably the easiest way to code it is to take advantage of macros:
Code:

#define BUFF_ADD(n) bufferPointer_x[n]+=bufferPointer_y[n]

    BUFF_ADD(0);
    BUFF_ADD(1);
    BUFF_ADD(2);
    BUFF_ADD(3);.....
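
If typing out 256 BUFF_ADD lines is unappealing, the preprocessor can do the repetition by nesting macros. This is a sketch in portable C with `stdint.h` types. The BUFF_ADD4/16/64/256 helpers are hypothetical additions, not part of the macro above, and it is an assumption that the CCS preprocessor expands nested macros of this depth the same way a standard one does.

```c
#include <stdint.h>

#define BUF_LEN 256

/* Stand-ins for the thread's buffers and pointers. */
static uint16_t buffer_x[BUF_LEN];
static uint16_t buffer_y[BUF_LEN];
static uint16_t *bufferPointer_x = buffer_x;
static uint16_t *bufferPointer_y = buffer_y;

/* One addition at a constant offset from the current pointers. */
#define BUFF_ADD(n)    bufferPointer_x[(n)] += bufferPointer_y[(n)]

/* Each level expands to four copies of the level below it. */
#define BUFF_ADD4(n)   BUFF_ADD(n);   BUFF_ADD((n) + 1);    BUFF_ADD((n) + 2);    BUFF_ADD((n) + 3)
#define BUFF_ADD16(n)  BUFF_ADD4(n);  BUFF_ADD4((n) + 4);   BUFF_ADD4((n) + 8);   BUFF_ADD4((n) + 12)
#define BUFF_ADD64(n)  BUFF_ADD16(n); BUFF_ADD16((n) + 16); BUFF_ADD16((n) + 32); BUFF_ADD16((n) + 48)
#define BUFF_ADD256()  BUFF_ADD64(0); BUFF_ADD64(64);       BUFF_ADD64(128);      BUFF_ADD64(192)

void add_buffers_unrolled(void)
{
    BUFF_ADD256();   /* expands to 256 straight-line additions */
}
```

Because the macros expand to several statements, they should only be used as a statement on their own line, as here, not as the unbraced body of an `if` or `for`.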


I was actually trying to work out if this could be done using DMA with the CLC block. However, though you could do things like AND or OR with this, I don't think you could perform addition....

However in this case the really efficient way is to use assembler:
Code:

   int16 ctr;

   ctr=256;
#ASM
   MOV buffer_x, W0
   MOV buffer_y, W2
//   REPEAT 256 *
loop:
   MOV [W0], W1
   ADD W1, [W2++], [W0++]
   DEC ctr
   BRA NZ, loop
#ENDASM


Will 'run rings' round any other solution I think.
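For readers who don't know the PIC24 instruction set, the intent of the assembler loop can be written as a plain C reference. This is a hedged reading of the listing, not a replacement for it: `add_buffers_reference` is a hypothetical name, `stdint.h` types stand in for CCS's `int16`, and the comments map each line to the instruction it appears to mirror.

```c
#include <stdint.h>

/* Adds 'count' 16-bit words from src into dst, advancing both pointers. */
void add_buffers_reference(uint16_t *dst, const uint16_t *src, uint16_t count)
{
    /* The assembler version tests the counter only at the bottom of the
       loop; this sketch tests first so that count == 0 does nothing. */
    while (count != 0) {
        uint16_t tmp = *dst;                 /* MOV [W0], W1           */
        *dst++ = (uint16_t)(tmp + *src++);   /* ADD W1, [W2++], [W0++] */
        count--;                             /* DEC ctr                */
    }                                        /* BRA NZ, loop           */
}
```

A reference like this is handy for checking the assembler's results on a PC or in the simulator before trusting it on the target.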
All times are GMT - 6 Hours