Remco
Joined: 12 May 2015 Posts: 14
|
Strange misbehaving array |
Posted: Fri Jul 17, 2015 5:24 am |
|
|
Good day chaps,
I recently came across a strange problem when testing my software. I develop the firmware with CCS C compiler version 5.042. The firmware will run on a dsPIC30F6014A. The problem is as follows:
I declared a few one-dimensional arrays to store values for several calculations. At some point an element of an array changes even though no code writes to it; in other words, a number in the array changes seemingly at random. As an example I use TEMPERATURE_SETPOINT and test1. The declarations are as follows:
Code: | #include<30F6014A.h>
#device ADC=12,PASS_STRINGS=IN_RAM,NESTED_INTERRUPTS=TRUE,CONST=READ_ONLY
#priority INT_RDA2,INT_TIMER1
#fuses HS,NOMCLR,PUT64,NOWDT
#use delay(clock=20M,CRYSTAL)
#use rs232(UART2,BAUD=9600,ERRORS,STREAM=UART_MASTER,ENABLE=PIN_B2,PARITY=N,BITS=8,STOP=1,TIMEOUT=3)//UART FOR the PC communication
#use rs232(UART1,BAUD=9600,ERRORS,STREAM=UART_SLAVE,PARITY=N,BITS=8,STOP=1,DISABLE_INTS,TIMEOUT=3)//UART FOR the SUBnet communication
#use rs232(STREAM=UART_DISPLAY,baud=19200,xmit=PIN_D12,rcv=PIN_D13,FORCE_SW,parity=N,bits=8,STOP=1,TIMEOUT=3,ERRORS,DISABLE_INTS)//UART FOR the Display communication
#use rs232(STREAM=UART_MF,baud=38400,xmit=PIN_A12,rcv=PIN_A13,FORCE_SW,PARITY=N,BITS=8,STOP=1,TIMEOUT=3,ERRORS,DISABLE_INTS)//UART FOR the Mass flow controller communication
//////////////////Global variables used FOR temperature/////////////////////
#define NUMBERTEMPERATURES 2
//0 = Temperature 1
//1 = Temperature 2
STATIC float32 TEMPERATURE_INTERNAL = 0.0;
STATIC float32 TEMPERATURE_DEFAULT_CORRETION[NUMBERTEMPERATURES] = {0.5873, 0.5873}; //is in V, generated through software
STATIC float32 TEMPERATURE_SLOPE[NUMBERTEMPERATURES] = {0.0806, 0.0806}; //generated through software
STATIC float32 TEMPERATURE_VALUE[NUMBERTEMPERATURES] = {0.0, 0.0}; //Measured temperature
STATIC signed int16 TEMPERATURE_SETPOINT[NUMBERTEMPERATURES] = {330, 330}; //33.0
static signed int16 test1 = 330;
STATIC signed int16 TEMPERATURE_MAX[NUMBERTEMPERATURES] = {500, 500}; //Max temperature 50.0
STATIC signed int16 TEMPERATURE_MIN[NUMBERTEMPERATURES] = {0, 0}; //Min temperature 0.0
STATIC signed int16 TEMPERATURE_USER_CORRECTION[NUMBERTEMPERATURES] = {0, 0}; //is in C, set by user if necessary
STATIC signed int16 TEMPERATURE_INTERNAL_SETPOINT = 330; //33.0C
STATIC UNSIGNED int8 TEMPERATURE_AN_INPUT[NUMBERTEMPERATURES] = {10, 11}; //Analog input for the two K-couples
STATIC UNSIGNED int8 TEMPERATURE_ON_OFF[NUMBERTEMPERATURES] = {0, 0}; //Select if temperature controller must be on (1) or off (0) |
The place where I read out the array is as follows.
Code: | Place_floatvalue(115,119,TEMPERATURE_SETPOINT[0]/10);
Place_floatvalue(370,119,test1); |
Nowhere in the code are these values changed, so they should not be affected. But when I return from one while loop to main and then enter a different while loop, the values change. The values are shown on a TFT touchscreen display. While the firmware is running, TEMPERATURE_SETPOINT changes but test1 does not. The only difference between TEMPERATURE_SETPOINT and test1 is the declaration (array versus plain signed int16).
I also looked at the symbols to check whether the addresses of the variables are reused, but I couldn't find any reused variables so far. I have this problem with a few other arrays too, and I can't find any way to make an array behave.
I hope you guys can provide me with a solution or a way to check what's causing this problem.
In addition: the firmware is over 10000 lines long, so posting it is not really an option. I also guess a simple test program wouldn't show it, because I tried one and it worked. That's why it is so strange.
Kind regards,
Remco |
|
|
asmboy
Joined: 20 Nov 2007 Posts: 2128 Location: albany ny
|
|
Posted: Fri Jul 17, 2015 6:06 am |
|
|
I would bet that what is going on is that the previous variable in your program, TEMPERATURE_VALUE[], is accidentally being written to with an index of 2, and the write is flowing into the variable you mention being altered.
Accidental mis-indexing is the common cause of what you see happening. |
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
|
Posted: Fri Jul 17, 2015 6:40 am |
|
|
Quote: | accidental mis-indexing is the common cause of what you see
happening. |
Would this be a mismatch of the compiler/uc or the code itself that is written by me? |
|
|
asmboy
Joined: 20 Nov 2007 Posts: 2128 Location: albany ny
|
|
Posted: Fri Jul 17, 2015 7:01 am |
|
|
Quote: | code itself that is written |
Often it is the result of a loop-incremented index variable that goes out of range and allows a write to the wrong part of static memory.
Most of us have done it at one time or another.
You need to examine every instance where you update any value in TEMPERATURE_VALUE[] and be dead certain you can't index anything other than 0 or 1, as any greater value writes into the memory space you allocated for TEMPERATURE_SETPOINT[].
ONE bad float value written to index [2] of TEMPERATURE_VALUE[] will trash both elements of TEMPERATURE_SETPOINT[], because a 4-byte float32 covers both 2-byte int16 entries.
I'll bet there is some accidental write of that sort. |
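A minimal sketch of the overflow described here, in standard C rather than CCS C. The struct `temperature_block` and both helper functions are hypothetical; the struct only mimics two 4-byte floats followed directly by two 16-bit setpoints, which is an assumption about how the linker placed the real arrays. A genuine out-of-range array write is undefined behaviour, so the sketch uses `memcpy` into the enclosing struct to show where the bytes would land.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: value[] is followed directly by setpoint[]. */
struct temperature_block {
    float   value[2];
    int16_t setpoint[2];
};

/* Simulate "value[index] = v" by copying the bytes into the struct. */
static void store_value(struct temperature_block *b, size_t index, float v)
{
    size_t offset = offsetof(struct temperature_block, value)
                  + index * sizeof(float);
    assert(offset + sizeof(float) <= sizeof *b); /* stay inside the struct */
    memcpy((unsigned char *)b + offset, &v, sizeof v);
}

/* Returns 1 if writing value[index] altered either setpoint, else 0. */
static int write_corrupts_setpoints(size_t index)
{
    struct temperature_block b = { {0.0f, 0.0f}, {330, 330} };
    /* Assumption: no padding between the two arrays. */
    assert(offsetof(struct temperature_block, setpoint) == 2 * sizeof(float));
    store_value(&b, index, 25.0f);
    return b.setpoint[0] != 330 || b.setpoint[1] != 330;
}
```

With this layout, index 1 leaves the setpoints alone, while index 2 overwrites both of them with the bytes of a single float.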
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
|
Posted: Fri Jul 17, 2015 7:17 am |
|
|
In every routine where I access the variables I check the index; if it is out of range, the routine returns. After a test, the problem still occurs. I also printed the index value, and every time it is either 0 or 1. So it is a strange problem.
The first time I came across this problem I did exactly what you suggest. It's the first thing I check, because I know it would explain it. Another check just now gave the same result. |
|
|
alan
Joined: 12 Nov 2012 Posts: 357 Location: South Africa
|
|
Posted: Fri Jul 17, 2015 7:45 am |
|
|
To test what asmboy is hinting at, put the declaration of test1 above TEMPERATURE_SETPOINT. If it is an out-of-bounds array write, then test1 will also be corrupted, or, if the corruption was in the 1st element of the TEMPERATURE_SETPOINT array, it will move to the 2nd.
Regards |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19548
|
|
Posted: Fri Jul 17, 2015 7:55 am |
|
|
1) Global variables will never be re-used.
2) Though it doesn't matter, the static declarations do nothing. The only difference between a global static and a global is that the global static is 'warranted' to be initialised to zero if it is not explicitly initialised.
3) Beware of things like pointer arithmetic. Though the most 'common' cause is something directly beneath or above the variable in memory 'walking' outside its allowed indexes, the second most common reason is something like pointer arithmetic a long way away....
4) Are any of the variables used/updated in interrupts? Beware that variables larger than a byte (for PIC16/18), or a word (for PIC24 & up), do not guarantee that writes will be 'uninterrupted'. You need to handle this yourself.
5) I've never found #priority to work properly on the PIC24/30. Instead explicitly allocate the individual interrupts to the interrupt level you want.
6) Personally I'd have a 'temperature' structure, containing the settings for a single channel, and then declare an array of these. Generally avoid using ALL CAPITALS. Normally this is used as an 'indicator' in C, that the value is a #define. Using it for variables removes this (useful) guidance.
So:
Code: |
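// All settings for one temperature channel
struct temp_struct
{
   float32 Internal;
   float32 Default_Correction;   // in V
   float32 Slope;
   float32 Value;                // measured temperature
   signed int16 Setpoint;        // in tenths of a degree, 330 = 33.0
   signed int16 Max;
   signed int16 Min;
   signed int16 User_Correction;
   unsigned int8 An_input;       // analog input channel
   unsigned int8 On_Off:1;       // controller on (1) or off (0)
};
// One entry per channel, replacing the separate arrays
struct temp_struct Temperature[NUMBERTEMPERATURES] = {
   {0.0, 0.5873, 0.0806, 0.0, 330, 500, 0, 0, 10, FALSE },
   {0.0, 0.5873, 0.0806, 0.0, 330, 500, 0, 0, 11, FALSE }
};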
|
Temperature[0].Slope is then 0.0806, etc.
I'd start by looking very carefully at anything like buffer handling code, and any code with pointers. Beware if (for instance) something uses a 'long' pointer that is also accessed in an interrupt.
Is 'setpoint' _meant_ to not be an array?
Updating this:
As a 'comment', the first thing to do is rule things out. Will the rest of your code run _without_ this call to 'Place_floatvalue'? If so, try that. I'd suspect you are going to find the damage still occurs. It then becomes a matter of 'narrowing'. Any other code you can temporarily do without can be removed piece by piece. _Or_, add your own 'debug_test(n)' function which, when called, compares TEMPERATURE_SETPOINT[0] with the value it had when the function was last called. If it's different, it does something (triggers a pin, or records the value of 'n' somewhere). Then you can see which call saw the change, and look at what code was called between this and the last call. |
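The 'debug_test(n)' idea could look roughly like the following. This is a sketch in standard C, not the poster's code: `watched_setpoint` is a hypothetical stand-in for TEMPERATURE_SETPOINT, and recording the checkpoint number in `first_bad_checkpoint` is just one of the two reactions suggested above (the other being to toggle a pin).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the array that is being corrupted. */
static int16_t watched_setpoint[2] = {330, 330};

static int16_t last_seen = 330;            /* value at the previous call   */
static int     first_bad_checkpoint = -1;  /* -1 means no change seen yet  */

/* Call with a unique, non-negative n at many places in the program.
   Remembers the first checkpoint at which the watched value differs
   from what it was at the previous checkpoint. */
static void debug_test(int n)
{
    assert(n >= 0);  /* negative numbers are reserved for "not seen" */
    if (watched_setpoint[0] != last_seen && first_bad_checkpoint < 0) {
        first_bad_checkpoint = n;
    }
    last_seen = watched_setpoint[0];
}
```

If `first_bad_checkpoint` ends up as, say, 7, the damage happened in whatever ran between the previous call and the call numbered 7. Code that changes the setpoint on purpose would also trip the check, so such places need a checkpoint of their own directly afterwards.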
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
|
Posted: Mon Jul 20, 2015 1:30 am |
|
|
Thank you for your reaction. Following your statements:
1) I know that's the case, but I read it can happen. Most likely not but it can.
2) Always good to know that a variable is definitely initialized to a known value (to be sure).
3) I do not use any pointer arithmetic, or pointers in general, in this case. It would make sense if that was the problem, but if there is none it can't be the problem.
4) I do not change any of the problem variables in my interrupt routine. I only set a 'flag' to indicate that some code must be executed in the main program. So no variables are changed in the interrupt routine.
5) Most likely, but it is more of a trial to see if something happens (not seen yet).
6) I tried the structure but it gave the same problems. And an array is more useful for me in this case.
I had already ploughed through the entire code before I made this post, because it's a real problem, and to make sure it isn't a silly programming fault of mine.
The only thing I'm sure of is that it happens at a specific memory location. I searched through the ASM code to see whether that memory location is referenced anywhere without using the variable, but there is no such reference. So most likely it must be the pointer problem, but I can't find it because I specifically don't use any pointer code. That is why I find it so strange. And every index is checked to make sure only a defined array element is accessed. So it must be something very silly. |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19548
|
|
Posted: Mon Jul 20, 2015 2:38 am |
|
|
You are using pointers. In C, arrays and pointers are basically the same thing: indexing a[i] is just *(a + i), with no bounds check.
A classic 'hidden error' would be something in another array that uses a variable for its index, which accidentally gets taken 'below zero'. It becomes -1 (even on an unsigned), so for an int8, 0xFF. This then results in effectively talking to entry 255, which, times the element size of the array, could be almost anywhere in memory. Ugh....
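A small sketch of that wrap-around, in standard C. The function names are hypothetical, and the code only computes the index and byte offset; it never performs the bad access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step an 8-bit unsigned index back by one, as a count-down loop might. */
static uint8_t step_back(uint8_t index)
{
    return (uint8_t)(index - 1);   /* 0 wraps around to 255 */
}

/* Byte offset from the start of an array that this index would touch. */
static size_t byte_offset(uint8_t index, size_t element_size)
{
    assert(element_size > 0);
    return (size_t)index * element_size;
}

/* The guard that has to run before every indexed access. */
static int index_in_range(uint8_t index, size_t count)
{
    return index < count;
}
```

Stepping back from 0 gives 255, so for an array of 4-byte floats the access would land 1020 bytes past the start of the array. A range check such as `index_in_range(i, 2)` rejects it, but only if it is applied to the index after the decrement.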
The structure is an array. hence the [0].
Instead of ten? separate arrays, it is a single array containing all the elements together as one entity. |
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
[SOLVED] Strange misbehaving array |
Posted: Mon Jul 20, 2015 2:55 am |
|
|
I have probably found the problem. I was also using sprintf. Apparently the array in which the string is stored was too small, and, as you said, sprintf was writing into the memory after its declaration. At some point the memory of the affected variables was overwritten. Increasing the array size solved it. It was a static declaration, so I would have expected at least a warning. But apparently sprintf doesn't do any checking, according to the manual. Bummer.
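For comparison, a sketch of how a length-limited formatter avoids this kind of overrun. It uses standard C99 `snprintf`, which later posts in this thread say the CCS library did not offer at the time, so it is an illustration rather than something to paste into the firmware. The function `format_setpoint` and the "33.0 C" output format are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a non-negative setpoint held in tenths of a degree
   (330 becomes "33.0 C"). Never writes more than dest_size bytes.
   Returns 1 if the whole string fitted, 0 if it was truncated. */
static int format_setpoint(char *dest, size_t dest_size, int tenths)
{
    assert(dest != NULL && dest_size > 0 && tenths >= 0);
    /* snprintf returns the length the full string needs, excluding
       the terminator, even when the output had to be cut short. */
    int needed = snprintf(dest, dest_size, "%d.%d C",
                          tenths / 10, tenths % 10);
    return needed >= 0 && (size_t)needed < dest_size;
}
```

With a 4-byte buffer the call reports truncation and leaves the neighbouring memory untouched; plain `sprintf` given the same buffer would have written 7 bytes.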
But it's true that C is full of pointers, and that you are using them even when you don't expect it. What I meant was that I wasn't explicitly using pointers. Pointers written by the programmer are most likely the fault you meant. Pointers in code provided by the compiler or the standard libraries are proven most of the time, so I had ruled them out.
Thank you for your help.
-On to solving the next problems |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19548
|
|
Posted: Mon Jul 20, 2015 3:59 am |
|
|
At least you have found it...
It is key to understand that this is 'inherent' in C.
In C, an 'array' is just a pointer with a compile-time initialisation, and does not contain a 'size' as an element. In many other languages that have an 'array' type, it does. So things accessing a C array/pointer inherently have no knowledge of how much space they actually point 'to', and can access beyond the end... :(
Now, in chips with hardware memory management, there is still no direct check, but if you access 'outside' the page allocated in the memory management, then a hardware warning can be generated. The PIC has no such system.
You could declare your own 'bounded' type, which you initialise with a pointer to an area of memory, and the bounds in this to be used, then overload all the string functions, with versions that test the address they are being called with, against the bounds. Problem is of course, a lot of code, and a lot of actual work.... |
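A very small sketch of such a 'bounded' type, in standard C. The names `bounded_buf` and `bounded_copy` are hypothetical, and only one string function is covered; a real version would need a checked variant of every function that writes through the pointer, which is the "lot of code" mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded buffer: a pointer plus the space behind it. */
struct bounded_buf {
    char  *data;
    size_t capacity;   /* total bytes available, terminator included */
};

/* Bounds-checked stand-in for strcpy. Copies src only if it fits
   completely; otherwise leaves an empty string in the buffer.
   Returns 1 on success, 0 if src was too long. */
static int bounded_copy(struct bounded_buf *dest, const char *src)
{
    assert(dest != NULL && dest->data != NULL && src != NULL);
    assert(dest->capacity > 0);

    size_t len = 0;
    while (src[len] != '\0') {
        len++;
    }
    if (len + 1 > dest->capacity) {
        dest->data[0] = '\0';
        return 0;
    }
    for (size_t i = 0; i <= len; i++) {   /* includes the terminator */
        dest->data[i] = src[i];
    }
    return 1;
}
```

The buffer is set up once, for example `char storage[4]; struct bounded_buf b = { storage, sizeof storage };`, after which every write goes through the checked functions instead of touching `storage` directly.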
|
|
jeremiah
Joined: 20 Jul 2010 Posts: 1357
|
|
Posted: Mon Jul 20, 2015 7:11 am |
|
|
This is one of the reasons I wish CCS had an snprintf() alternative. I made a quick and dirty one that I posted in the code library (for version 5 compilers), but at some point I need to go back and make myself a more efficient and proper one.
An snprintf() wouldn't completely prevent problems like this (you can put in whatever max size you want technically), but at least for me it would reduce the chances of this happening as I am pretty OCD about specifying lengths whenever I pass a pointer intended to be an array. |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19548
|
|
Posted: Mon Jul 20, 2015 7:23 am |
|
|
Agreed wholeheartedly.
Have you actually asked CCS? Point out its advantages, and that it would not involve that much extra coding on top of their existing sprintf library. |
|
|
jeremiah
Joined: 20 Jul 2010 Posts: 1357
|
|
Posted: Mon Jul 20, 2015 7:42 pm |
|
|
I have not. I guess I haven't gotten into the mindset that I can also send feature requests to the support email. I tend to only do bugs out of habit I guess. I'll have to toss them an email on this.
EDIT: sent one this morning. I also mentioned that there may be other functions like strcpy (strncpy) that could benefit from this, but that sprintf/snprintf would probably be the more heavily used. We'll see what the response is. |
|
|
Remco
Joined: 12 May 2015 Posts: 14
|
|
Posted: Thu Jul 23, 2015 4:05 am |
|
|
We will see. It would make it an even better compiler than it already is. |
|
|
|
|