ADDRESS ERROR TRAP INTERRUPT PIC24H

benoitstjean · Posted: Fri Dec 05, 2014 3:47 pm

Actually, in case you haven't noticed, the address error trap code I posted earlier retrieves the value 0x00200000 (five zeros), not 0x00020000 (four zeros) as you stated... Not sure if it changes anything.

Also, according to the Microchip documentation, page 25 (http://ww1.microchip.com/downloads/en/DeviceDoc/70175H.pdf), right-hand column, that address (0x00200000) falls within the User Program Flash Memory. The first address of this block is 0x00000200 and the last is 0x007FFFFE, so 0x00200000 is right in there...

Just wanted to clarify the values here in case it changes anything... So should I still go with the 38 instead of 36?

Thanks again,

Benoit

Ttelmah · Joined: 11 Mar 2010 Posts: 19540

Yes.

You _must_ go to 38.

You need to understand what happens.

When an interrupt occurs, the PIC automatically saves the address where the code currently 'is', and the status register onto the stack.
Then the interrupt handler saves what registers it needs.
This leaves the stack at this point with loads of values stored.
We need to retrieve the address stored on the stack.
Where this is (relative to the current stack pointer), depends on how many registers the interrupt handler stores.

CCS changed how much data they save between when the code was posted, and the current compilers. I looked at the assembler generated by the current compilers, and 'counted back' to where the address was stored, and the result for the current compiler, is 38, not 36.
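To make the 'counting back' concrete, here is a minimal host-side sketch in plain C (not CCS target code). It models a stack of 16-bit words that grows upward, with the trap pushing the low word of the return address followed by a word whose low 7 bits hold the upper address bits. The names `model_stack_t`, `trap_entry`, `handler_prologue` and `read_return_address` are made up for illustration, and the count of 17 saved words is an assumption chosen only so that the arithmetic comes out at 38 bytes; the real count has to be read from the assembler listing of your compiler version.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the stack: 16-bit words, growing upward.
   sp_words is the index of the next free word. */
typedef struct {
    uint16_t words[64];
    size_t   sp_words;
} model_stack_t;

static void push_word(model_stack_t *s, uint16_t w)
{
    s->words[s->sp_words++] = w;
}

/* Models the hardware on trap entry: push the low 16 bits of the return
   address, then a word with status bits in the upper byte and the upper
   address bits in the low 7 bits. */
static void trap_entry(model_stack_t *s, uint32_t pc, uint8_t sr_low)
{
    push_word(s, (uint16_t)(pc & 0xFFFFu));
    push_word(s, (uint16_t)(((uint16_t)sr_low << 8) | ((pc >> 16) & 0x7Fu)));
}

/* Models the handler prologue saving working registers.
   saved_words is an ASSUMPTION; it depends on the compiler version. */
static void handler_prologue(model_stack_t *s, size_t saved_words)
{
    for (size_t i = 0; i < saved_words; i++)
        push_word(s, 0xAAAAu);   /* dummy register contents */
}

/* Byte offset from the stack pointer back to the saved return address:
   two trap words plus the saved registers, two bytes each. */
static size_t offset_for(size_t saved_words)
{
    return 2u * (saved_words + 2u);
}

/* Steps back offset_bytes from the stack pointer and rebuilds the address. */
static uint32_t read_return_address(const model_stack_t *s, size_t offset_bytes)
{
    size_t   idx  = s->sp_words - offset_bytes / 2u;
    uint32_t low  = s->words[idx];
    uint32_t high = s->words[idx + 1u] & 0x7Fu;
    return (high << 16) | low;
}
```

In this model, an offset that is one word too small (36 instead of 38) lands on the status word and a saved register, so the 'address' that comes back is a mix of unrelated data. That is the kind of meaningless value a wrong offset would be expected to produce.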

I've since tested this by forcing an error interrupt, by just loading a pointer to a 16bit variable, incrementing it by one, and then retrieving a 16bit variable from this. This forces a 16bit access to an 'odd' address, which will give an address error interrupt.
With 38, it merrily works, and retrieves the correct address where the error happens.
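The test described here relies on one rule: a 16-bit access must use an even address. A small host-side sketch of that rule in plain C follows; the helper names are hypothetical, and the actual misaligned read is deliberately left out because on a PC it would be undefined behaviour rather than a trap.

```c
#include <stdint.h>

/* Returns 1 when a 16-bit access at this address would be misaligned,
   i.e. when the address is odd. */
static int is_misaligned_word_access(uintptr_t address)
{
    return (int)(address & 1u);
}

/* Builds the deliberately bad address used by the test: one byte past
   an aligned 16-bit variable. */
static uintptr_t odd_address_after(const uint16_t *aligned_var)
{
    return (uintptr_t)aligned_var + 1u;
}
```

On the target, reading a 16-bit value through a pointer holding the address returned by `odd_address_after` is what would raise the address error trap.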

The 0x20000 value basically means nothing as far as your problem goes.

benoitstjean · Posted: Tue Dec 16, 2014 12:50 pm

All right, so it **just** occurred again (first time in two weeks). I had changed the value from 36 to 38 (see earlier posts), and the #ADD_INT error gave me address 0x07C62.

Here's the C listing followed by that same function with its assembly breakdown:

Ttelmah · Joined: 11 Mar 2010 Posts: 19540

It's the move in front of this. The return address is the instruction _after_ the problem.
One or the other of the addresses is causing a problem.
I'd suspect packetByteIndex is getting set somewhere it shouldn't. Add a test line in front of the function, to verify what this is set to.
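A test line of this kind could look like the sketch below. It is only an illustration: `PACKET_SIZE` is assumed from the 127-byte DMA packet mentioned elsewhere in the thread, and `packet_index_ok`, `bad_index_seen` and `bad_index_count` are hypothetical names, not part of the original code.

```c
#include <stdint.h>

#define PACKET_SIZE 127u   /* ASSUMPTION: size of the DMA packet buffer */

/* Hypothetical diagnostics, written instead of printing so that the
   timing of the data path is barely disturbed. */
static volatile uint16_t bad_index_seen;
static volatile uint16_t bad_index_count;

/* Returns 1 if copying bytes_to_copy bytes starting at packetByteIndex
   stays inside the buffer, otherwise records the bad index and returns 0. */
static int packet_index_ok(uint16_t packetByteIndex, uint16_t bytes_to_copy)
{
    if (packetByteIndex >= PACKET_SIZE ||
        bytes_to_copy > PACKET_SIZE - packetByteIndex) {
        bad_index_seen = packetByteIndex;
        bad_index_count++;
        return 0;
    }
    return 1;
}
```

Calling this just before the loop and skipping the copy when it returns 0 would show whether the index is ever out of range, and what value it had.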

benoitstjean · Posted: Tue Dec 16, 2014 8:20 pm

Hi Ttelmah,

Thanks for your reply.

So when you say "One or the other of the addresses is causing a problem", which addresses are you referring to? The actual addresses 07C62 and 07C60?

I don't know assembly and I don't have the slightest idea what the code is doing (hence this post).

I'm not sure how I'll address this problem, because it'll be _very_ difficult to print anything for troubleshooting: the DMA packet is filled 63 times per second with 127 bytes of data and it just goes non-stop. This problem can occur twice in two minutes, just like it can happen twice 20 minutes apart. Just this past weekend, I had this code running for 3 days straight. Today, it started acting up. I haven't touched this part of the code, so I'm thinking that it could be some race or timing condition that occurs very rarely... hard to tell...

I'm at home now and don't have my code in front of me. But if I remember correctly, I believe that just a few lines prior to this loop, the <i> variable is set to 0 and the <packetByteIndex> variable is set to 0 then increased based on other values.

Let me investigate this further tomorrow morning. In the meantime, if you think of anything else or have other suggestions, you are more than welcome.

Thanks again,

Benoit

Ttelmah · Joined: 11 Mar 2010 Posts: 19540

Neither.

The instruction causing the problem, is the byte move:

Remco · Joined: 12 May 2015 Posts: 14

Hi, it's me again.

I'm facing a problem again. This time it's an address error when I'm reading a software UART while a hardware UART is running on interrupt. From time to time the uC resets itself. I know which function causes the reset, but I want to know exactly where it resets, because something elsewhere may still be causing it.
I know it's an address error, so I want to use the code provided in this thread. But when the error occurs and I send the value out over a UART to a PC (that part works correctly), I get a strange number, one that can never be found in the list file because it's too big, like 0x2AC600. Is it possible that CCS has again changed how many registers it saves on entering the interrupt handler, like Ttelmah posted in this thread on Fri Dec 05, 2014?
I'm using version 5.042 of the compiler, which came out recently.

Thanks in advance,

Remco

Ttelmah · Joined: 11 Mar 2010 Posts: 19540

OK. Done a quick test with 5.048.
The code works with 0x38 offset.

You need:

Remco · Joined: 12 May 2015 Posts: 14

Yup, you are right. I did everything as written and get exactly the address at which it must go bad. It was also exactly the same as I had before I made the post. So thank you for your quick test; now I must find out the meaning of the value I get back every time.

I gave it a new try and it gave me back 0x440AB9. It differs from time to time. According to the data sheet, this is in a reserved block of the program space memory of the dsPIC30F6014A. And if I remember correctly, all the other values would be in the same block as well. Also, the number can't be found in the list file.

Ttelmah · Joined: 11 Mar 2010 Posts: 19540

I still have a 'sneaky' feeling that something is wrong in what you are doing.

Make an error the way I show, and see if you still get the variable reply. This would 'prove' whether the code is actually working.

It could be something silly, like having traperror declared as a local variable in the interrupt code, and also as a global, so the interrupt code writes to its local copy, not the global, and then the print routine prints out the global version, so the value has nothing to do with the actual error!...
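The shadowing mistake described above is easy to reproduce on a PC. In this minimal plain C sketch, the function names are hypothetical and the 'handlers' are ordinary functions standing in for the interrupt code; only the name `traperror` comes from the thread.

```c
#include <stdint.h>

uint32_t traperror = 0;   /* global that the print routine reads */

/* BUG: the local declaration hides the global of the same name, so the
   recovered address is stored in a variable that vanishes on return. */
static void handler_with_shadow(uint32_t recovered)
{
    uint32_t traperror;
    traperror = recovered;
    (void)traperror;      /* silence unused-value warnings */
}

/* Fixed: no local declaration, so the assignment reaches the global. */
static void handler_fixed(uint32_t recovered)
{
    traperror = recovered;
}
```

After `handler_with_shadow(0x7C62)` the global still holds its old value, so whatever gets printed has nothing to do with the trap. Building with a shadowing warning enabled (for example `-Wshadow` on GCC) catches this on compilers that support it.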

Remco · Joined: 12 May 2015 Posts: 14

I wrote the code exactly as you suggested, so that a touch on a button executes your suggested trap code. When the button is pressed, I get the value through the UART that is connected to a PC. The value I get from your code is indeed the location at which it must go bad. The same code is still in place for the other trap, the one whose location I want to find (as written in my previous post).

With this I can prove the code works and that the value I get back is genuine. Otherwise your code would fail too. This is why I find it such an odd value.

Ttelmah · Joined: 11 Mar 2010 Posts: 19540

I'd suspect the fundamental fault is not actually an address error.

Suspect something like an extra 'return' being reached, which pops a data value off the stack, so results in the code going off into a 'random' address. Then the bytes that just happen to be 'seen' by the processor at this address, are some artifact of the chip, that results in an instruction being executed that gives the trap....

You really are going to have to narrow things down. Have a marker byte that is changed as the code reaches particular parts of the code. When the trap triggers see what is in the marker. Then put more 'tighter packed' markers between this one and the next.
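A marker byte of this kind can be as simple as the following sketch in plain C. Everything here is hypothetical naming (`marker`, `marker_at_trap`, `CHECKPOINT`, `process_packet`), and the 'trap' is simulated by an ordinary function call so the idea can be tried on a PC.

```c
#include <stdint.h>

static volatile uint8_t marker;          /* last checkpoint reached */
static volatile uint8_t marker_at_trap;  /* snapshot taken by the trap code */

/* One cheap store per checkpoint, so timing is barely affected. */
#define CHECKPOINT(n) (marker = (uint8_t)(n))

/* Stands in for the trap handler: freeze the marker for later inspection. */
static void trap_snapshot(void)
{
    marker_at_trap = marker;
}

/* Hypothetical work function with checkpoints between its stages. */
static void process_packet(int fail_in_stage_two)
{
    CHECKPOINT(1);
    /* ... stage one ... */
    CHECKPOINT(2);
    if (fail_in_stage_two) {   /* simulated fault */
        trap_snapshot();
        return;
    }
    /* ... stage two ... */
    CHECKPOINT(3);
}
```

If the snapshot reads 2, the fault lies between checkpoints 2 and 3, and the next round of markers goes into that region only.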

Alternatively, add a 'tick' interrupt (every mSec say), that again records the calling address the same way as the trap code. When the trap triggers look at the address this has recorded. You then know 'where' the code was, no more than 1mSec 'before' the error.

You do have the stack size expanded? Though this should give a stack error, not an address error, it is worth ruling this out.

younder · Joined: 24 Jan 2013 Posts: 53 Location: Brazil

Ttelmah · Joined: 11 Mar 2010 Posts: 19540

Putting a printf, into the interrupt is going to change far too many things to actually be useful. Currently, the compiler can't fit it in the segment where the interrupt handlers normally sit.

Declare trapaddr without initialisation.
Have the address trap routine trigger a processor reset, after loading trapaddr.
Stick code at the start of your main, to test 'restart_cause', and print the contents of trapaddr, if the cause is 'RESTART_SOFTWARE'.
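The flow of those three steps can be sketched on a PC as below. This is a simulation in plain C, not CCS code: the enum values and `last_cause` are stand-ins for whatever `restart_cause` reports on the target, the 'reset' is just a flag being set, and `reported` stands in for the print. It also assumes the startup code on the target leaves uninitialised RAM untouched, which is what makes the value survive the reset.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the restart causes reported on the target. */
enum { CAUSE_POWER_UP = 0, CAUSE_SOFTWARE = 1 };

static uint32_t trapaddr;                  /* on target: no initialiser, so
                                              startup does not overwrite it */
static int      last_cause = CAUSE_POWER_UP;
static uint32_t reported;                  /* stands in for the print */

/* Trap routine: store the recovered address, then reset. No printing here. */
static void address_trap(uint32_t recovered)
{
    trapaddr   = recovered;
    last_cause = CAUSE_SOFTWARE;           /* simulated software reset */
}

/* First thing in main: report trapaddr only after a software reset,
   because after power-up its contents are meaningless. */
static void startup_check(void)
{
    if (last_cause == CAUSE_SOFTWARE)
        reported = trapaddr;
}
```

The point of the design is that the slow work (printing) happens after the reset, in normal code, rather than inside the interrupt handler.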

256, is not large. If I'm doing a lot of things I need 384 or 512. The stack is used for variables on the PIC24/30, so gets a lot more in it....

younder · Joined: 24 Jan 2013 Posts: 53 Location: Brazil