USB loses connection randomly (PIC18F USB CDC Virtual COM)

 
jameshdx80



Joined: 17 Mar 2017
Posts: 8

Posted: Fri Mar 17, 2017 1:01 pm

Hi,

I have a project that implements a Virtual COM port (USB CDC) with a PIC18F4550. The project reads data from a servo drive and sends it to the PC through the COM port. Everything was working fine for a month. Then I made a few changes to the firmware and the PIC18F started losing the USB connection. The PIC seems to disconnect and then reconnect (I usually hear the Windows sound for both events). The problem occurs randomly and may take up to 30 minutes to appear, so it has been very hard to debug with a logic analyzer or an in-circuit debugging tool.

I started reading the CCS USB documentation and CCS forum and I found this:

// Defining USB_ISR_POLLING will have USB library not use ISRs. Instead you
// must periodically call usb_task().

Since I am not defining USB_ISR_POLLING, I thought I would not need to call usb_task(), but then I read this in a post ( https://www.ccsinfo.com/forum/viewtopic.php?t=55350 ):

Ttelmah wrote:
It's fractionally more complex than they make clear....

If you #define USB_ISR_POLLING, it removes the definitions to say that the interrupt handler code is actually an ISR. The code is instead called from usb_task, and this must then happen at a high rate. The faster than 1mSec rate is needed if your device uses 'interrupt transfers', since these _require_ replies within 1mSec. Even on other devices the call must be very frequent.

Without USB_ISR_POLLING defined, you still need to call usb_task at a reasonable interval (I'd say about 0.25 second or less). Usb_task no longer handles the actual USB transactions, but does handle some 'housekeeping', if the bus is connected, released etc.. Hence the problems you have if it is not called.


My questions are:
1) I only call usb_task() once, at initialization, and it was working perfectly. How is this possible?
2) Could not calling usb_task() frequently eventually cause the USB to disconnect and reconnect?

Thanks in advance.
JamesHDX
Ttelmah



Joined: 11 Mar 2010
Posts: 19576

Posted: Fri Mar 17, 2017 1:26 pm

There are lots of 'extras' handled by usb_task.

For instance, if you connect to a USB3 port, this will disconnect and reconnect on boot. Task is what can handle this, so has to be called multiple times, or the system will not connect properly.

USB can run on a standard port almost indefinitely without calling task, but if Windows triggers a selective suspend, again task needs to be called to re-connect correctly. So WinXP will run on a USB2 port without task, but Win Vista SP2 and later generally won't....

Is it possible that there has been a Windows driver update?
Can you go back to the old code?

If you can test the old code and it works, then your changes have caused the problem. However if it too has problems, then a Windows change is now showing the limitations of your code....

Things that could cause problems:
Interrupts being disabled for any significant time.
Array overflows.
Task, in the circumstances outlined.
Data not being read, so the USB buffer overflows (will stall the USB).
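
As a rough illustration of the last two points (calling task regularly and keeping the receive buffer drained), the main loop might be shaped like the sketch below. It is plain C that compiles on a PC, not CCS firmware: usb_task(), usb_cdc_kbhit() and usb_cdc_getc() here are stand-in stubs that only mimic the call shape of the CCS driver functions of the same names, and pc_sends(), handle_byte() and main_loop_pass() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ---- Host-side stand-ins (assumptions, for illustration only) ----
   On the PIC these three come from the CCS USB CDC driver. The stubs
   below only imitate how they are called. */
static unsigned    task_calls = 0;
static const char *rx_pending = "";        /* bytes "sent by the PC" */

void usb_task(void)      { task_calls++; }
bool usb_cdc_kbhit(void) { return *rx_pending != '\0'; }
char usb_cdc_getc(void)  { return *rx_pending++; }

/* Hypothetical test helper: queue some bytes as if the PC had sent them. */
void pc_sends(const char *bytes)
{
    assert(bytes != NULL);
    rx_pending = bytes;
}

/* Hypothetical application hook: consume one received byte. */
static size_t bytes_handled = 0;
void handle_byte(char c) { (void)c; bytes_handled++; }

/* One pass of the main loop: housekeeping first, then empty the receive
   buffer so unread data cannot pile up and stall the endpoint. */
void main_loop_pass(void)
{
    usb_task();
    while (usb_cdc_kbhit())
        handle_byte(usb_cdc_getc());
}
```

On the PIC, the body of main_loop_pass() would sit inside the usual endless loop in main(), with the stubs removed and the real driver included. How often the loop comes round is up to the rest of the firmware.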
asmboy



Joined: 20 Nov 2007
Posts: 2128
Location: albany ny

Posted: Fri Mar 17, 2017 6:06 pm

I know your pain and have been there more than once.

I think i learned about this from PCM_programmer
and many times when comparing a program version that was working
with a revision that was not -
it has been a huge time saver.

http://www.prestosoft.com/edp_examdiff.asp
Ttelmah



Joined: 11 Mar 2010
Posts: 19576

Posted: Sat Mar 18, 2017 2:34 am

This is where keeping a 'trail' is vital.

Any time I have code that is running, the source directory is copied and renamed. So 'IOboard25112thmar2017' is the code for a current board I'm working on, saved on 12th March 2017. The board has an internal company 'code number' of 251.
When developing, if I make a major change, another copy is made.
In the code itself, as part of the comments, I also record what compiler version is being used.
Each compile auto-increments a version number.
These are all archived (together with the compiler).

At any point in the future, I can go back to the version that was being used on a particular date.

ExamDiff is a really useful tool if you change compiler version and then the code doesn't work. It lets you quickly find what the compiler has changed, and either alter your code or switch back to the previous compiler (and add a note to the source, saying what changed....).

Techniques like this become more and more important, the larger the code, and the more people are involved in the code, or things connecting to the code.

The classic 'waste of time', is where you make a change, and something goes wrong. You don't have the old code to test, so 'assume' it was your change, not realising that something else changed at the same moment....
Days can be wasted chasing this type of problem, while if you can go 'back', you can find in a few minutes that the problem doesn't go away, and then start looking for what else changed.
jameshdx80



Joined: 17 Mar 2017
Posts: 8

Posted: Mon Mar 20, 2017 10:36 am

I would like to thank Ttelmah and asmboy for their help. I probably did not make it clear how I recover my previous source code versions. Fortunately, I use version control software (SVN) and WinMerge, so I could keep track of all my changes since the bug started to happen. I really recommend that everyone follow Ttelmah's advice on backup versions or version control software, and asmboy's advice on ExamDiff (or the similar WinMerge).

As for my problem, I checked the previous versions and noticed that I had moved usb_task() from an interrupt handler (Timer0) to the firmware initialization, which runs only once. So I believe that, since I am no longer calling usb_task() to do something like a clean-up, the USB loses connection after 20 to 30 minutes. I will test and report here later.

Ttelmah wrote:

Things that could cause problems:
Interrupts being disabled for any significant time.
Array overflows.
Task, in the circumstances outlined.
Data not being read, so the USB buffer overflows (will stall the USB).


I checked the interrupts and array overflows, and they seem to be OK. But "Data not being read" caught my attention. This really may happen, since the PC may be faster than the PIC. My new questions are:

1. Is there a way to flush the reception or transmission buffers?
2. Is there any problem with calling usb_task() inside an interrupt handler?

JamesHDX
Ttelmah



Joined: 11 Mar 2010
Posts: 19576

Posted: Mon Mar 20, 2017 11:53 am

usb_cdc_get_discard();
jameshdx80



Joined: 17 Mar 2017
Posts: 8

Posted: Tue Mar 21, 2017 2:46 pm

Ttelmah wrote:
usb_cdc_get_discard();

Thanks!

After spending a couple of hours, I noticed that the problem is not that the USB disconnects but rather that the PIC resets... :( which seems worse to me.

Now I have an unending list of things to check:
- Hardware
MCLR PIN, WDT configuration, capacitors, pull up resistors, and so on

- Software
interrupt handlers, USB buffers, memory leaks...

At the moment I am trying to put a breakpoint on main() so that I can check the RCON value after a reset.
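
For reading that RCON value, a decoder along the following lines may help. It is a minimal sketch in plain C rather than CCS firmware, assuming the PIC18F4550 RCON layout (active-low /RI, /TO, /PD, /POR and /BOR flags in bits 4 to 0) and assuming the firmware sets /RI, /POR and /BOR back to 1 at startup so that the next reset can be told apart. The enum and function names are invented for the example. CCS also provides a restart_cause() built-in, whose return constants are listed in the device header.

```c
#include <assert.h>
#include <stdint.h>

/* RCON flag masks, PIC18F4550 layout assumed. All flags are active low. */
#define RCON_RI   0x10u   /* 0 = a RESET instruction was executed */
#define RCON_TO   0x08u   /* 0 = watchdog timer timed out         */
#define RCON_PD   0x04u   /* 0 = SLEEP was executed (not decoded) */
#define RCON_POR  0x02u   /* 0 = power-on reset occurred          */
#define RCON_BOR  0x01u   /* 0 = brown-out reset occurred         */

typedef enum {
    RESET_POWER_ON,
    RESET_BROWN_OUT,
    RESET_WATCHDOG,
    RESET_INSTRUCTION,
    RESET_MCLR_OR_STACK,   /* no flag cleared: MCLR or a stack reset */
    RESET_CAUSE_COUNT
} reset_cause_t;

/* Decode a copy of RCON taken first thing after reset. */
reset_cause_t decode_rcon(uint8_t rcon)
{
    if (!(rcon & RCON_POR)) return RESET_POWER_ON;
    if (!(rcon & RCON_BOR)) return RESET_BROWN_OUT;
    if (!(rcon & RCON_TO))  return RESET_WATCHDOG;
    if (!(rcon & RCON_RI))  return RESET_INSTRUCTION;
    return RESET_MCLR_OR_STACK;
}

/* Human-readable name, e.g. for sending over the COM port. */
const char *reset_cause_name(reset_cause_t cause)
{
    static const char *const names[RESET_CAUSE_COUNT] = {
        "power-on", "brown-out", "watchdog timeout",
        "RESET instruction", "MCLR or stack over/underflow"
    };
    assert(cause < RESET_CAUSE_COUNT);
    return names[cause];
}
```

If none of the flags is cleared, the STKFUL and STKUNF bits in STKPTR are worth a look before blaming the MCLR pin.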
asmboy



Joined: 20 Nov 2007
Posts: 2128
Location: albany ny

Posted: Tue Mar 21, 2017 3:13 pm

I have NEVER used PIC VCOM features.
they are too program memory intensive
and use too much PIC "instruction resource"
for the sort of work i do.

I have ALWAYS used FTDI FT232 chips and
the PIC serial TTL pins for my VCOM functions
and never had a problem.

PIC USB support is a snake pit and i avoid it at all costs.

There are many inexpensive adapters with an FTDI chip
and the few outboard parts it needs - widely available and cheap.
jameshdx80



Joined: 17 Mar 2017
Posts: 8

Posted: Tue Mar 21, 2017 3:29 pm

asmboy wrote:
I have NEVER used PIC VCOM features.
they are too program memory intensive
and use too much PIC "instruction resource"
for the sort of work i do.

I have ALWAYS used FTDI FT232 chips and
the PIC serial TTL pins for my VCOM functions
and never had a problem.

PIC USB support is a snake pit and i avoid it at all costs.

there are many inexpensive adapters with an FTDI chip
and the few outboard parts it needs - widely available and cheap.


I could not agree with you more. I have used an FT232RL + HCS12 and it has been working for 8 years without a problem.
Ttelmah



Joined: 11 Mar 2010
Posts: 19576

Posted: Wed Mar 22, 2017 2:03 am

The full reset though would be a problem however you implement the device...

First thing to check is what does the listing show for stack use?
The compiler should warn if the stack is potentially overflowing, but worth checking.
Are there any warnings being displayed by the compile? There will normally be one on a couple of the usb functions (interrupts disabled to prevent re-entrancy), which is because CCS call a couple of register setting routines both inside an interrupt and outside, but these are small and the 'disabled' bits are only called during setup. Any others?
What did the code you change do? Switching power, operating other devices? Things like this can reset chips.
jameshdx80



Joined: 17 Mar 2017
Posts: 8

Posted: Wed Mar 22, 2017 12:54 pm

Ttelmah wrote:
The full reset though would be a problem however you implement the device...

First thing to check is what does the listing show for stack use?
The compiler should warn if the stack is potentially overflowing, but worth checking.
Are there any warnings being displayed by the compile? There will normally be one on a couple of the usb functions (interrupts disabled to prevent re-entrancy), which is because CCS call a couple of register setting routines both inside an interrupt and outside, but these are small and the 'disabled' bits are only called during setup. Any others?
What did the code you change do? Switching power, operating other devices? Things like this can reset chips.


I am impressed by your post. After reading it carefully, I gathered some information and did some tests. Here are my conclusions:

1. Stack
This is from the listings.

ROM used: 9226 bytes (29%)
Largest free fragment is 22710
RAM used: 548 (27%) at main() level
610 (30%) worst case
Stack: 17 worst case (8 in main + 9 for interrupts)

To me, it seems OK since the previous versions had almost the same data.

2. Compiler Warnings

>>> Warning 216 "mainFile.c" Line 441(181,182): Interrupts disabled during call to prevent re-entrancy: (usb_token_reset)
>>> Warning 216 "mainFile.c" Line 441(181,182): Interrupts disabled during call to prevent re-entrancy: (usb_cdc_get_discard)
>>> Warning 216 "mainFile.c" Line 441(181,182): Interrupts disabled during call to prevent re-entrancy: (usb_tbe)
>>> Warning 216 "mainFile.c" Line 441(181,182): Interrupts disabled during call to prevent re-entrancy: (usb_cdc_flush_out_buffer)

These warnings have always appeared, but maybe the error has something to do with them. I googled it, and many people had code with delay instructions inside interrupts that caused these messages to appear. That is not my case.


3. Interrupts (!)
I think this may be the problem. I have the status of three firmware versions:
Firmware Version 1.0: PIC never resets (I use INT_EXT, USB, TIMER2 - every 100 us)
Firmware Version 1.1: PIC resets rarely, I would say once (I use INT_EXT, USB, TIMER2 - every 100 us, TIMER0 - every 500 us)
Firmware Version 1.2: PIC resets every 4 minutes (I use INT_EXT, USB, TIMER2 - every 100 us, TIMER0 - every 500 us and TIMER3 - every 500 us)

Well, now it seems clear to me that the timers may interfere with USB communication. I thought that USB had priority and would disable all other interrupts, so I did not care too much about using too many interrupts, although I was careful not to take more than 1 to 10 us in each interrupt. Besides, I found this post:
http://www.microchip.com/forums/m603499.aspx - Problem with 18F14K50; maybe timer interrupts disturb USB communication?!

I will try to rewrite my firmware by not using TIMER3 and test again.

If anyone has any hints about too many timer interrupts + USB = resets, please let me know.
Ttelmah



Joined: 11 Mar 2010
Posts: 19576

Posted: Wed Mar 22, 2017 3:50 pm

As I said there are a few re-entrancy warnings from the standard code. These are not a problem. Basically the extra calls (with the interrupts disabled), are all before the USB is actually started.

I have to say, 'why on earth' have three timer interrupts?

If you have Timer2 interrupting every 100uSec, then just add this:
Code:

//in the 100uSec timer interrupt handler
   static int8 tick=4;   //counts 4,3,2,1,0 so the else branch runs every 5th call

   if (tick)
      --tick;
   else
   {
      tick=4;
      //then do your 500uSec routine
   }


You have to understand just how much overhead is involved in handling an interrupt. A counter like this involves just two instructions added to the 100uSec interrupt. Adding another interrupt is perhaps 50+ instructions for each interrupt, and while you are in these, the other interrupts can't be handled. Ugh....
You certainly should never want two interrupts at the same interval. That is insane...
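
To check that the countdown above really fires once per 500uSec, it can be run on a PC. The following is a sketch in plain C, not CCS firmware: timer_100us_isr() is a stand-in for the real interrupt handler, and slow_runs and run_ticks() are hypothetical names added only to count and drive the calls.

```c
#include <assert.h>
#include <stdint.h>

static uint8_t  tick = 4;        /* same start value as the snippet above */
static uint32_t slow_runs = 0;   /* times the 500 us branch has executed  */

/* Stand-in for the body of the 100 us timer interrupt handler. */
void timer_100us_isr(void)
{
    if (tick)
        --tick;
    else
    {
        tick = 4;
        slow_runs++;             /* the 500 us routine would go here */
    }
}

/* Call the handler n times, as the hardware timer would, and return
   how many times the slow branch has run in total so far. */
uint32_t run_ticks(uint32_t n)
{
    assert(n <= 1000000u);       /* keep the host-side loop short */
    for (uint32_t i = 0; i < n; i++)
        timer_100us_isr();
    return slow_runs;
}
```

The slow branch runs on every fifth call, so one second of 100uSec ticks (10000 calls) gives 2000 runs of the 500uSec routine, the same rate a separate 500uSec timer would give.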

How fast are you actually clocking the chip?
The easy way to reduce how long it takes to handle an interrupt, is to increase the clock rate. Show your setups.
You need a minimum clock rate of about 5MHz, with no other interrupts being called, just to handle the USB housekeeping. With interrupts being called, this cascades.
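
To put rough numbers on the overhead, here is a back-of-envelope sketch in plain C. It assumes the 48MHz clock mentioned later in the thread, the PIC18 instruction rate of Fosc/4, and the '50+ instructions' figure above taken as exactly 50 cycles of entry/exit cost per interrupt. The handler bodies and the USB interrupt itself are not counted, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction cycles per second consumed by one periodic interrupt,
   given its period and an assumed cost per call. */
uint32_t isr_cycles_per_second(uint32_t period_us, uint32_t cycles_per_call)
{
    assert(period_us > 0 && period_us <= 1000000u);
    return (1000000u / period_us) * cycles_per_call;
}

/* Load in parts per thousand of the instruction rate (Fosc / 4). */
uint32_t load_per_mille(uint32_t cycles_per_second, uint32_t fosc_hz)
{
    uint32_t instr_per_second = fosc_hz / 4u;
    assert(instr_per_second > 0);
    return (uint32_t)(((uint64_t)cycles_per_second * 1000u) / instr_per_second);
}
```

With those assumed figures, one 100uSec timer plus two 500uSec timers cost 700000 cycles per second, about 5.8% of the 12 million available, against about 4.2% for the single 100uSec timer with a counter. This only estimates average load. It says nothing about worst-case latency when several interrupts fall due together.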
jameshdx80



Joined: 17 Mar 2017
Posts: 8

Posted: Tue Mar 28, 2017 8:14 am

Thanks Ttelmah! I have reimplemented and tested the new firmware without using Timer3, and the problem seems to be solved, although I still call usb_task() only once at initialization. The idea of having three interrupts was to emulate different tasks. I have done this on other microcontrollers and never had problems, but on the PIC18F4550 with a 48MHz clock this idea did not work.

Thanks again to Ttelmah, and to asmboy for his suggestions.

Regards,
JamesHDX
Ttelmah



Joined: 11 Mar 2010
Posts: 19576

Posted: Tue Mar 28, 2017 8:59 am

You'll get a problem if you connect to a USB3 port, or in some Windows 'sleep' states, unless task is occasionally called....
All times are GMT - 6 Hours