Subscribers and publishers connection guarantees

asked 2020-05-15 13:16:49 -0500

azerila

updated 2020-05-15 14:31:40 -0500

I know there are already many questions about subscriber connection quality, but since my scenario is fairly specific I am posting a new question to make sure there isn't a particular detail we are not aware of, and to ask whether there is room for improvement in such a scenario.

We have made an application with two topics:

   topic1:
      with publisher "p1" inside a python3 script, and subscriber "s1" inside a C++ script.
   topic2:
      with publisher "p2" in the same C++ script, and subscriber "s2" in the same python3 script as above.

The callback of "s1" ends by publishing with "p2", and the callback of "s2" ends by publishing with "p1". All of them use queue=1.

This forms a loop: "p1" publishes, "s1" receives the message, does some work, and publishes with "p2" at the end of its callback. Then "s2" receives that message and publishes with "p1" again, so the loop continues.
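To illustrate what queue=1 implies for such a loop, here is a minimal sketch in plain Python. It is not the ROS transport: `SizeOneQueue` and `simulate` are hypothetical names, and the model assumes a size-1 queue keeps only the newest message. Under that assumption, a strict ping-pong never overwrites anything, while any additional publish on the same topic (for example from a restart thread) replaces a message that has not been processed yet.

```python
from collections import deque


class SizeOneQueue:
    """Hypothetical stand-in for a queue of size 1 (not the ROS transport)."""

    def __init__(self):
        self._buf = deque(maxlen=1)  # appending to a full deque evicts the old item
        self.dropped = 0

    def push(self, msg):
        if self._buf:
            self.dropped += 1  # an unprocessed message is about to be overwritten
        self._buf.append(msg)

    def pop(self):
        return self._buf.popleft() if self._buf else None


def simulate(steps, extra_publish=False):
    """Run the p1 -> s1 -> p2 -> s2 -> p1 loop and count overwritten messages."""
    topic1, topic2 = SizeOneQueue(), SizeOneQueue()
    topic1.push(0)  # initial publish with "p1"
    for _ in range(steps):
        if extra_publish:
            topic1.push(-1)       # a second thread publishes before "s1" has run
        state = topic1.pop()      # "s1" callback receives the state
        topic2.push(state + 1)    # ... and publishes the command with "p2"
        command = topic2.pop()    # "s2" callback receives the command
        topic1.push(command)      # ... and publishes the next state with "p1"
    return topic1.dropped + topic2.dropped
```

In a real system the callbacks run asynchronously and on several threads, so this only shows the queue semantics, not the actual timing.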

Basically, the C++ script is a ros_control hardware interface whose subscriber "s1" gets the robot state, lets the controller make an update, and then publishes the control commands with "p2". The C++ script uses an AsyncSpinner with 4 threads. The python3 script consists of one node and has in total 2 subscribers, 1 publisher, 3 service clients, 4 action servers, 1 action client, and possibly also 2 separate threads that are connected to a simulation.

1_Our first question is whether using python3 might cause any issues.

2_To what extent is it guaranteed that no message will be lost and that the two sides stay synchronized?

3_Sometimes I see "Inbound TCP/IP connection failed:" warnings. What could be the cause?

4_Would something other than an AsyncSpinner of 4 give better synchronization, such as a multi-threaded spinner or anything else?

5_I have noticed that at initialization, publishing once does not start the loop above; only after I publish multiple times does the loop chain start and then continue on its own. Also, the loop sometimes stops, and although I have an additional thread that tries to publish with "p1" again to restart the loop chain, it does not restart. Is there something I can do, perhaps with wait-for-connection methods?

What may cause the connection between a subscriber and a publisher to be lost? In addition, I haven't used rospy.spin(), since the script has its own while loop that does some work and never terminates.
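Regarding the wait-for-connection idea in question 5, a minimal sketch of such a helper could look like the following. The name `wait_for_subscribers` and its parameters are hypothetical; the only thing it assumes about `pub` is a `get_num_connections()` method, which `rospy.Publisher` provides. Whether a missing connection is really why the first publish is lost in this application is an assumption, not something confirmed here.

```python
import time


def wait_for_subscribers(pub, min_connections=1, timeout=5.0, poll=0.05):
    """Block until `pub` reports enough connections or `timeout` seconds pass.

    `pub` is assumed to expose get_num_connections(), as rospy.Publisher does.
    Returns True if the connections were seen in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while pub.get_num_connections() < min_connections:
        if time.monotonic() >= deadline:
            return False  # give up; the caller decides how to react
        time.sleep(poll)  # in a rospy node, rospy.sleep(poll) could be used instead
    return True


# Hypothetical usage inside the python3 node, before the first publish:
#     if wait_for_subscribers(p1):
#         p1.publish(first_msg)
```

The helper only checks that a connection exists at the moment of the call; it does not detect a connection that drops later.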

The main problem I have: my code works in general, but once in a while I see the velocity of my robot become too high for a very short time, and I'm not sure what is causing it.
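One way to check whether lost or late messages coincide with these velocity spikes is to attach a sequence number and a timestamp to every message and count gaps on the receiving side. The sketch below is plain Python with a hypothetical `GapDetector` class; it assumes the messages carry such fields (for example in a `std_msgs/Header`) and that `max_interval` is chosen by the user from the expected loop period. It is a diagnostic aid only and does not establish that message loss is the cause.

```python
class GapDetector:
    """Count skipped sequence numbers and late arrivals (hypothetical helper)."""

    def __init__(self, max_interval):
        self.max_interval = max_interval  # longest acceptable gap, in seconds
        self.last_seq = None
        self.last_stamp = None
        self.lost = 0   # number of sequence numbers never seen
        self.late = 0   # number of messages that arrived after too long a gap

    def update(self, seq, stamp):
        """Call once per received message, e.g. at the top of a callback."""
        if self.last_seq is not None:
            if seq > self.last_seq + 1:
                self.lost += seq - self.last_seq - 1
            if stamp - self.last_stamp > self.max_interval:
                self.late += 1
        self.last_seq, self.last_stamp = seq, stamp


# Example: messages 2 and 3 never arrive, and message 4 is late.
detector = GapDetector(max_interval=0.02)
for seq, stamp in [(0, 0.00), (1, 0.01), (4, 0.05), (5, 0.06)]:
    detector.update(seq, stamp)
```

Logging `detector.lost` and `detector.late` next to the commanded velocity would show whether the spikes line up with gaps in the message stream.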


Comments

The general expectation is that each post has one primary question and that it provides enough background material to give others a chance of answering. You've enumerated 5 questions and then added a "main" question about robot velocity at the end.

Answer 1 - There are many questions about Python already, some answered and some not.

Answer 2 - If you're having synchronization issues, I suggest you create a new question with code and other diagnostic info.

Answer 3 - Google "Inbound TCP/IP connection failed" or search ROS posts. It's been asked before.

Answer 4 - This is a legitimate question but may have been asked before. Have you searched?

Answer 5 - Start a new question and post code and diagnostic info. Also do a search, as there are many answered questions dealing with the timing of node start-up and publishing. Also consider that it sounds like your code does this: https://en.wikipedia.org/wiki/Deadlock

billy  ( 2020-05-16 15:10:14 -0500 )

@billy

  1. I didn't see any question similar to my problem setting.
  2. The code is too large, and any fraction of it wouldn't make sense on its own.
  3. Yes, but the other questions run it, for example, in a virtual machine or WSL, which is not related to my problem setting. I searched a lot about it and still didn't find anyone mentioning an origin for this problem.
  4. I have searched; however, I thought my application structure (the loop described above) is perhaps specific enough that an expert could tell what is right and wrong to use for it. Other questions have their own specific issues to solve, which I have checked.
  5. Starting new questions would prompt comments about asking multiple similar questions, and posting the code is not possible.
azerila  ( 2020-05-18 13:21:14 -0500 )