Resolving frequent disconnects & reconnects

Hi all,
I’ve got my project working with Blynk after a fair amount of effort. However, I’m currently seeing frequent disconnects and reconnects, and I’m not sure how best to resolve them.

I built some Blynk-only test code, which I discussed here, and it runs without this particular problem. But I seem to be hitting some transient condition once Blynk is introduced into the rest of my main project code.

It’s quite long (500+ lines), so I’ll spare you all the gory details, but in summary it does the following:

//#define BLYNK_DEBUG            // Optional, this enables lots of prints
#define BLYNK_PRINT Serial
#define BLYNK_HEARTBEAT 30       // Default is 10s
#define BLYNK_TIMEOUT_MS 10000UL // ms unsigned long

#include <ESP8266_Lib.h>
#include <BlynkSimpleStream.h>  // Blynk libraries
#include <WiFiEsp.h>            // Include support for ESP8266-01 WiFi module

....

const int updateBlynkInterval = (15 * 1000);       // x second interval at which to update Blynk
unsigned long lastBlynkConnectionTime = 0;        // Variable Setup
....

void setup() {

....

  connectWifi();                // Connect to Wifi using WiFiEsp.h
  printWifiData();              // Print Wifi IP/MAC info
  printCurrentNet();            // Print Wifi Signal details info
  delay(1000);

  connectBlynk();               // Init Blynk connection
  Blynk.begin(client, auth);    // Init Blynk
  delay(10);
  Serial.println(F("Blynk connected!"));
  .....
}


void loop() {                                     //  Main execution loop

    if (millis() - runtime_1 >= 3000UL) {         //  
      getLocalTempHumid();                        //  Get local sensor data
      displayLocalTempHumid();                    //  Display local data on screen
      runtime_1 = millis();      
      if(test) { Serial.println(runtime_1);}      //  Scheduler debug
      }
    
    if (millis() - runtime_2 >= 5000UL) {
      displayRemoteTempHumid();                   //  Display remote data on screen
      runtime_2 = millis();  
      if(test) { Serial.println(runtime_2);}      //  Scheduler debug
      }
      
    rf_Receive();                                 //  Receive remote station data over RF link
      
    if (millis() - runtime_3 >= 7000UL) {
      displayClock();                             //  Display clock data on screen
      runtime_3 = millis();      
      if(test) { Serial.println(runtime_3);}      //  Scheduler debug
      }
        
    Blynk.run();   
    updateBlynk();                                // Sends data to Blynk cloud every updateBlynkInterval period
    updateThingSpeak();                           // Upload data to ThingSpeak

    if (status != WL_CONNECTED) {                 // Reconnect WiFi if disconnected
      connectWifi();
      return;
    }
                              
}  
..... <lots of code here for functions above>

bool connectBlynk() {                         // This function tries to connect to the cloud using TCP
  if(test) { Serial.println("connectBlynk()");}
  client.stop();                              // Don't use client.flush() here; it is likely to cause problems
  return client.connect(BLYNK_DEFAULT_DOMAIN, BLYNK_DEFAULT_PORT);
}

void updateBlynk() {                        // Sends data to Blynk cloud every updateBlynkInterval period
    if(millis() - lastBlynkConnectionTime > updateBlynkInterval) {
      if(!Blynk.connected()) {
        connectBlynk();
        Blynk.connect();
        }     
      Blynk.virtualWrite(V0, millis() / 1000);
      Blynk.virtualWrite(V1, localtemp);
      Blynk.virtualWrite(V2, localhumid);
      Blynk.virtualWrite(V3, bmeData.temp);
      Blynk.virtualWrite(V4, bmeData.hum);
      Blynk.virtualWrite(V5, bmeData.pres);
      Blynk.virtualWrite(V6, bmeData.volts);
      Blynk.virtualWrite(V7, bmeData.pkt_counter);
      lastBlynkConnectionTime = millis();
      if(test) { Serial.println("Data sent to Blynk");}
      }
}

The debug log shows the following:

17:30:50.892 -> [WiFiEsp] Connecting to blynk-cloud.com
17:30:51.031 -> [107843] 
17:30:51.066 ->     ___  __          __
17:30:51.099 ->    / _ )/ /_ _____  / /__
17:30:51.099 ->   / _  / / // / _ \/  '_/
17:30:51.133 ->  /____/_/\_, /_//_/_/\_\
17:30:51.171 ->         /___/ v0.6.1 on Arduino
17:30:51.206 -> 
17:30:51.206 -> [107934] Connecting...
17:30:51.490 -> [108276] Ready (ping: 38ms).
17:30:51.923 -> Blynk connected!
17:30:53.928 -> Run..
17:31:54.512 -> [171181] Heartbeat timeout
17:31:54.584 -> [171261] Connecting...
17:32:04.939 -> [181620] Login timeout
17:32:05.048 -> [181701] Connecting...
17:32:06.063 -> [WiFiEsp] Disconnecting  3
17:32:06.098 -> [WiFiEsp] Connecting to blynk-cloud.com
17:32:10.486 -> [187162] Connecting...
17:32:10.804 -> [187474] Ready (ping: 38ms).
17:33:11.626 -> [248195] Heartbeat timeout
17:33:11.728 -> [248276] Connecting...
17:33:22.020 -> [258548] Login timeout
17:33:22.093 -> [258630] Connecting...
17:33:22.447 -> [WiFiEsp] Disconnecting  3
17:33:22.485 -> [WiFiEsp] Connecting to blynk-cloud.com
17:33:27.425 -> [263990] Connecting...
17:33:27.746 -> [264303] Ready (ping: 38ms).
17:34:28.415 -> [324873] Heartbeat timeout

As you can see, I’ve increased the heartbeat and timeout values from their defaults, but I don’t think this is the right way to go, and it will only get me so far. I have a very fast 100Mbps fiber link, so I don’t think it’s link latency, and none of the code blocks for long enough to reach the Blynk 10s timeout. The disconnect happens about every minute or so, but the data does get uploaded correctly. I’m also uploading the same data to ThingSpeak, which works fine.

My understanding is that Blynk.run() in the main loop should manage the connection to blynk-cloud.com once established, but I’m clearly missing something here. What would cause the heartbeat timeout to be reached? Any thoughts?
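
To put numbers on "about every minute", the bracketed millisecond stamps in the log can be subtracted directly. Below is a minimal sketch in portable C++ (not Arduino-specific). The struct and function names are made up for illustration, and the threshold formula is an assumption about how the library decides a heartbeat has timed out, not something confirmed in this thread.

```cpp
#include <cstdint>

// Millisecond stamps copied from the bracketed values in the log above.
struct Session {
    uint32_t readyMs;    // "[...] Ready (ping: 38ms)."
    uint32_t timeoutMs;  // the next "[...] Heartbeat timeout"
};

constexpr Session kSessions[] = {
    {108276u, 171181u},
    {187474u, 248195u},
    {264303u, 324873u},
};

// How long a session lasted before the heartbeat timeout was logged.
constexpr uint32_t sessionLengthMs(const Session& s) {
    return s.timeoutMs - s.readyMs;
}

// ASSUMPTION (not from the post): the library gives up after one heartbeat
// period plus three timeout periods with nothing received from the server.
constexpr uint32_t assumedThresholdMs(uint32_t heartbeatSec, uint32_t timeoutMs) {
    return heartbeatSec * 1000u + 3u * timeoutMs;
}
```

The three sessions come out at 62905 ms, 60721 ms and 60570 ms. With BLYNK_HEARTBEAT 30 and BLYNK_TIMEOUT_MS 10000 the assumed threshold would be 60000 ms, so, if that assumption holds, each session lasts about as long as the library is willing to wait and no longer.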

Any help much appreciated, thanks.

It’s very difficult to understand what is happening when you share only part of your code and leave out key pieces of information.
I assume this is still running on an Uno?
There’s no SoftwareSerial port usage in your code. Is that an intentional omission from your snipped code, or is something else going on? I can’t tell from the data in front of me.

We have to take your word for that, but calling routines like:

rf_Receive();

every time your void loop executes isn’t good Blynk practice, and without seeing what’s in that function it’s difficult to know if it’s an issue. It also raises the question of how this RF data is getting to your [? Uno ?] board and whether it’s causing a serial conflict with whichever port is being used to communicate with the ESP-01.

Your first step needs to be to move as much of this stuff out of the void loop as possible and use BlynkTimer rather than the if(millis.... stuff that you’re doing at the moment.
Take a read of this:
http://help.blynk.cc/getting-started-library-auth-token-code-examples/blynk-basics/keep-your-void-loop-clean
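
For readers who have not used a timer object before, the idea is that each periodic job is registered once and a single run() call in the loop fires whichever jobs are due. The class below is a hypothetical stand-in written in portable C++ purely to show the pattern; it is not BlynkTimer's implementation, and the name IntervalTimer and its run(nowMs) signature are assumptions made for this sketch.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a BlynkTimer-style scheduler (illustration only).
class IntervalTimer {
public:
    // Register a callback to fire every intervalMs milliseconds.
    void setInterval(uint32_t intervalMs, std::function<void()> callback) {
        tasks_.push_back({intervalMs, 0, std::move(callback)});
    }

    // Call once per loop pass with the current millis() value.
    void run(uint32_t nowMs) {
        for (auto& task : tasks_) {
            // Unsigned subtraction stays correct when millis() wraps around.
            if (nowMs - task.lastRunMs >= task.intervalMs) {
                task.lastRunMs = nowMs;
                task.callback();
            }
        }
    }

private:
    struct Task {
        uint32_t intervalMs;
        uint32_t lastRunMs;
        std::function<void()> callback;
    };
    std::vector<Task> tasks_;
};
```

In an actual sketch the equivalent would be a global `BlynkTimer timer;`, a call such as `timer.setInterval(3000L, getLocalTempHumid);` in setup(), and `timer.run();` next to `Blynk.run();` in the void loop, which leaves the loop with just those two calls.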

Pete.

Hi Pete,

Thanks again for your response here. :grinning:

Thinking about it some more, I think you’re right. I’ve just double-checked that I don’t use delay() anywhere outside of setup(), and I have already moved over to the millis()-based scheduler technique, so I suspect something else must be eating up my loop() cycle time. I’ll try to debug some more. As a general best practice, how often does Blynk.run() need to be called? Would once every second be sufficient to avoid heartbeat timeouts?

Below is rf_Receive() for reference. It should be non-blocking if I understand correctly; it uses the RadioHead library here. It receives data asynchronously across a 433MHz RF link from my remote sensor.
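
One way to check whether something is eating loop time is to record the longest gap between successive loop passes and print it occasionally. The helper below is a hypothetical sketch in portable C++; the class name and methods are invented for illustration, and on the board the value passed in would be millis().

```cpp
#include <cstdint>

// Hypothetical helper: tracks the longest gap between successive loop passes.
class LoopGapTracker {
public:
    // Call at the top of loop() with the current millis() value.
    void tick(uint32_t nowMs) {
        if (started_) {
            const uint32_t gap = nowMs - lastMs_;  // wrap-safe unsigned subtraction
            if (gap > worstMs_) {
                worstMs_ = gap;
            }
        }
        lastMs_ = nowMs;
        started_ = true;
    }

    // Longest gap seen so far, in milliseconds.
    uint32_t worstGapMs() const { return worstMs_; }

private:
    uint32_t lastMs_ = 0;
    uint32_t worstMs_ = 0;
    bool started_ = false;
};
```

If the worst gap turns out to be in the order of seconds, whatever ran during that pass is the place to look; if it stays at a few milliseconds, the loop itself is probably not what starves Blynk.run().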

void rf_Receive() {
  //if (test) {
  //    Serial.println(F("rf_Receive Exec"));
  //  }
    digitalWrite(LED_BUILTIN, LOW);              // RF Receive and flash LED
    uint8_t rdata_buf[RH_ASK_MAX_MESSAGE_LEN];    // Set expected message size (23) or RH_ASK_MAX_MESSAGE_LEN which seems to work as well
    uint8_t buflen = sizeof(rdata_buf);           // In: buffer capacity; recv() overwrites it with the received length
    
    if (rf_driver.recv(rdata_buf, &buflen))       // Non-blocking
    {
      int i;                                      // Message received with valid checksum
      digitalWrite(LED_BUILTIN, HIGH);            // RF Receive and flash LED
      if(test) {
        Serial.println(F("RF Receiving..."));
      }
  
//    rf_driver.printBuffer("Got:", rdata_buf, buflen); // Message with a good checksum received, dump it.
      memcpy(&bmeData, rdata_buf, sizeof(bmeData));
      digitalWrite(LED_BUILTIN, LOW);
        
    if(test) {
      printValues();        // Print debugging if serial connected and test is turned true
 
      }
    } 
}

I’m also no longer using SoftwareSerial; I’m aware that library uses delay() internally. I ran out of SRAM on the Uno/Nano long ago, so the final project now runs on an Atmel ATmega1284P (more details on that here if interested). I use an Uno/Nano to test various components before adding them to the main project code. The ATmega1284P has a second hardware serial port, which is used to communicate with the ESP8266-01.

However, I still have the serial link set to 9600 baud for backwards compatibility with the Uno/SoftwareSerial setup, and I can see the ESP8266 WiFi is in near-constant transmit mode (the blue light is on 90% of the time), even though the data volume being sent should be low. It may be that WiFiEsp is keeping the connections active too long because of the low serial speed. As I said, I’ll debug some more. Many thanks for your tips. :grinning:

Cheers

Hey Pete,

Bumping up the serial link speed to 115K baud to the ESP8266-01 solved the problem! :grin:
WiFiEsp was likely the culprit, waiting to transmit the data over the slow serial link. Since the change, it’s been solid for the past 10-15 mins, and I’ve fallen back to the lower default Blynk heartbeat and timeout values to boot. For reference, the Blynk ping times reported went down from 38ms to 8ms.
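
A rough back-of-the-envelope calculation shows why the baud rate matters so much here. The sketch below is portable C++ and assumes 8N1 framing (10 bits on the wire per byte) and an exact rate of 115200 for "115K"; the function names are made up, and it ignores the AT-command overhead that WiFiEsp adds around every payload.

```cpp
#include <cstdint>

// ASSUMPTION: 8N1 framing, i.e. 1 start bit + 8 data bits + 1 stop bit per byte.
constexpr uint32_t kBitsPerByte = 10;

// Raw UART throughput in bytes per second at a given baud rate.
constexpr uint32_t bytesPerSecond(uint32_t baud) {
    return baud / kBitsPerByte;
}

// Time to clock `bytes` over the UART, rounded up to whole milliseconds.
constexpr uint32_t transferMs(uint32_t bytes, uint32_t baud) {
    return (bytes * kBitsPerByte * 1000u + baud - 1u) / baud;
}
```

At 9600 baud the link carries at most 960 bytes per second against 11520 at 115200, so an illustrative 100-byte exchange takes about 105 ms instead of about 9 ms before any WiFi or network time is counted.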

Cheers
