Is there a problem with the Blynk v2.0 server?

I don’t understand your comment & questions in relation to the link I provided. It makes no mention of Blynk Edgent and if you’re planning on going down the Edgent route then that’s a whole different conversation.

Pete.

I’m sorry for any confusion Pete. It is in the URL as …blynk.edgent-firmware… etc.
But you have answered my question. Thank you.
I’ll go away and work out a possible fix now and seek your comments later, if I may.
Dave

This sketch might help point you in the right direction…

It doesn’t include an ESP.reset() after multiple failed connection attempts, but it should at least give you a starting point.

Pete.

Thanks.

Had to revisit this as my system has become completely unreliable.
History:

  1. Worked for several years as a legacy system with no dropouts.
  2. Prior to the ending of the legacy system, started to go offline more and more often, requiring a physical reboot to re-connect. I assumed this might be a nudge from the legacy servers to push me into V2.
  3. Converted code to V2 and all good for many months.
  4. Dropouts again.
  5. Added timeout event code as per Peter’s excellent suggestion above.
  6. No improvement.
  7. Went for the ‘nuclear option’! Added an external timeclock that killed ESP power for 5 minutes at midnight.
  8. Good for some months, then it went into a bistable mode where every other boot-up resulted in no connection. So good one day, but off the next.
  9. Added an external physical watchdog. This would pulse the RST line of the ESP if a 1 second toggling output in the timer subroutine didn’t occur. This can’t fail, I thought - it will keep on resetting until the void loop is executing, which only happens when we have a successful connection. And so it did until 12 hours later, and it’s offline again.

This remote monitoring station is a 100 mile round trip and basically it doesn’t work anymore. It did.
All suggestions gratefully received.

This sounds like a router (or possibly ISP) issue to me.
Have you tried adding a system to reboot your router regularly?

There’s also a possibility that it’s a DNS issue, with the wrong server being returned by the DNS lookup. This would most likely give an “Invalid Auth token” message in your serial monitor, but I guess you’re not monitoring that.
Bad DNS lookup is more common with GSM modems, due to the nature of the beast, but it can happen with other systems too. I think the Blynk servers try to mitigate this in a way, but I guess that’s not always possible.
The solution to this is probably to include the subdomain for your regional server in the Blynk.begin() command.

Experimenting with SSL or other ports may also be worthwhile.

If it’s a 100 mile round trip then I’d probably have multiple devices at the remote location, each using a different strategy to see which works best.

Pete.

Pete, many thanks for your kind help.
The router is a SIM based 3G/4G device that independently uploads a couple of CCTV cameras, so I can confirm that it is still active. (Interestingly, I have forced it to 3G because in auto mode it would always favour the 4G connection. I discovered that whilst the 4G gave much better download speeds, its upload rate was much worse than the 3G!)
I can understand that it might fail to connect on occasions, but I would have thought the continual reboot would eventually succeed.
When I have been on-site when connection has failed, the serial debug output confirms router connection, but no response. I’ll try to put together some output text. I can’t remember how to include code/text.

That doesn’t mean much in my experience.
Just because the device has established a WiFi connection to the router and obtained an IP address (assuming that is happening) it doesn’t mean that the router will route the data packets correctly.

Also, having CCTV cameras connected and working doesn’t mean that the router is functioning correctly. It could still be misrouting the Blynk data packets.

I’d certainly try rebooting the router when you have these issues, and I’d put the regional subdomain in the Blynk.begin() command. I’d also ensure that you’re using port 8080 or 443 rather than the default port 80, which the ISP may be blocking/throttling/terminating if they suspect that you’re hosting a web page.
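
For illustration only, a minimal sketch of what naming the server and port explicitly might look like. The host `lon1.blynk.cloud` is an assumption (use whichever region your own Blynk console reports), and the names `blynk_host` and `blynk_port` are made up for this example.

```cpp
#include <stdint.h>

// Assumed values: replace the host with the region shown for your account.
const char blynk_host[] = "lon1.blynk.cloud";  // hypothetical regional server
const uint16_t blynk_port = 8080;              // instead of the default port 80

// In setup(), in place of Blynk.begin(auth, ssid, pass):
//   Blynk.begin(auth, ssid, pass, blynk_host, blynk_port);
```

Port 443 would additionally need the SSL flavour of the library, so it is left out of this sketch.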

Are you using a Vodafone sim in your router by any chance?

BTW, I just looked-up your location to see if you were in South Africa (there have been issues with Vodafone SA in the past) and realised that you are in Nottingham - my neck of the woods originally.

Pete.

Thanks Pete.
Yes, Nottingham and Three.
I’ve set up my bench version and a hotspot on a tablet with a fairly crappy 3G signal (but strong wifi).
Just running now and I see a hiccup. (Phone app not currently running.)
Also occasional and random [06] packets.
What are they, and what is “Cmd error” - I’ve searched the Blynk site and can’t find any reference.

[311566] <[14|02]w[00|06]vw[00]7[00]0
[311633] <[14|02]x[00|06]vw[00]9[00]1
[311700] <[14|02]y[00|07]vw[00]11[00]1
[311767] <[14|02]z[00|07]vw[00]13[00]0
[311834] <[14|02]{[00|07]vw[00]14[00]1
[312365] <[14|02]|[00|0B]vw[00]6[00]11.786
[312388] <[06|02]}[00|00]
[312432] <[14|02]~[00|06]vw[00]1[00]1
[312499] <[14|02|7F|00|06]vw[00]4[00]1
[312566] <[14|02|80|00|06]vw[00]7[00]0
[312633] <[14|02|81|00|06]vw[00]9[00]1
[312700] <[14|02|82|00|07]vw[00]11[00]1
[312767] <[14|02|83|00|07]vw[00]13[00]0
[312834] <[14|02|84|00|07]vw[00]14[00]1
[313365] <[14|02|85|00|0B]vw[00]6[00]11.786
[313432] <[14|02|86|00|06]vw[00]1[00]1
[313499] <[14|02|87|00|06]vw[00]4[00]1
[313566] <[14|02|88|00|06]vw[00]7[00]0
[313633] <[14|02|89|00|06]vw[00]9[00]1
[313700] <[14|02|8A|00|07]vw[00]11[00]1
[313767] <[14|02|8B|00|07]vw[00]13[00]0
[313834] <[14|02|8C|00|07]vw[00]14[00]1
[314365] <[14|02|8D|00|0B]vw[00]6[00]11.802
[314432] <[14|02|8E|00|06]vw[00]1[00]1
[320836] <w[00]1[00]1
[327086] Cmd error
[327388] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[333391] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[339396] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[343197] <[1D|00|01|00|20]30fSk8zmxzsd3TWGl9SZxPLjePRaB0B1
[343707] >[00|00|01|00|C8]
[343707] Ready (ping: 510ms).
[343707] Free RAM: 45072
[343774] <[11|00|02|00]zmcu[00]0.0.0[00]fw-type[00]TMPLr84IvuN1[00]build[00]Oct[20]14[20]2023[20]17:51:37[00]blynk[00]1.3.2[00]h-beat[00]45[00]buff-in[00]1024[00]dev[00]ESP8266[00]tmpl[00]TMPLr84IvuN1
[343854] >[00|00|02|00|C8]
[344365] <[14|00|03|00|0B]vw[00]6[00]11.725
[344433] <[14|00|04|00|06]vw[00]1[00]1
[344500] <[14|00|05|00|06]vw[00]4[00]1
[344567] <[14|00|06|00|06]vw[00]7[00]0
[344634] <[14|00|07|00|06]vw[00]9[00]1
[344701] <[14|00|08|00|07]vw[00]11[00]1
[344768] <[14|00|09|00|07]vw[00]13[00]0
[344835] <[14|00|0A|00|07]vw[00]14[00]1
[345365] <[14|00|0B|00|0B]vw[00]6[00]11.740
[345433] <[14|00|0C|00|06]vw[00]1[00]1
[345500] <[14|00|0D|00|06]vw[00]4[00]1
[345567] <[14|00|0E|00|06]vw[00]7[00]0
[345634] <[14|00|0F|00|06]vw[00]9[00]1
[345701] <[14|00|10|00|07]vw[00]11[00]1
[345768] <[14|00|11|00|07]vw[00]13[00]0
[345835] <[14|00|12|00|07]vw[00]14[00]1

Ah found the ``` for posting code (why isn’t it mentioned in the instructions???).
I still can’t end the copied code!

Have now moved the tablet into a filing cabinet drawer, so there is still good wifi but the Three signal is blocked.

[473751] <[14|04|A6|00|0B]vw[00]6[00]11.802
[473818] <[14|04|A7|00|06]vw[00]1[00]1
[473885] <[14|04|A8|00|06]vw[00]4[00]1
[473952] <[14|04|A9|00|06]vw[00]7[00]0
[474019] <[14|04|AA|00|06]vw[00]9[00]1
[474086] <[14|04|AB|00|07]vw[00]11[00]1
[480496] <vw[00]11[00]1
[486520] Cmd error
[486821] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[492825] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[498830] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[504833] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[510841] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[516845] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[522850] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[528853] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[534861] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83

These are pings.
There’s some info about what these codes mean here…

I think the Cmd Error is because the previous data sent by the device was odd…

And the server responded that it didn’t understand. I’m not sure what caused this, but it seems to be device side (outgoing data) rather than a server side issue.
Maybe posting your code would help to identify possible causes.

It seems that you have Blynk debugging enabled, and also core level debugging via the IDE. This is handy for debugging, but probably not ideal for long term use as it adds to the device’s overhead.

I was born and raised around the Derbyshire/Nottinghamshire border and worked as a photographer for British Coal, based out of Bestwood, back in the late 70’s and 80’s. Eventually moved to London in the early 90’s, but still have friends and family in the area, so visit now and again.

Pete.

I know it well. I had a friend in the 70s who worked at the Bestwood workshops on some pretty heavy electrical control systems. A very useful contact when we needed meaty electronic components!

I noticed that while the repeated, failing IP requests to the cloud shown above were going on (around every second), my 1 second event.timer subroutine was only being called about every 10 seconds.
That routine also toggles the watchdog, and this would be sufficient to keep it alive and prevent a forced re-boot. After a couple of minutes of failed requests, the below happens and the calls to the event.timer then revert to every second - along with watchdog toggling.
So my strategy fails!

I need to add an if (Blynk.connected()) before my watchdog toggling, I think.

[hostByName] Host: blynk.cloud IP: 159.65.55.83
[600917] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[606925] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[612929] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[618934] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud IP: 159.65.55.83
[624937] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud lookup error: No response (-5)!
[624949] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud lookup error: No response (-5)!
[624954] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud lookup error: No response (-5)!
[629949] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud lookup error: No response (-5)!
[629955] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud lookup error: No response (-5)!
[634950] Connecting to blynk.cloud:80
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud lookup error: No response (-5)!
[634956] Connecting to blynk.cloud:8080
[hostByName] request IP for: blynk.cloud
[hostByName] Host: blynk.cloud lookup error: No response (-5)!

Yes. That confirms that my original watchdog strategy was flawed.
I only tested it by turning off the hotspot - i.e. the wifi.
This results in continuous attempts to connect the wifi within void loop but NO timer.run().
If I’d tested it like just now with a wifi connection, but no internet, I’d have discovered that timer.run() is called (but very much delayed) initially, but then, after the ‘No response (-5)’ event, it is called at full speed. Thus my watchdog is maintained…
So now I know why my remote system was not re-booted and what state it is currently in.
Back to the drawing board…
I’d still like to know why my system, in its original form, would fail occasionally. It seems like Blynk.run() is not entirely bombproof when the internet connection is unreliable.
Here is a shortened version of my code:

#define BLYNK_TEMPLATE_ID ""  
#define BLYNK_TEMPLATE_NAME ""
#define BLYNK_AUTH_TOKEN ""

#define BLYNK_PRINT Serial
#define BLYNK_DEBUG     //shows comms with wifi/server


#include <ESP8266WiFi.h>
#include <BlynkSimpleEsp8266.h>

char auth[] = BLYNK_AUTH_TOKEN;

// Your WiFi credentials.
char ssid[] = "";
char pass[] = "";

// Definitions
float volts = 0; 

//On board relays - High to operate
int SolarRlyOff = LOW;   //D2  Low = Relays 1&2 OFF N/C used
int BattRlyOff = LOW;    //D1  Low = Relays 3&4 OFF N/C used

bool watchdog;

BlynkTimer timer;  //create a Blynk Timer object

// This function is called every time a Virtual Pin state changes via app
BLYNK_WRITE(V0) //Battery Switch (1 = off = relay op'd)
{
  // Set incoming value from pin V0 to a variable
  int BattRlyOff = param.asInt();

   if(BattRlyOff==HIGH)   //ensure that if batt is disconnected, solar is too
    {
      digitalWrite (D2,HIGH); //turn off Solar if Battery turned off
    }
   digitalWrite(D1,BattRlyOff);  //and set Battery relay
}

BLYNK_WRITE(V3) //Solar Switch (1 = off = relay op'd)
{
  int SolarRlyOff = param.asInt();
  digitalWrite(D2,SolarRlyOff);
}

void myTimerEvent() //Called every second
{
  volts = analogRead(A0);   // reading the analog i/p every second 
  Blynk.virtualWrite(V6,volts*12/786);  // volts=786 @12v

  int BattState = digitalRead (D1);   //0 = relay off on main board = solar/batt ON (n/c contacts)
  int SolarState = digitalRead (D2);  //current state of solar and battery relays
   
    if(BattState==HIGH)   //ensure that if batt is disconnected, solar is too
      {
        digitalWrite (D2,HIGH);
      }
      
//Watchdog
  if (WiFi.status() == WL_CONNECTED) // Toggle D8 every second to keep Watchdog fm forcing reset
  {
    watchdog = !watchdog;
    digitalWrite (D8,watchdog);
  }

   Blynk.virtualWrite(V1, !BattState);  //can't find a way of inverting LED state on app
   Blynk.virtualWrite(V4, !SolarState);
 
}

void setup()
{
  Serial.begin(115200);
  Serial.print("Restart Hougham23a... ");        // Version *****************************
  Blynk.begin(auth, ssid, pass);
  timer.setInterval(1000L, myTimerEvent);
  pinMode(D1,OUTPUT);   //Battery
  pinMode(D2,OUTPUT);   //Solar
  pinMode(D7,INPUT_PULLUP); //Volts
  pinMode(D8,OUTPUT);   //Watchdog

  digitalWrite(D1,LOW);
  digitalWrite(D2,LOW);
}

void loop()
{
  Blynk.run();
  timer.run();
}

Just noticed I had included a check for wifi connected prior to toggling the watchdog! So timer.run() may well be being called in both scenarios, after all. So will try adding a check for Blynk connection as well.
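
A minimal sketch of that extra check, written as plain functions so the rule can be tested off the device. The function names are invented for this example; on the ESP8266 the two flags would come from `WiFi.status() == WL_CONNECTED` and `Blynk.connected()`.

```cpp
// Decide whether the external watchdog may be fed on this timer tick.
// Feeding is only allowed when both WiFi and the Blynk server are reachable.
bool should_feed_watchdog(bool wifi_connected, bool blynk_connected) {
  return wifi_connected && blynk_connected;
}

// Returns the next level to drive on the watchdog pin: toggled when feeding
// is allowed, unchanged otherwise (so the external watchdog times out).
bool next_watchdog_level(bool current_level, bool wifi_connected,
                         bool blynk_connected) {
  return should_feed_watchdog(wifi_connected, blynk_connected)
             ? !current_level
             : current_level;
}

// Possible use inside myTimerEvent() in the sketch above:
//   watchdog = next_watchdog_level(watchdog,
//                                  WiFi.status() == WL_CONNECTED,
//                                  Blynk.connected());
//   digitalWrite(D8, watchdog);
```

With this rule the pin stops toggling whenever the Blynk connection is down, even if timer.run() is still being called, so the external watchdog would need a timeout comfortably longer than a normal connection attempt.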

TBH, because Blynk.begin() is a blocking function, I think you’d be better off managing your WiFi connection manually, then using Blynk.config() to define your Auth token, server, port etc and Blynk.connect(timeout) to initiate the Blynk connection.

Here’s a bit of code that I created for something else, but that shows how the process works…

/* Fill-in information from Blynk Web Console > Device > Device Info here..
These three lines of code MUST be at the top of your sketch */
#define BLYNK_TEMPLATE_ID "REDACTED"
#define BLYNK_TEMPLATE_NAME "REDACTED"
#define BLYNK_AUTH_TOKEN "REDACTED"

#define BLYNK_PRINT Serial

#include <ESP8266WiFi.h>
#include <BlynkSimpleEsp8266.h>

// Your WiFi credentials.
// Set password to "" for open networks.
char ssid[] = "REDACTED";
char pass[] = "REDACTED";

// ADJUST THESE SETTINGS TO SUIT YOUR NEEDS:

// The longer you wait between WiFi connection attempts and the more attempts you have,
// the more time your device will be tied-up if the WiFi network is down.

// The longer the Blynk timeout is, the more time your device will be tied-up if the Blynk server can't be reached

// The more frequently you check if your device is connected to WiFi and/or Blynk then the more frequently the
// situations described above will occur if the device is not currently connected to WiFi and/or Blynk 

int wait_between_wifi_attempts_millis = 750;
int max_wifi_connect_attempts = 20;
int blynk_timeout_millis = 2500;
int check_connections_frequency_seconds = 30;

BlynkTimer timer;

void setup()
{
  Serial.begin(74880);
  Blynk.config(BLYNK_AUTH_TOKEN);

  connect_to_wifi();
  connect_to_blynk();

  timer.setInterval(check_connections_frequency_seconds * 1000,check_connections);
}

void loop()
{
  if(Blynk.connected())
  {
    Blynk.run();
  }
  timer.run();
}


void connect_to_wifi()
{
  int wifi_attempt_count=0;
 
  Serial.println(F("Connecting to Wi-Fi..."));
  WiFi.mode(WIFI_STA);
      
  if (WiFi.status() != WL_CONNECTED)
  {
      WiFi.begin(ssid, pass); // connect to the network
  }

  while (WiFi.status() != WL_CONNECTED  && wifi_attempt_count < max_wifi_connect_attempts) // Loop until we've connected, or reached the maximum number of attempts allowed
  {
    delay(wait_between_wifi_attempts_millis);
    wifi_attempt_count++;
    Serial.print(F("Wi-Fi connection - attempt # "));
    Serial.print(wifi_attempt_count);
    Serial.print(F(" of "));
    Serial.println(max_wifi_connect_attempts);
  }

  // we get to this point when either we're connected to Wi-Fi, or we've tried too many times. We need to do different things, depending which it is...
  if (WiFi.status() == WL_CONNECTED)
  {
    WiFi.mode(WIFI_STA);
    Serial.println(F("CONNECTED TO Wi-Fi"));
    Serial.print(F("Local IP Address = "));
    Serial.println(WiFi.localIP());
    Serial.println();
  }
  else
  {  
    // we get here if we tried multiple times, but can't connect to WiFi...
    Serial.println(F("CONNECTION TO WiFI FAILED - IN STAND-ALONE MODE"));
   // find failure reason and add in here?
  }
}


void connect_to_blynk()
{
  if (WiFi.status() == WL_CONNECTED)
  {
    Serial.println("Connecting to Blynk...");
    Blynk.connect(blynk_timeout_millis); // Try to connect to Blynk, but timeout after pre-defined time
    if(Blynk.connected())
    {
      Serial.println("CONNECTED TO BLYNK");

    }
    else
    {
      Serial.println("FAILED TO CONNECT TO BLYNK - IN STAND-ALONE MODE");      
    }
  }  
}

void check_connections()
{
  if (WiFi.status() != WL_CONNECTED)
  {
    connect_to_wifi();
  }
  if(WiFi.status() == WL_CONNECTED && !Blynk.connected())
  {
    connect_to_blynk();
  }
}

The code doesn’t do anything other than connect to WiFi and Blynk and check those connections on a regular basis.
If you wanted to throw-in an ESP.restart() at the point when you hit “standalone mode” then that might help in your case.

The Blynk.connected() test in the void loop is to prevent the device constantly trying to re-connect to Blynk if it’s in standalone mode, but may not be needed for you if you do a restart in that situation.
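
One possible way to decide when to do that restart, sketched as a small counter of consecutive failed checks. `RestartPolicy` and its threshold are illustrative assumptions, not part of the Blynk library or of the sketch above.

```cpp
// Counts consecutive failed connection checks and reports when a restart
// is due. The threshold is an arbitrary choice for illustration.
struct RestartPolicy {
  int max_failures;  // consecutive failures tolerated before restarting
  int failures = 0;  // failures seen since the last successful check

  explicit RestartPolicy(int max) : max_failures(max) {}

  // Record the outcome of one check; returns true when a restart is due.
  bool record(bool connected) {
    if (connected) {
      failures = 0;
      return false;
    }
    failures++;
    return failures >= max_failures;
  }
};

// Possible use at the end of check_connections() in the sketch above:
//   static RestartPolicy policy(3);
//   if (policy.record(Blynk.connected())) {
//     ESP.restart();
//   }
```

Counting consecutive failures rather than restarting on the first one gives a slow mobile link a few chances to recover before the device is rebooted.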

The workshops at Bestwood were quite impressive. Some very heavy duty equipment was overhauled there, and it’s amazing how many ways the pits would find to break things :grinning:

Pete.

Thank you Pete.
I live in an ex-pit village.
All villages are required to have two sources of electrical power for redundancy, but a pit village would have one source for domestic and the other solely for the pit.
Normally motor sizes are restricted to 100hp direct-on-line (DOL) starting, but the NCB were permitted to start heftier motors DOL.
If our normal supply was interrupted, we’d be switched to the pit supply. You always knew when this happened as the lights would regularly dim as those big motors spun up.
And yes, a breakdown was a great excuse for a break (except for the maintenance crew…).

Hi Pete, Thanks for your help. I integrated your connection checking code into mine and, so far, no problems after a week. I see one 6 minute off-line reported over the week, but it recovered okay.
I can only assume that the problems occur due to the relative unreliability (ping times can vary up to a second and longer, on occasions) of the 3G data link. This seems to be the only variable that is random and completely unpredictable. Seems like the Blynk libraries might not be 100% bombproof in these situations?
I’ll continue to monitor the situation and report here if I learn anything new.
Thanks again (your help here on the forum is absolutely invaluable - I hope our Blynk people recognise this).
Dave

Shouldn’t that be “ey up mi duck”? :rofl:

Glad to hear that things are more reliable. Whilst I’m sure that the Blynk libraries aren’t perfect, I suspect that the issues you’re experiencing are more to do with your mobile ISP than Blynk.

Yes, I get a free PRO subscription for the work I do on the forum, which is fine by me.

I’ve marked this as “Solved” (but the topic is still open) as I think your initial problem is now solved. Feel free to add more feedback or issues in future.

Pete.

'Appen…
Soon after my previous tidings of good news, there was a break of a few hours.
But the better news is that it did eventually recover. (By itself.)
I did go on site on Sunday and discovered more anomalies - but mainly with the router.
It is forced into 3G mode (after confirming that 4G is worse in every respect).
Whilst attempting to perform Speedtests, it would often fail either to initially connect to the test server, or perform a satisfactory download speed test, then crash out before the upload test.
There was an internet connection - I could download video and my cameras were successfully uploading video - but Speedtest would fail.
At the same time, the IoT device was in a reset loop, successfully connecting to the WiFi, but when requesting the Blynk IP address on port 80, then port 8080, no response was forthcoming.
Eventually, there would be a response and comms would resume.
When I look at the Timeline on my Blynk App, I can see periods of off-line, but always resuming on-line at xx:21. This corresponds to my last boot-up at 15:21 on Sunday - I check for good connection every hour! It must be doing a full reset at that point as it will be failing the WL_CONNECTED bit in both the hourly called check_connections, and subsequently in connect_to_blynk, when I do a software reset (and the hour timer is also re-initialised).
So, many thanks for your excellent help, and your correct assumption that it is my router which is at fault.
Cheers, Dave

Just remembered that connect_to_blynk() includes Blynk.connect(), so it may not be doing a full reset after all.