ESP32 millis() overflows break the loop

Hardware:

Board: ESP32 Dev Module
Core Installation/update date: Mon Jan 7
Blynk library: 0.5.4
Blynk server: local
IDE name: Arduino IDE 1.8.6
Flash Frequency: 40MHz
PSRAM enabled: no
Upload Speed: 921600
Computer OS: Mac OSX

Description:

Hello, I’m seeing strange behaviour when millis() overflows on the ESP32: Blynk reports a Heartbeat timeout. Currently the ESP32 gets only 4294967 milliseconds (about 71 minutes) from startup before millis() overflows. My app works perfectly during those 4294967 milliseconds. But exactly when the overflow happens, Blynk shows the Heartbeat timeout message, and sometimes after that the loop breaks. The sketch below is the latest version of my attempts to get stable operation. This version has given me hours of stable work, but it is still not predictable: the sketch can hang after 7 hours. By that I mean the code in loop() just stops executing; even the timer doesn’t run anymore. But the Blynk connection itself keeps working: the hardware is online, and I can send requests to it and get responses. In the logs I see only ping responses right after the Heartbeat timeout event.

I tried third-party timers, but that didn’t help. I tried creating three separate timers and dividing the sync load between them; that didn’t help either. Without the Blynk library my code is very stable: so far it has given me 116 hours of uptime. But with Blynk it is very hard to reach even one day of stable work. The sketch below shows pretty much everything I’m doing.

Does anyone else get this behaviour?

Sketch:


#define BLYNK_DEBUG // Optional, this enables lots of prints
#define BLYNK_PRINT Serial
#define BLYNK_NO_BUILTIN   // Disable built-in analog & digital pin operations
#define BLYNK_NO_FLOAT     // Disable float operations
#define BLYNK_MSG_LIMIT 50

#include <WiFi.h>
#include <BlynkSimpleEsp32.h>

const char *SSID = "***";
const char *PSWD = "***";

char blynkAuth[] = "***";
char blynkDomain[] = "192.168.1.4";
const uint16_t blynkPort = 8080;

const unsigned long blynkConnectAttemptTime = 5L * 1000L;  // try to connect to the Blynk server for only 5 seconds
BlynkTimer timer;

const unsigned long uptimePrintInterval = 1L * 1000L;

// blynk sync
void blynkSyncStrings() {
    Blynk.virtualWrite(V1, "1");
    Blynk.virtualWrite(V2, "2");
    Blynk.virtualWrite(V3, "3");
    Blynk.virtualWrite(V4, "4");
    Blynk.virtualWrite(V5, "5");
    Blynk.virtualWrite(V6, "6");
    Blynk.virtualWrite(V7, "7");
    Blynk.virtualWrite(V8, "8");
    Blynk.virtualWrite(V9, "9");
}

void blynkSyncNumbers1() {
    Blynk.virtualWrite(V10, 1);
    Blynk.virtualWrite(V11, 1);
    Blynk.virtualWrite(V12, 1);
    Blynk.virtualWrite(V13, 1);
    Blynk.virtualWrite(V14, 1);
    Blynk.virtualWrite(V15, 1);
    Blynk.virtualWrite(V16, 1);
    Blynk.virtualWrite(V17, 1);
    Blynk.virtualWrite(V18, 1);
}

void blynkSyncNumbers2() {
    Blynk.virtualWrite(V19, 1);
    Blynk.virtualWrite(V20, 1);
    Blynk.virtualWrite(V21, 1);
    Blynk.virtualWrite(V22, 1);
    Blynk.virtualWrite(V23, 1);
    Blynk.virtualWrite(V24, 1);
    Blynk.virtualWrite(V25, 1);
    Blynk.virtualWrite(V26, 1);
}
//

void blynkConnect() {
    if (WiFi.isConnected() && !Blynk.connected()) {
        unsigned long startConnecting = millis();
        while (!Blynk.connected()) {
            Blynk.connect();
            if (millis() > startConnecting + blynkConnectAttemptTime) {
                Serial.println("Unable to connect to Blynk server.\n");
                break;
            }
        }
    }
}

void setup() {
    Serial.begin(115200);
    while (!Serial) {
        ;  // wait for the serial port to be ready
    }

    WiFi.mode(WIFI_STA);
    WiFi.begin(SSID, PSWD);

    for (int loops = 10; loops > 0; loops--) {
        if (WiFi.isConnected()) {
            Serial.println("");
            Serial.print("IP address: ");
            Serial.println(WiFi.localIP());
            break;
        } else {
            Serial.println(loops);
            delay(1000);
        }
    }
    if (!WiFi.isConnected()) {
        Serial.println("WiFi connect failed");
    }

    Blynk.config(blynkAuth, blynkDomain, blynkPort);
    blynkConnect();

    timer.setInterval(30L * 1000L, blynkConnect);
    timer.setInterval(3000L, blynkSyncStrings);
    timer.setInterval(4300L, blynkSyncNumbers1);
    timer.setInterval(5400L, blynkSyncNumbers2);
}

void loop() {
    if (Blynk.connected()) {
        Blynk.run();
    }
    timer.run();

    uint32_t now = millis();
    static uint32_t lastUpdate = 0;
    static uint32_t overFlowCounter = 0;
    if (now < lastUpdate) { // overflow
        overFlowCounter++;
        lastUpdate = now;
    }
    if (now - lastUpdate > uptimePrintInterval){
        lastUpdate = now;
        Serial.print("uptime = ");
        Serial.println((overFlowCounter * 4294968) + now);
    }
}
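
The uptime bookkeeping in loop() can also be kept in a 64-bit accumulator, which avoids multiplying a wrap counter by a hard-coded period in 32-bit arithmetic. This is a minimal sketch in plain C++, assuming the counter only moves forward between samples, that any backwards jump is exactly one wrap, and that the wrap period is known. `UptimeTracker` and `wrapPeriod` are hypothetical names, not part of the Arduino or Blynk APIs.

```cpp
#include <cstdint>

// Accumulates total uptime in 64 bits from a wrapping 32-bit counter.
struct UptimeTracker {
    uint64_t wrapPeriod;   // counter period in ms (assumed to be known)
    uint64_t base = 0;     // ms accumulated over completed wraps
    uint32_t last = 0;     // previous counter sample

    explicit UptimeTracker(uint64_t period) : wrapPeriod(period) {}

    // Feed the current counter value; returns total uptime in ms.
    uint64_t update(uint32_t now) {
        if (now < last) {      // counter went backwards: one wrap occurred
            base += wrapPeriod;
        }
        last = now;
        return base + now;
    }
};
```

With the period from the sketch above (4294968), feeding a sample of 33 after a sample of 4294000 yields 4294968 + 33. The 64-bit sum also keeps growing where a 32-bit product of counter and period would itself overflow after about 1000 wraps.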

Logs:

...
uptime = 8587666
uptime = 8588667
[4293745] <[06]p[0F|00|00]
[4293769] >[00]p[0F|00|C8]
[4293895] <[14]p[10|00|06]vw[00]1[00]1
[4293919] <[14]p[11|00|06]vw[00]2[00]2
[4293941] <[14]p[12|00|06]vw[00]3[00]3
[4293962] <[14]p[13|00|06]vw[00]4[00]4
[4293984] <[14]p[14|00|06]vw[00]5[00]5
[4294006] <[14]p[15|00|06]vw[00]6[00]6
[4294028] <[14]p[16|00|06]vw[00]7[00]7
[4294050] <[14]p[17|00|06]vw[00]8[00]8
[4294072] <[14]p[18|00|06]vw[00]9[00]9
[4294495] <[14]p[19|00|07]vw[00]19[00]1
[4294519] <[14]p[1A|00|07]vw[00]20[00]1
[4294541] <[14]p[1B|00|07]vw[00]21[00]1
[4294564] <[14]p[1C|00|07]vw[00]22[00]1
[4294586] <[14]p[1D|00|07]vw[00]23[00]1
[4294608] <[14]p[1E|00|07]vw[00]24[00]1
[4294630] <[14]p[1F|00|07]vw[00]25[00]1
[4294652] <[14]p [00|07]vw[00]26[00]1
uptime = 8589668
[0] Heartbeat timeout: 0, 4293769, 4293745
[3] Connecting to 192.168.1.4:8080
[8] <[02|00|01|00] 4c31987252054e4f83f538cb483decd3
[24] >[00|00|01|00|C8]
[24] Ready (ping: 11ms).
[45] <[11|00|02|00]Fver[00]0.5.4[00]h-beat[00]10[00]buff-in[00]1024[00]dev[00]ESP32[00]build[00]Jan 15 2019 18:23:28[00]
[49] >[00|00|02|00|C8]
[10068] <[06|00|04|00|00]
[10083] >[00|00|04|00|C8]
[20072] <[06|00|05|00|00]
[20082] >[00|00|05|00|C8]
[30076] <[06|00|06|00|00]
[30096] >[00|00|06|00|C8]
[40080] <[06|00|07|00|00]
[40089] >[00|00|07|00|C8]
[50084] <[06|00|08|00|00]
[50091] >[00|00|08|00|C8]
...
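
The `[0] Heartbeat timeout: 0, 4293769, 4293745` line can be reproduced with plain arithmetic. The sketch below is an illustration only, not Blynk’s actual code. It assumes the first logged number is the current millis() value and the second is a last-activity timestamp, and that elapsed time is computed by unsigned 32-bit subtraction. `elapsedMs`, `timedOut`, and `kHeartbeatTimeoutMs` are hypothetical names, and the 10-second threshold is borrowed from the `h-beat 10` field in the log purely for illustration.

```cpp
#include <cstdint>

// Elapsed time via unsigned 32-bit subtraction. This is only correct
// when the counter wraps at 2^32 (4294967296), not at an earlier value.
constexpr uint32_t elapsedMs(uint32_t now, uint32_t last) {
    return now - last;
}

// Illustrative threshold (assumption): 10 s, from "h-beat 10" in the log.
constexpr uint32_t kHeartbeatTimeoutMs = 10000;

// Values taken from the log line "Heartbeat timeout: 0, 4293769, 4293745".
constexpr uint32_t kNow = 0;
constexpr uint32_t kLastIn = 4293769;

// True when the apparent silence exceeds the threshold.
constexpr bool timedOut(uint32_t now, uint32_t last) {
    return elapsedMs(now, last) > kHeartbeatTimeoutMs;
}
```

Here `elapsedMs(kNow, kLastIn)` is 4294967296 - 4293769 = 4290673527 ms. Because the counter restarted at 0 near 4294967 rather than at 2^32, the subtraction reports about 49 days of silence, so under these assumptions any timeout comparison fires immediately even though the last reply arrived roughly a second earlier.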

Whatever you are doing here IS the most probable cause of your issue… you are running calculations that are possibly reaching a limit or something, and printing results every second, while all along you have other Blynk timers and their functions running in the background.

As the cartoon says “Doc, it hurts when I do this”, doctor replies “Then stop doing that”.

@Gunner the code in loop() is not doing any heavy lifting so that’s not the problem.

I think it’s just the known overflow limit with millis().

Got it… I was unsure of any timer conflicts between the two differing methods…

If this code is just to show the uptime before “crash?” then I would think the same results should be repeatable with a Blynk Timer function printing out millis(), and/or a counter if required, to avoid any potential timer conflict.

@nikuz this is what I see when millis() overflows:

11:02:02.244 -> [15914] <[14|00]7[00|06]vw[00]9[00]9
11:02:02.244 -> Uptime = 4294965920
11:02:03.164 -> [16824] <[06|00]8[00|00]
11:02:03.232 -> [16900] >[00|00]8[00|C8]
11:02:03.232 -> Uptime = 4294966921
11:02:03.883 -> [17544] <[14|00]9[00|07]vw[00]19[00]1
11:02:03.883 -> [17567] <[14|00]:[00|07]vw[00]20[00]1
11:02:03.917 -> [17589] <[14|00];[00|07]vw[00]21[00]1
11:02:03.917 -> [17610] <[14|00]<[00|07]vw[00]22[00]1
11:02:03.952 -> [17631] <[14|00]=[00|07]vw[00]23[00]1
11:02:03.986 -> [17653] <[14|00]>[00|07]vw[00]24[00]1
11:02:03.986 -> [17674] <[14|00]?[00|07]vw[00]25[00]1
11:02:04.020 -> [17695] <[14|00]@[00|07]vw[00]26[00]1
11:02:04.635 -> Uptime = 4295969
11:02:05.079 -> [18744] <[14|00]A[00|06]vw[00]1[00]1
11:02:05.079 -> [18767] <[14|00]B[00|06]vw[00]2[00]2

What is the known issue with the overflow limit? Can you give some more info? If the Blynk library has a known issue with overflow, is it so hard to fix? That would mean the Blynk library is not usable on ESP32 boards with Arduino for now.

Says who? I have an ESP32 running combined Blynk, Nextion and Virtuino code that displays temp, humidity and weather data 24/7.

Yep. As I said, this is my latest, more stable sketch. Last time I got the error only after 7 hours. It’s actually a real pain to catch this error now, but it definitely exists.

I’ll post my previous sketch, which crashes reliably after 71 minutes.

It’s just a general software concept that all numbers overflow, nothing to do with Blynk.

That’s why I started this thread: I just don’t understand what I’m doing wrong with Blynk. I’m just trying to sync sensor data using timers, and something happens exactly at the millis() overflow event. As I said, my code works just fine without the Blynk library. So I have now rolled my example code back to the less stable version. I’ll run it once myself and post the result later. The only difference is that it uses two timers, because of the many sync handlers. Like so:

blynkTimer1.setInterval(30L * 1000L, blynkConnect);
blynkTimer1.setInterval(2000L, blynkSync1);
blynkTimer1.setInterval(2050L, blynkSync2);
blynkTimer1.setInterval(2100L, blynkSync3);
blynkTimer1.setInterval(2150L, blynkSync4);
blynkTimer1.setInterval(2200L, blynkSync5);
blynkTimer1.setInterval(2250L, blynkSync6);
blynkTimer1.setInterval(2300L, blynkSync7);
blynkTimer1.setInterval(2350L, blynkSync8);
blynkTimer1.setInterval(2400L, blynkSync9);
blynkTimer1.setInterval(2450L, blynkSync10);
blynkTimer1.setInterval(2500L, blynkSync11);
blynkTimer1.setInterval(2550L, blynkSync12);
blynkTimer1.setInterval(2600L, blynkSync13);
blynkTimer1.setInterval(3050L, blynkSync14);
blynkTimer2.setInterval(3100L, blynkSync15);
blynkTimer2.setInterval(3150L, blynkSync16);
blynkTimer2.setInterval(3200L, blynkSync17);
blynkTimer2.setInterval(3250L, blynkSync18);
blynkTimer2.setInterval(3300L, blynkSync19);
blynkTimer2.setInterval(3350L, blynkSync20);
blynkTimer2.setInterval(3400L, blynkSync21);
blynkTimer2.setInterval(3450L, blynkSync22);

@nikuz so you want to sync 22 variables at 50ms intervals, right?

And you are saying there is a heartbeat problem every time millis() rolls over i.e. every 71 minutes?

I’m getting data from an Arduino Mega to the ESP32 over serial and sending it to my local Blynk server. Some data I sync once a minute, but most of it I sync frequently, every few seconds for each variable. The syncing runs continuously because there are so many variables to sync.

In my live app I also have a serial read handler in the loop. But I also reproduced the heartbeat problem at millis() rollover on a simple sketch. I experimented a lot trying to avoid this error. I’ll try to restore the sketch that gave me a reproducible error with pure Blynk code.

Also, do you have problems connecting to Blynk? The break after 5 seconds of trying to connect will fail to execute in blynkConnect() if you are very unlucky and millis() rolls over during the attempt.

The failure would be because startConnecting would be 71 minutes higher than millis().

You could fix the potential bug by skipping the blynkConnect() function if millis() is close to rollover time.

I doubt this is your problem but it’s hard to tell from the sketches you have provided so far.
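
For reference, the additive check `millis() > startConnecting + blynkConnectAttemptTime` in blynkConnect() can be compared with the subtraction form that is commonly used for timeouts. The sketch below is an illustration in plain C++ with the time values passed in as parameters; `expiredAdditive` and `expiredSubtractive` are hypothetical names. One caveat: the subtraction form is only rollover-safe when the counter wraps at 2^32. If millis() wraps earlier, as the 4294967 ms figure in this thread suggests, it reports a timeout immediately after the wrap instead.

```cpp
#include <cstdint>

// Additive form, as in blynkConnect(): compares against a deadline.
// If 'now' wraps before reaching the deadline, this stays false
// until 'now' climbs past the deadline again.
constexpr bool expiredAdditive(uint32_t now, uint32_t start, uint32_t timeout) {
    return now > start + timeout;
}

// Subtraction form: compares elapsed time against the timeout.
// Correct across a wrap at 2^32 thanks to modular unsigned arithmetic.
constexpr bool expiredSubtractive(uint32_t now, uint32_t start, uint32_t timeout) {
    return now - start > timeout;
}
```

For example, with `start = 4294960000`, `timeout = 5000`, and `now = 10000` after a wrap at 2^32, the additive form still reports "not expired" while the subtraction form correctly sees 17296 ms elapsed. With the early wrap seen here (`start = 4294000`, `now = 33`), the additive form never expires and the subtraction form expires at once, so neither is right until millis() itself wraps at 2^32.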

OK, I’ll try stopping all syncs for one minute before and after the millis() overflow. I’ll write tomorrow, or when the first failure happens.

Only needs to be within a fraction of a second of rollover.

Even a delay() of say 50ms when you are within 50ms of rollover would suffice.

As I say this is probably not your problem but if you are using millis() in your real sketch then it’s something to consider.
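
The idea of staying clear of the rollover can be sketched as a small predicate. This is an illustration only: `nearRollover`, `wrapAt`, and `margin` are hypothetical names, and the wrap point is passed in because it depends on the core (2^32 ms for a full 32-bit millisecond counter, about 4294967 ms in this thread). Since a uint32_t cannot hold 2^32, the wrap point is a 64-bit parameter.

```cpp
#include <cstdint>

// True when 'now' is within 'margin' ms before the wrap point, or
// within 'margin' ms after the counter has restarted from 0.
// Note: the second condition is also true right after boot.
constexpr bool nearRollover(uint32_t now, uint64_t wrapAt, uint32_t margin) {
    return (static_cast<uint64_t>(now) + margin >= wrapAt) || (now < margin);
}
```

A caller could skip or postpone timing-sensitive work, such as a connection attempt, while `nearRollover(millis(), wrapAt, 50)` is true.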

BlynkTimer is slightly different to SimpleTimer in that it will wait for one timed event to finish before starting another. I haven’t checked the BlynkTimer library to see how that handles millis() rollover.

I don’t have a single delay() in my code. Also, I don’t use BlynkTimer (I use a simplified piece of the SimpleTimer code), because I can’t import BlynkTimer from the separate file I’ve allocated for the Blynk functionality, due to a “multiple blynk import” error (or something like that). Blynk is not the main part of my project; I like to think of the Blynk library as a statistics sync module.

50ms is not a problem.

BlynkTimer is specifically designed to be better than SimpleTimer when using Blynk. Use it.

Or contact SimpleTimer dev team.

What happens if you comment-out this line, or at the very least remove “* 4294968” from it?

Pete.

Ok, I’ll try it again. But as I said, I definitely got the same error even with BlynkTimer.