ESP8266 Hangs every ~4 days

Hi

I have created a project that uses a Wemos D1 ESP8266 to control a fridge, heater, fan, humidifier and dehumidifier to create ideal conditions for curing meat, and reports the data to Blynk. I have been running into a weird issue where the ESP will run correctly for around 4 days, and then it will hang. By hanging I mean it just sits there as if it were stuck in an infinite loop… everything is frozen, and the Blynk app reports the device as offline. Oddly enough, the watchdog timers are not triggered, so it's stuck somewhere where the watchdogs are still being fed.

I tried to build the code so it will continue to run no matter what, even without internet connectivity, and I completely avoided while and similar loops inside loop(), precisely to avoid the ESP8266 hanging.

The cause seems to be Blynk related, as before it would last only 1 hour before hanging. I managed to reduce the frequency of hangs to every ~4 days by adding code to check that the ESP8266 is connected to the internet and to Blynk before sending/receiving data from Blynk. I need to manually cycle power to the ESP8266 to get the system working again.

I need the project to run 6 to 8 months at a time… 4 days is too short, and it is even dangerous: if the ESP8266 hangs while the heater is on, the heater will stay on until it's manually reset.

Hardware Model: Wemos D1 ESP8266
Using Blynk server, Android 7.0 App
Power supply is a dedicated, regulated 5 V, 2 A supply
Blynk Version: 0.5.4
Board Version: ESP8266 version 2.4.2

I have attached my code below. Please have a look and let me know what you think is the best way to debug this… given it crashes only every 4 days, I am worried it would take forever to debug…


#include <Wire.h>
#include <LiquidCrystal_I2C.h>
#include "SHTSensor.h"
#include <ESP8266WiFi.h>
#include <BlynkSimpleEsp8266.h>


//Devices
LiquidCrystal_I2C lcd(0x27, 16, 2); // set the LCD address to 0x27 for a 16 chars and 2 line display

//out sensor
bool    out_sensor_working = false;
bool    out_sensor_issue_detected = true;
uint8_t out_sensor_issue_cnt = 0;
float   out_hum = 0;
float   out_tem = 0;
//SHT3X out(0x44);
SHTSensor out(SHTSensor::SHT3X_ALT);

//frg sensor
bool    frg_sensor_working = false;
bool    frg_sensor_issue_detected = true;
uint8_t frg_sensor_issue_cnt = 0;
float   frg_hum = 0;
float   frg_tem = 0;
//SHT3X frg(0x45);
SHTSensor frg(SHTSensor::SHT3X);


//Wifi connection
const char* ssid = "***";
const char* pass = "***";
bool    wifi_connected = false;
uint8_t wifi_issue_disconnected = 0;
uint8_t wifi_issue_ssid_not_found = 0;

//Blynk
char auth[] = "***";
bool isFirstConnect = true;
bool led_error = false;
bool led_heat = false;
bool led_cool = false;
bool led_hum = false;
bool led_dhum = false;
bool led_fan = false;

int ReCnctFlag;  // Reconnection Flag
int ReCnctCount = 0;  // Reconnection counter

BlynkTimer timer;

WidgetLED error(V2);
WidgetLED heat(V5);
WidgetLED cool(V6);
WidgetLED hum(V7);
WidgetLED dhum(V8);
WidgetLED fan(V9);

//System control variables
uint8_t system_stage  = 0; //stage selected through Blynk V0: 0=not set yet, 1=off, 2=hot incubation, 3=high hum drying, 4=mid hum drying, 5=low hum drying
bool heat_enabled = false; //is heating enabled
bool heat_running = false; //is the heater currently running
bool cool_enabled = false; //is cooling enabled
bool cool_running = false; //is the fridge compressor currently running
bool in_compressor_uptime = false;
bool compressor_ready = true;
bool in_compressor_downtime = false;
uint8_t cool_comp_rest_counter = 0;
bool hum_has_fridge_stabilized = true;

bool hum_enabled = false;
bool hum_running = false;
bool dhum_enabled = false;
bool dhum_running = false;

bool fan_running =  false;
bool fan_enabled = false;
uint8_t fan_min_counter = 0;


unsigned long stage_runtime  = 0;
unsigned long enlapsed_time_minutes = 0;
float cool_hysterisis = 3;
float hum_hysterisis = 3;
float dhum_hysterisis = -5;
float target_tem = 0.0; //selects target temperature
float target_hum = 0.0; //selects target humidity
uint8_t refresh_rate = 10; //sensor refresh rate in seconds
uint8_t blynk_report = 0;

//Testing
bool blynk_verbose = true;
bool system_under_test = false;
float frg_cTem = 0;
float frg_cHum = 0;
float out_cTem = 0;
float out_cHum = 0;
//LCD control
uint8_t screen_counter = 0; // controls what screen to show
bool update_screen = true; // controls lcd screen refresh

//Fan
//

//Timer
unsigned long house_keeping_timer_1sec;  //housekeeping such as sending sensor status to Blynk, except histograms
unsigned long histogram_update_timer; //send sensor data to blynk
unsigned long duration_timer ;  //Keeps track of how long the current stage has been
unsigned long lcd_update_timer; //update LCD timer
unsigned long house_keeping_timer_1min;
unsigned long currentMillis;
const unsigned long seconds_1  =  1000;  //the value is a number of milliseconds
const unsigned long seconds_2  =  seconds_1*2;  //the value is a number of milliseconds
const unsigned long seconds_3  =  seconds_1*3;  //the value is a number of milliseconds
const unsigned long seconds_5  =  seconds_1*5;  //the value is a number of milliseconds
const unsigned long seconds_10 =  seconds_1*10;  //the value is a number of milliseconds
const unsigned long seconds_15 =  seconds_1*15;  //the value is a number of milliseconds
const unsigned long seconds_30 =  seconds_1*30;  //the value is a number of milliseconds
const unsigned long minutes_1  =  seconds_1*60;
const unsigned long hour_1 =      minutes_1*60; //the value is a number of milliseconds
const unsigned long day_1 =       hour_1 * 24;
//Testing
int test_sequence = 0;
int test_timer = 0;


void test_inc_frg_tem(float amount) {
  if (frg_cTem < 100) {
    frg_cTem = frg_cTem + amount;
  }
}
void test_dec_frg_tem(float amount) {
  if (frg_cTem > 0.5) {
    frg_cTem = frg_cTem - amount;
  }
}
void test_inc_out_tem(float amount) {
  if (out_cTem < 100) {
    out_cTem = out_cTem + amount;
  }
}
void test_dec_out_tem(float amount) {
  if (out_cTem > 0.5) {
    out_cTem = out_cTem - amount;
  }
}
void test_inc_frg_hum(float amount) {
  if (frg_cHum < 100) {
    frg_cHum = frg_cHum + amount;
  }
}
void test_dec_frg_hum(float amount) {
  if (frg_cHum > 0.5) {
    frg_cHum = frg_cHum - amount;
  }
}
void test_inc_out_hum(float amount) {
  if (out_cHum < 100) {
    out_cHum = out_cHum + amount;
  }
}
void test_dec_out_hum(float amount) {
  if (out_cHum > 0.5) {
    out_cHum = out_cHum - amount;
  }
}



float frg_getTem() {
  if (!system_under_test) {
    return frg.getTemperature();
  }
  return frg_cTem;
}
float frg_getHum() {
  if (!system_under_test) {
    return frg.getHumidity();
  }
  return frg_cHum;
}
float out_getTem() {
  if (!system_under_test) {
    return out.getTemperature();
  }
  return out_cTem;
}
float out_getHum() {
  if (!system_under_test) {
    return out.getHumidity();
  }
  return out_cHum;
}

void blynk_display_remaining_time() {
  int days = floor(enlapsed_time_minutes / 1440);
  int reminder = enlapsed_time_minutes - days * 1440;
  int hours = floor(reminder / 60);
  reminder = reminder - hours * 60;
  int minutes = reminder;
  String timerDisplay = String(days, DEC) + " days " + String(hours, DEC) + " hours " + String(minutes, DEC) + " min";
  Blynk.virtualWrite(V1, timerDisplay);
}

BLYNK_WRITE(V120) {
  // Use of syncAll() will cause this function to be called
  frg_cTem = param.asFloat();
}
BLYNK_WRITE(V121) {
  // Use of syncAll() will cause this function to be called
  frg_cHum = param.asFloat();
}
BLYNK_WRITE(V122) {
  // Use of syncAll() will cause this function to be called
  out_cTem = param.asFloat();
}
BLYNK_WRITE(V123) {
  // Use of syncAll() will cause this function to be called
  out_cHum = param.asFloat();
}



BLYNK_CONNECTED() {
  ReCnctCount = 0;
  if (isFirstConnect) {
    // Request Blynk server to re-send latest values for all pins
    Blynk.syncAll();
    isFirstConnect = false;
  }
}

void check_conditions_heat() {
  if (target_tem - frg_getTem() > 0.0) {
    if (!heat_running) {
      heat_running = true;
      led_heat = true;
      if (system_under_test) {
        test_inc_frg_tem(0.5);
        //test_dec_frg_hum(0.25);
      }
      //Add code to turn on heat
      digitalWrite(D6, HIGH);   // turn the LED on (HIGH is the voltage level)
    }
  } else {
    turn_off_heat();
  }
}

void turn_off_heat() {
  if (heat_running) {
    heat_running = false;
    led_heat = false;
    //Switch the heater output off
    digitalWrite(D6, LOW);
  }
}

void turn_off_cool() {
  Serial.print("Checking if cooling should turn off - ");
  if (compressor_ready && cool_running) {
    //Blynk.virtualWrite(V99, "Fridge off");
    cool_running = false;
    led_cool = false;
    //Switch the fridge compressor output off
    digitalWrite(D8, LOW);
    in_compressor_downtime = true;
    compressor_ready = false;
    cool_comp_rest_counter = 7;
    cool_hysterisis = 3;
    //if(blynk_verbose){
    //  in_comp_downtime.on();
    //  comp_ready_led.off();
    //}
    Serial.print(" - TURNING OFF");
  }
  Serial.println("");
}

void check_conditions_cool() {
  //Compressor is being started, setup protection timer
  Serial.print("Checking if cooling should turn on - ");
  Serial.print(frg_getTem());
  Serial.print(" - Cool Running - ");
  Serial.print(cool_running);
  Serial.print(" - ");
  //Serial.println("Calling for cooling");
  if ((target_tem + cool_hysterisis ) < frg_getTem() ) {
    if (!cool_running) {
      if (compressor_ready) {
        in_compressor_uptime = true;
        compressor_ready = false;
        cool_comp_rest_counter = 7;
        //if(blynk_verbose){
        //  in_comp_uptime.on();
        //  comp_ready_led.off();
        //}
        //Serial.println("Uptime cooling Started!!");
      }
      Serial.print(" - TURNING ON");
      cool_running = true;
      led_cool = true;
      cool_hysterisis = -2;
      //Add code to turn on cool
      digitalWrite(D8, HIGH);   // turn the LED on (HIGH is the voltage level)
    }
  } else {
    turn_off_cool();
  }
  Serial.println("");
  if (system_under_test && cool_running) {
    test_dec_frg_tem(0.5);
    //test_dec_frg_hum(0.25);
  }
}


void turn_off_hum() {
  if (hum_running) {
    hum_running = false;
    led_hum = false;
    //Switch the humidifier output off
    digitalWrite(D0, LOW);
    hum_hysterisis = 3;
  }
}


void check_conditions_hum() {
  if ((target_hum - frg_getHum() - hum_hysterisis > 0) && !cool_running && !dhum_running && hum_has_fridge_stabilized) {
    hum_running = true;
    led_hum = true;
    hum_hysterisis = 0;
    //Add code to turn on hum
    digitalWrite(D0, HIGH);   // turn the LED on (HIGH is the voltage level)
  } else {
    turn_off_hum();
  }
  if (system_under_test && hum_running ) {
    test_inc_frg_hum(0.5);
  }
}


void turn_off_dhum() {
  if (dhum_running) {
    dhum_running = false;
    led_dhum = false;
    digitalWrite(D7, LOW);   // switch the dehumidifier output off
    dhum_hysterisis = -3;
  }
}

void check_conditions_dhum() {
  if (!cool_running && (target_hum - frg_getHum() - dhum_hysterisis < 0) && !hum_running) {
    dhum_running = true;
    led_dhum = true;
    dhum_hysterisis = 0;
    //Add code to turn on dhum
    digitalWrite(D7, HIGH);   // turn the LED on (HIGH is the voltage level)
  } else {
    turn_off_dhum();
  }
  if (system_under_test && dhum_running ) {
    test_dec_frg_hum(0.5);
  }
}

void turn_on_fan() {
  if (!fan_running && fan_enabled) {
    fan_running = true;
    fan_min_counter = 3;
    led_fan = true;
    //Add code to turn on pin
    digitalWrite(D5, HIGH);   // turn the LED on (HIGH is the voltage level)
  }
}

void turn_off_fan() {
  if (fan_running) {
    fan_running = false;
    fan_min_counter = 27;
    led_fan = false;
    //Switch the fan output off
    digitalWrite(D5, LOW);
  }
}


void util_lcd_clear_top() {
  lcd.home();
  lcd.print("                ");
  lcd.home();
}
void util_lcd_clear_bot() {
  lcd.setCursor(0, 1);
  lcd.print("                ");
  lcd.setCursor(0, 1);
}

void report_wifi() {
  util_lcd_clear_top();
  lcd.print("Wifi: ");
  if (WiFi.status() == WL_NO_SHIELD) {
    lcd.print("No Adapter");
  }
  if (WiFi.status() == WL_IDLE_STATUS) {
    lcd.print("Working...");
  }
  if (WiFi.status() == WL_NO_SSID_AVAIL) {
    lcd.print("SSID Error");
  }
  if (WiFi.status() == WL_CONNECTION_LOST) {
    lcd.print("Conn. Lost");
  }
  if (WiFi.status() == WL_DISCONNECTED) {
    lcd.print("Disconnected");
  }
  if (WiFi.status() == WL_CONNECTED) {
    lcd.print("Connected");
  }
  util_lcd_clear_bot();
  lcd.print("Blynk:");
  if (Blynk.connected()) {
    lcd.print("Connected");
  } else {
    lcd.print("Unreachabl");
  }
}



void lcd_display_temp_frige() {
  util_lcd_clear_top();
  lcd.print("FRG ");
  if (frg_sensor_working) {
    lcd.print("T:");
    lcd.print(frg_getTem(), 1);
    lcd.print(" RH:");
    lcd.print(frg_getHum(), 1);
  } else {
    lcd.print(" Error!!!");
  }
}

void lcd_display_temp_out() {
  util_lcd_clear_bot();
  lcd.print("OUT ");
  if (out_sensor_working) {
    lcd.print("T:");
    lcd.print(out_getTem(), 1);
    lcd.print(" RH:");
    lcd.print(out_getHum(), 1);
  } else {
    lcd.print(" Error!!!");
  }
}


void read_and_request_new_temp_hum() {
  //Read past result from frg, request a new conversion and update Blynk
  frg_sensor_working  = frg.readSample();
  //Read past result from out, request a new conversion and update Blynk
  out_sensor_working  = out.readSample();
  if (!frg_sensor_working) {
    led_error = true;
    frg_sensor_issue_cnt++;
  } else {
    if (frg_sensor_issue_cnt > 0) {
      frg_sensor_issue_cnt--;
    }
  }
  if (!out_sensor_working) {
    led_error = true;
    out_sensor_issue_cnt++;
  } else {
    if (out_sensor_issue_cnt > 0) {
      out_sensor_issue_cnt --;
    }
  }
}

void sensor_error(int sensor_failed) {
  //0 frg sensor
  //1 out sensor
  switch (sensor_failed) {
    case 0:
      cool_enabled = false;
      heat_enabled = false;
      hum_enabled = false;
      dhum_enabled = false;
      fan_enabled = false;
      lcd.clear();
      lcd.home();
      lcd.print("   Frg Sensor");
      lcd.setCursor(0, 1);
      lcd.print("Error Cnt:");
      lcd.print(frg_sensor_issue_cnt);
      break;
    case 1:
      lcd.clear();
      lcd.home();
      lcd.print("   Out Sensor");
      lcd.setCursor(0, 1);
      lcd.print("Error Cnt:");
      lcd.print(out_sensor_issue_cnt);
      //Error led on
      break;
  }
}

BLYNK_WRITE(V0) {
  // Use of syncAll() will cause this function to be called
  system_stage = param.asInt();
  switch (system_stage) {
    case 1:
      //Add code for heater, cooler, fan and hum off
      target_tem = 0.0;
      target_hum = 0.0;
      stage_runtime = 0.0;
      duration_timer = 0;
      cool_enabled = false;
      heat_enabled = false;
      hum_enabled = false;
      dhum_enabled = false;
      fan_enabled = false;
      break;
    case 2:
      target_tem = 23.8;
      target_hum = 90;
      stage_runtime = hour_1 * 36;
      break;
    case 3:
      target_tem = 12;
      target_hum = 85;
      stage_runtime = day_1 * 7;
      break;
    case 4:
      target_tem = 12;
      target_hum = 75;
      stage_runtime = day_1 * 23;
      break;
    case 5:
      target_tem = 12;
      target_hum = 65;
      stage_runtime = day_1 * 23;
      break;
  }
  if (system_stage != 1) {
    blynk_display_remaining_time();
    enlapsed_time_minutes = 0;
    fan_enabled = true;
    turn_on_fan();
    hum_enabled = true;
    dhum_enabled = true;
  }
} //V0


void setup() {
  pinMode(D0, OUTPUT);
  pinMode(D5, OUTPUT);
  pinMode(D6, OUTPUT);
  pinMode(D7, OUTPUT);
  pinMode(D8, OUTPUT);


  cool_enabled = false;
  heat_enabled = false;
  hum_enabled = false;
  dhum_enabled = false;
  fan_enabled = false;



  Serial.begin(9600);
  //Serial.println("\nSetup Starting");
  lcd.init(); // initialize the lcd
  lcd.init(); // initialize the lcd

  // Print a message to the LCD.
  lcd.backlight();
  lcd.setCursor(4, 0);
  lcd.print("Gluttony");
  delay(2000);


  Wire.begin(); //Initialize Sensors


  out.setAccuracy(SHTSensor::SHT_ACCURACY_HIGH);
  frg.setAccuracy(SHTSensor::SHT_ACCURACY_HIGH);



  //Check both temp/hum sensors are accesible
  lcd.clear();
  lcd.home();
  lcd.print("FRG Sen Init");
  lcd.setCursor(0, 1);
  lcd.print("OUT Sen Init");
  bool sensors_ready = false;

  while (!sensors_ready) {
    frg_sensor_working = frg.init();
    out_sensor_working = out.init();
    lcd.setCursor(13, 0);
    if (frg_sensor_working) {
      lcd.print("OK ");
    } else {
      lcd.print("...");
    }
    lcd.setCursor(13, 1);
    if (out_sensor_working) {
      lcd.print("OK ");
    } else {
      lcd.print("...");
    }
    if (frg_sensor_working && out_sensor_working) {
      sensors_ready = true;
    } else {
      //Serial.print("frg Code:");
      //Serial.println(frg_sensor_working);
      //Serial.print("out Code:");
      //Serial.println(out_sensor_working);
    }
    delay(2000);
  }

  read_and_request_new_temp_hum();
  lcd_display_temp_frige();
  lcd_display_temp_out();
  delay(2000);

  //Connect to WiFi
  WiFi.begin(ssid, pass);
  lcd.clear();
  lcd.home();
  lcd.setCursor(6, 0);
  lcd.print("Wifi");
  lcd.setCursor(2, 1);
  lcd.print(ssid);
  delay(1000);
  while (WiFi.status() != WL_CONNECTED) {
    report_wifi();
    delay(1000);
  }


  lcd.clear();
  lcd.home();
  lcd.setCursor(1, 0);
  lcd.print("Connecting to");
  lcd.setCursor(4, 1);
  lcd.print("Blynk...");
  Blynk.config(auth);
  while (!Blynk.connected()) {
    delay(1000);
    Blynk.connect();
  }

  //Set Dummy Targets
  Blynk.virtualWrite(V13, 0);
  Blynk.virtualWrite(V14, 0);
  Blynk.virtualWrite(V22, 0);
  Blynk.virtualWrite(V32, 0);
  Blynk.virtualWrite(V20, 0);
  Blynk.virtualWrite(V30, 0);


  //Manually turn off all devices
  //if(blynk_verbose){
  //  in_comp_downtime.off();
  //  in_comp_uptime.off();
  //  comp_ready_led.on();
  //}

  house_keeping_timer_1sec = millis(); //initial start time
  lcd_update_timer = millis();
  house_keeping_timer_1min = millis();
  histogram_update_timer = millis();
  //Serial.println("\Setup Done");

}

void report_to_blynk(){
  if ((WiFi.status() == WL_CONNECTED) && Blynk.connected()) {
    if (led_error) {
      error.on();
    } else {
      error.off();
    }
    if (led_heat) {
      heat.on();
    } else {
      heat.off();
    }
    if (led_cool) {
      cool.on();
    } else {
      cool.off();
    }
    if (led_hum) {
      hum.on();
    } else {
      hum.off();
    }
    if (led_dhum) {
      dhum.on();
    } else {
      dhum.off();
    }
    if (led_fan) {
      fan.on();
    } else {
      fan.off();
    }
    switch (blynk_report) {
      case 0:  
        if (frg_sensor_working) {
          Blynk.virtualWrite(V22, frg_getTem());
          Blynk.virtualWrite(V32, frg_getHum());
        } else {
          Blynk.virtualWrite(V22, 0);
          Blynk.virtualWrite(V32, 0);
        }
        if (out_sensor_working) {
          Blynk.virtualWrite(V20, out_getTem());
          Blynk.virtualWrite(V30, out_getHum());
        } else {
          Blynk.virtualWrite(V20, 0);
          Blynk.virtualWrite(V30, 0);
        }
        if (frg_sensor_issue_cnt > 10) {
          Blynk.virtualWrite(V99, "Error with FRG sensor");
        }
        if (out_sensor_issue_cnt > 10) {
          Blynk.virtualWrite(V99, "Error with OUT sensor");
        }
        update_screen = true; //Update values in LCD screen
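        blynk_report++;
        break; //without this, case 0 falls through into case 1 and both sets of writes go out every second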
      case 1:
        Blynk.virtualWrite(V21, out_getTem());
        Blynk.virtualWrite(V23, frg_getTem());
        Blynk.virtualWrite(V31, out_getHum());
        Blynk.virtualWrite(V33, frg_getHum());
        Blynk.virtualWrite(V13, target_tem);
        Blynk.virtualWrite(V14, target_hum);
        if (!heat_running && !cool_running) {
          Blynk.virtualWrite(V15, 5);
        }
        if (heat_running) {
          Blynk.virtualWrite(V15, 30);
        }
        if (cool_running) {
          Blynk.virtualWrite(V15, 17.5);
        }
        if (!hum_running && !dhum_running && !cool_running) {
          Blynk.virtualWrite(V16, 0);
        }
        if (hum_running && !dhum_running && !cool_running) {
          Blynk.virtualWrite(V16, 75);
        }
        if (!hum_running && dhum_running && !cool_running) {
          Blynk.virtualWrite(V16, 25);
        }
        if (!hum_running && !dhum_running && cool_running) {
          Blynk.virtualWrite(V16, 50);
        }
        update_screen = true;
        blynk_report=0;
      break;
    }
  } else {
    update_screen = false; //Pause the normal LCD pages while the connection status is shown
    report_wifi();
  }
}

void loop() {
  
  timer.run();
  if(Blynk.connected() && WiFi.status() == WL_CONNECTED){
    Blynk.run();
  } else if (ReCnctFlag == 0) {  // If NOT connected and not already trying to reconnect, set timer to try to reconnect in 10 seconds
    ReCnctFlag = 1;  // Set reconnection Flag
    Serial.println("Starting reconnection timer in 10 seconds...");
    timer.setTimeout(10000L, []() {  // Lambda Reconnection Timer Function
      ReCnctFlag = 0;  // Reset reconnection Flag
      ReCnctCount++;  // Increment reconnection Counter
      if (WiFi.status() != WL_CONNECTED) {
        report_wifi();
        update_screen = false;
        WiFi.reconnect();
      } else {
        Serial.print("Attempting reconnection #");
        Serial.println(ReCnctCount);
        lcd.clear();
        lcd.home();
        lcd.print("Blynk Unreachabl");
        lcd.setCursor(0, 1);  // second row (column 0, row 1)
        lcd.print("Attempts:");
        lcd.print(ReCnctCount);
        update_screen = false;
        Blynk.connect();  // Try to reconnect to the server
      }
    });  // END Timer Function
  }

 
  currentMillis = millis();  //get the current "time"
  //Update Running Components This should probably be done when they toggle
  if (currentMillis - house_keeping_timer_1sec > seconds_1 ) {
    house_keeping_timer_1sec = millis();

    //Select if main temperature actuator is heating or cooling
    if (!heat_running && !cool_running && system_stage != 1) {
      if ((out_getTem() > target_tem) || (frg_getTem() > target_tem + 5)) {
        //Outside is hotter than target, need to cool
        cool_enabled = true;
        heat_enabled = false;
      } else {
        cool_enabled = false;
        heat_enabled = true;
      }
    }

    //Refresh the data from Blynk (except histograms)
    //Testing Data
    if (heat_enabled && !frg_sensor_issue_detected) {
      check_conditions_heat();
    }
    if (cool_enabled && !frg_sensor_issue_detected) {
      check_conditions_cool();
    }
    if (hum_enabled && !frg_sensor_issue_detected) {
      check_conditions_hum();
    }
    if (dhum_enabled && !frg_sensor_issue_detected) {
      check_conditions_dhum();
    }
    if (frg_sensor_issue_cnt > 10) {
      frg_sensor_issue_detected = true;
      sensor_error(0);
    } else {
      frg_sensor_issue_detected = false;
    }
    if (out_sensor_issue_cnt > 10) {
      out_sensor_issue_detected = true;
      sensor_error(1);
    } else {
      out_sensor_issue_detected  = false;
    }
    read_and_request_new_temp_hum();
    ////Serial.println("House Keeping Done \n");
  }


  if (currentMillis - histogram_update_timer >= seconds_1) {
    histogram_update_timer  = millis();
    report_to_blynk();
  }


  if (currentMillis - house_keeping_timer_1min  >= minutes_1) {
    house_keeping_timer_1min = millis();
    if (system_stage != 0 && (WiFi.status() == WL_CONNECTED) && Blynk.connected()) {
      blynk_display_remaining_time();
    }
    enlapsed_time_minutes = enlapsed_time_minutes + 1;

    //Turn the fan on and off as appropriate if the fan is enabled
    if (fan_min_counter > 1 && fan_enabled) {
      fan_min_counter--;
    } else {
      if (fan_running) {
        turn_off_fan();  // on period is over: rest the fan
      } else {
        turn_on_fan();   // off period is over: run the fan again
      }
    }
    if (!fan_enabled) {
      turn_off_fan();
    }

    //Update cooler compressor timeouts
    if (compressor_ready) {
      hum_has_fridge_stabilized = true;
    } else {
      hum_has_fridge_stabilized = false;
    }
    if (cool_comp_rest_counter > 0) {
      cool_comp_rest_counter--;
    } else {
      //Timer is up
      if (in_compressor_uptime) {
        in_compressor_uptime = false;
        compressor_ready = true;
      }
      if (in_compressor_downtime ) {
        in_compressor_downtime = false;
        compressor_ready = true;
      }
    }
  }

  //LCD update procedure
  if (currentMillis - lcd_update_timer  >= seconds_2 ) {
    lcd_update_timer = millis();
    if (update_screen) {
      switch (screen_counter) {
        case 0:
          report_wifi();
          screen_counter ++;
          break;
        case 1:
          lcd_display_temp_frige();
          lcd_display_temp_out();
          screen_counter =0;
          break;
      }
    }
  }
}


Hello… our goal here is to help others learn about Blynk and guide them to resources they need to learn… we are not here to troubleshoot your code for you :wink:

So with that in mind…

Your overloaded void loop() (for Blynk application) is probably the root of your issue… there could be much more, but start here.

http://help.blynk.cc/getting-started-library-auth-token-code-examples/blynk-basics/keep-your-void-loop-clean

Hi Gunner

Thanks a lot for your feedback. I understand the purpose of the forum is not to debug my code, but I suspect the ESP8266 hanging is related to how I am using Blynk… Just adding code to check that the ESP8266 is connected to Blynk before doing virtualWrites increased the period between hangs from a few hours to a few days. I wanted to get some feedback on whether I am using it incorrectly…

I will look at improving the void loop(), but I don't see how I can simplify it more (the code can definitely be improved).
Everything except Blynk.run() is on a timer to avoid overloading. There are basically 3 timers:

Every second:
- Triggers a new sensor read and checks the last sensor readings
- Checks that temperature/humidity are within spec
- Sends data to Blynk

Every 2 seconds:
- Sends data to the LCD

Every minute:
- Calculates the new uptime and checks if the fan should run
- Checks the compressor timeout

All the code is straightforward with minimal operations, and should run quite fast (well under a second).

So most of the time the ESP8266 is idling… except for Blynk.run(). Is it OK to run it so often?

Thanks
Santiago

Excepting some occasional additions, like your re-connection routine, this is an ideal void loop() with Blynk…

void loop() {
  Blynk.run();
  timer.run();
}

Yours also has all of this…

  currentMillis = millis();  //get the current "time"
  //Update Running Components This should probably be done when they toggle
  if (currentMillis - house_keeping_timer_1sec > seconds_1 ) {
    house_keeping_timer_1sec = millis();

    //Select if main temperature actuator is heating or cooling
    if (!heat_running && !cool_running && system_stage != 1) {
      if ((out_getTem() > target_tem) || (frg_getTem() > target_tem + 5)) {
        //Outside is hotter than target, need to cool
        cool_enabled = true;
        heat_enabled = false;
      } else {
        cool_enabled = false;
        heat_enabled = true;
      }
    }

    //Refresh the data from Blynk (except histograms)
    //Testing Data
    if (heat_enabled && !frg_sensor_issue_detected) {
      check_conditions_heat();
    }
    if (cool_enabled && !frg_sensor_issue_detected) {
      check_conditions_cool();
    }
    if (hum_enabled && !frg_sensor_issue_detected) {
      check_conditions_hum();
    }
    if (dhum_enabled && !frg_sensor_issue_detected) {
      check_conditions_dhum();
    }
    if (frg_sensor_issue_cnt > 10) {
      frg_sensor_issue_detected = true;
      sensor_error(0);
    } else {
      frg_sensor_issue_detected = false;
    }
    if (out_sensor_issue_cnt > 10) {
      out_sensor_issue_detected = true;
      sensor_error(1);
    } else {
      out_sensor_issue_detected  = false;
    }
    read_and_request_new_temp_hum();
    ////Serial.println("House Keeping Done \n");
  }


  if (currentMillis - histogram_update_timer >= seconds_1) {
    histogram_update_timer  = millis();
    report_to_blynk();
  }


  if (currentMillis - house_keeping_timer_1min  >= minutes_1) {
    house_keeping_timer_1min = millis();
    if (system_stage != 0 && (WiFi.status() == WL_CONNECTED) && Blynk.connected()) {
      blynk_display_remaining_time();
    }
    enlapsed_time_minutes = enlapsed_time_minutes + 1;

    //Turn the fan on and off as appropriate if the fan is enabled
    if (fan_min_counter > 1 && fan_enabled) {
      fan_min_counter--;
    } else {
      if (fan_running) {
        turn_off_fan();  // on period is over: rest the fan
      } else {
        turn_on_fan();   // off period is over: run the fan again
      }
    }
    if (!fan_enabled) {
      turn_off_fan();
    }

    //Update cooler compressor timeouts
    if (compressor_ready) {
      hum_has_fridge_stabilized = true;
    } else {
      hum_has_fridge_stabilized = false;
    }
    if (cool_comp_rest_counter > 0) {
      cool_comp_rest_counter--;
    } else {
      //Timer is up
      if (in_compressor_uptime) {
        in_compressor_uptime = false;
        compressor_ready = true;
      }
      if (in_compressor_downtime ) {
        in_compressor_downtime = false;
        compressor_ready = true;
      }
    }
  }

  //LCD update procedure
  if (currentMillis - lcd_update_timer  >= seconds_2 ) {
    lcd_update_timer = millis();
    if (update_screen) {
      switch (screen_counter) {
        case 0:
          report_wifi();
          screen_counter ++;
          break;
        case 1:
          lcd_display_temp_frige();
          lcd_display_temp_out();
          screen_counter =0;
          break;
      }
    }
  }
}

Yes, much of it is in the form of timed if() blocks, but each of those should really be a separate function of its own, called by BlynkTimer.
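
As a rough illustration of that structure, the sketch below moves each timed block into its own job and registers it once, leaving the loop with nothing but run(). To keep it compilable off-device it uses IntervalTimer, a hypothetical stand-in written for this example and not the real BlynkTimer; with Blynk the registration would be something like timer.setInterval(1017L, controlPass) in setup() and timer.run() in loop(). The job names and periods are placeholders, not values taken from the sketch above.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for BlynkTimer, only so this compiles off-device.
class IntervalTimer {
 public:
  void setInterval(uint32_t periodMs, std::function<void()> task) {
    slots_.push_back(Slot{periodMs, 0, std::move(task)});
  }
  // Call as often as possible; runs every task whose period has elapsed.
  void run(uint32_t nowMs) {
    for (Slot& s : slots_) {
      if (nowMs - s.lastRunMs >= s.periodMs) {
        s.lastRunMs = nowMs;
        s.task();
      }
    }
  }

 private:
  struct Slot {
    uint32_t periodMs;
    uint32_t lastRunMs;
    std::function<void()> task;
  };
  std::vector<Slot> slots_;
};

// Counters stand in for the real work (control logic, Blynk writes, LCD).
struct RunCounts {
  int control = 0;
  int report = 0;
  int lcd = 0;
};

// Simulates durationMs milliseconds of a loop() that only calls run().
RunCounts simulate(uint32_t durationMs) {
  RunCounts counts;
  IntervalTimer timer;
  // "setup()": register each job once, with slightly offset periods.
  timer.setInterval(1017, [&counts] { ++counts.control; });  // control pass
  timer.setInterval(1531, [&counts] { ++counts.report; });   // report to Blynk
  timer.setInterval(2000, [&counts] { ++counts.lcd; });      // LCD refresh
  // "loop()": nothing but the timer.
  for (uint32_t now = 0; now <= durationMs; ++now) {
    timer.run(now);
  }
  return counts;
}
```

Calling simulate(10000) runs ten simulated seconds; each job is driven purely by the timer, and the loop body never grows as more jobs are added.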

Also try to stagger the timers so their runs don't intersect… remember that they all start counting at the exact same time, in setup().

E.g. a 1 second timer will intersect with a 2 second timer every 2nd iteration, and both will intersect with a one minute timer every 60 and 30 iterations respectively. Thus the worst case in this scenario is that every minute you have three separate functions trying to run at the same time… and so on.

Unless relatively precise timing is essential, I usually add or subtract a few ms… but with random odd numbering… e.g. 1017 (1 second), giving 17ms distance from a 2000 (2 second) timer, which in turn will only very rarely line up with a 60028 (1 minute) timer, and so on.
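
To put numbers on the staggering idea: two timers started together are due on the same millisecond at every common multiple of their periods, so the least common multiple tells you how often they collide. The helper below is plain C++ with nothing Blynk-specific in it (coincidenceMs is just an illustrative name), and it assumes every timer starts at t = 0 and fires exactly on schedule, which real timers only approximate.

```cpp
#include <cstdint>

// Greatest common divisor (Euclid's algorithm).
uint64_t gcd64(uint64_t a, uint64_t b) {
  while (b != 0) {
    uint64_t r = a % b;
    a = b;
    b = r;
  }
  return a;
}

// Time in ms between moments when two timers with periods a and b,
// both started at t = 0, are due on the same millisecond.
// This is the least common multiple of the two periods.
uint64_t coincidenceMs(uint64_t a, uint64_t b) {
  return a / gcd64(a, b) * b;
}
```

With the round periods, coincidenceMs(1000, 2000) is 2000 and all three timers line up every 60000 ms. With the offset periods, 1017 and 2000 coincide every 2034000 ms (about 34 minutes), 2000 and 60028 every 30014000 ms (about 8.3 hours), and all three together only every 30524238000 ms (roughly 353 days).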

Another good read that helps explain why an ESP8266 can “randomly” reboot…

The easiest way would be to keep your device connected to the Serial Monitor and try to find where and why it’s hanging.

When you say hanging, does it still run the code but not connect to the Cloud?

Alternatively you can send data to the Terminal widget in order to debug it.

Thanks a lot for the explanation. I will modify the code so it uses callbacks, but I am not sure this is the root cause of the issue. Even in the worst case, when all the timers intersect and their code gets executed together, nothing depends on exact timing: the next code due to run will not happen for another second, and even if it runs late nothing breaks.

As I mentioned, the ESP8266 is not rebooting; it simply freezes and stops executing any code. It does not reset or anything. It would be good if I could get it to reset when this happens, and it seems to be related to how I am using Blynk and the quality of the WiFi… Does the way I am using Blynk look reasonable?
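
One common way to get a reset out of a silent hang is a heartbeat check: loop() records millis() on every pass, and a periodic callback restarts the chip if the heartbeat is too old. The decision logic is sketched below as plain C++ so it can be checked off-device; heartbeatExpired and the 30 second limit are illustrative names and values, not anything from Blynk or the ESP8266 core. On the device the assumption is that it would be called from a Ticker callback that then calls ESP.restart(), and this only helps if timer callbacks still run during the hang, which has not been confirmed here.

```cpp
#include <cstdint>

// Illustrative limit: how long loop() may stay silent before we give up.
const uint32_t HEARTBEAT_TIMEOUT_MS = 30000;

// True when the last heartbeat is older than timeoutMs.
// Unsigned subtraction stays correct when millis() wraps around
// (roughly every 49.7 days), the same trick the sketch's own timers use.
bool heartbeatExpired(uint32_t nowMs, uint32_t lastBeatMs, uint32_t timeoutMs) {
  return static_cast<uint32_t>(nowMs - lastBeatMs) > timeoutMs;
}

// On the ESP8266 this might be wired up roughly as follows (untested):
//   volatile uint32_t lastBeat;   // in loop():  lastBeat = millis();
//   Ticker guard;                 // in setup(): guard.attach(1.0, check);
//   void check() {
//     if (heartbeatExpired(millis(), lastBeat, HEARTBEAT_TIMEOUT_MS)) {
//       ESP.restart();
//     }
//   }
```

Because the comparison works on the difference between two unsigned values, a run of 6 to 8 months crosses the millis() rollover several times without producing a false restart.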