Recover Juniper SRX from failed boot

[Edit 2018-05-14] This article describes how to boot into the backup partition from u-boot. If your primary still boots, try “request system reboot media internal” (Branch SRX) or “request system reboot media disk” (High-End SRX) instead. Thanks to batdosi for this. They also mention sysctl from the shell, which may be another way to set the bootdev variable if the system still boots.

I have a Juniper SRX240H in the lab. I decided to load a beta version of JunOS, which left the unit unable to boot. I also could not use the loader> prompt to recover from TFTP.

The symptoms were:

  • During boot, the SRX would experience a fault and drop to the db> prompt. I believe this to be a debugger; the prompt matches FreeBSD's kernel debugger (ddb). Entering “c” causes it to reboot again.
  • At the loader> prompt, I cannot execute setenv: I get a “stack underflow” error. This means I cannot install JunOS from TFTP.

I may have been able to recover this system using a USB key, but I am remote to my lab: all I have is serial console.

I resolved the issue by entering u-boot instead of the loader. The u-boot prompt appears right after power-on, and the loader prompt follows shortly after. The u-boot prompt is “Press SPACE to abort autoboot in 1 seconds”, and the loader prompt is “Hit [Enter] to boot immediately, or space bar for command prompt.”
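The two prompts appear within seconds of each other, which makes the first one easy to miss over a remote serial session. Here is a minimal sketch of how a script watching the console could tell them apart. The function name classify_prompt is hypothetical, and I am assuming the countdown number in the u-boot prompt can vary, so only the fixed part of that prompt is matched.

```python
# Hypothetical helper: decide which boot stage a chunk of console output
# belongs to. The prompt strings are the ones quoted in this article; the
# countdown value is assumed to vary, so it is left out of the match.
UBOOT_PROMPT = "Press SPACE to abort autoboot"
LOADER_PROMPT = "Hit [Enter] to boot immediately, or space bar for command prompt."


def classify_prompt(console_text):
    """Return "u-boot", "loader", or None for a chunk of console output."""
    if UBOOT_PROMPT in console_text:
        return "u-boot"
    if LOADER_PROMPT in console_text:
        return "loader"
    return None
```

To land in u-boot, the space bar has to be sent while the result is "u-boot". Sending it only once "loader" is seen ends up at the loader> prompt instead.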

In u-boot, I printed the environment variables:

=> printenv

This showed me that boot.current was set to primary.

I changed this to the alternate slice, which still held a working copy of JunOS:

=> setenv boot.current alternate
=> boot
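The step above can be sketched as a small script, for example when driving the console from a terminal server. This is only an illustration under assumptions: the helper names parse_env and switch_slice_commands are hypothetical, the environment listing is assumed to consist of name=value lines, and primary and alternate are assumed to be the only values boot.current takes.

```python
def parse_env(env_text):
    """Parse name=value lines from a u-boot environment listing into a dict."""
    env = {}
    for line in env_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name:  # skip lines that carry no name=value pair
            env[name] = value
    return env


def switch_slice_commands(env_text):
    """Return the u-boot commands that boot the slice not currently selected."""
    current = parse_env(env_text).get("boot.current")
    # Assumption: only these two values occur for boot.current.
    other = {"primary": "alternate", "alternate": "primary"}.get(current)
    if other is None:
        raise ValueError("unexpected boot.current value: %r" % (current,))
    return ["setenv boot.current " + other, "boot"]
```

Raising an error on an unknown value is deliberate: guessing a slice name on a unit that already fails to boot seemed worse than stopping.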

The system came up successfully, warned me that I had booted from the alternate slice, and rebuilt the primary slice:

**                                                                   **
**                                                                   **
**  It is possible that the primary copy of JUNOS failed to boot up  **
**  properly, and so this device has booted from the backup copy.    **
**                                                                   **
**  The primary copy will be recovered by auto-snapshot feature now. **
**                                                                   **

The auto-snapshot feature used here must be configured (set system auto-snapshot) and supported by the version of JunOS you’re running.

Lastly, I confirmed that the primary slice had been rebuilt from the backup, then rebooted:

root@SRX-Lab-2> show system snapshot media internal
Information for snapshot on       internal (/dev/da0s1a) (primary)
Creation date: Nov 13 12:53:04 2013
JUNOS version on snapshot:
  junos  : 12.1X44-D20.3-domestic
Information for snapshot on       internal (/dev/da0s2a) (backup)
Creation date: Oct 4 17:13:17 2013
JUNOS version on snapshot:
  junos  : 12.1X44-D20.3-domestic
root@SRX-Lab-2> request system reboot
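Checking the two slices by eye is fine for one lab unit. For a scripted check, a minimal sketch could parse the snapshot output and compare the versions. The function name parse_snapshots is hypothetical, and the parsing assumes the output has exactly the shape shown above; other platforms or releases may format it differently.

```python
import re


def parse_snapshots(output):
    """Map slice role ("primary", "backup") to the JunOS version reported."""
    versions = {}
    role = None
    for line in output.splitlines():
        # Header line, e.g. "Information for snapshot on internal (/dev/da0s1a) (primary)"
        header = re.search(
            r"Information for snapshot on\s+\S+\s+\(\S+\)\s+\((\w+)\)", line
        )
        if header:
            role = header.group(1)
            continue
        # Version line, e.g. "  junos  : 12.1X44-D20.3-domestic"
        version = re.match(r"\s*junos\s*:\s*(\S+)", line)
        if version and role:
            versions[role] = version.group(1)
    return versions


def slices_match(output):
    """True when both slices are present and report the same version."""
    versions = parse_snapshots(output)
    return (
        "primary" in versions
        and "backup" in versions
        and versions["primary"] == versions["backup"]
    )
```

Matching versions only show that both slices hold the same release. The creation date of the primary slice is what tells you it was rebuilt recently.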