
My system is stuck in an unbootable state, and the thing we used to know as "Safe mode" no longer exists in any meaningful way in Windows 10 build 1909.

I have reason to believe that Windows updates have trashed the contents of the BCD registry in a way that the command-line tools cannot repair. (If you boot from recovery USB media, go to a command prompt, and run bootrec /rebuildbcd, it reports zero Windows installations detected, a classic symptom of total BCD corruption.)

The replacement for Safe Mode is that several failed boots lead to "Preparing Automatic Repair", which does not reliably reach the repair screens. About half of the time it hangs before showing the repair screen. The other half of the time it either completes its automatic repair and reboots, or says it couldn't repair anything and lets you access the advanced options, none of which restore the ability to boot. BIOS options, especially the UEFI-specific boot options, have been noted and tried one by one, in all possible combinations. (Two to the power of 8 combinations: UEFI with secure boot on and off, SGX on and off, powerstep on and off, power save on and off, and so on.)
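To make the "two to the power of 8" figure concrete, here is a minimal sketch that enumerates every on/off combination of eight firmware toggles. Only the first five option names come from the description above; `option_6` through `option_8` are hypothetical placeholders for the settings covered by "and so on", and none of the names are actual Lenovo BIOS identifiers.

```python
from itertools import product

# Illustrative names only: the first five are mentioned in the post,
# the last three are hypothetical placeholders.
OPTIONS = [
    "uefi_boot", "secure_boot", "sgx", "powerstep", "power_save",
    "option_6", "option_7", "option_8",
]


def all_combinations(options):
    """Yield one dict per on/off combination of the given options."""
    for values in product((False, True), repeat=len(options)):
        yield dict(zip(options, values))


combos = list(all_combinations(OPTIONS))
print(len(combos))  # number of distinct settings combinations to try
```

A list like this is mainly useful as a checklist, so that each tested combination can be ticked off and none is tried twice.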

The system is a Lenovo ThinkPad (Edge series) E560 which has Lenovo Vantage/Update on it and has been updated to the latest BIOS and drivers as of April 24, 2020. The system runs Windows 10 build 1709 fine, and as long as it is never updated beyond that build it never bricks itself.

The system boots to this screen:

enter image description here

Recently Microsoft has been releasing updates that may or may not contribute to systems becoming unbootable (known colloquially as "bricking my system").

I can't confirm if one of those updates is on my non-bootable system but I can state the following symptoms, which include that the failure to boot always happens after installing an update.

I have now reproduced the issue ten times, each time from a fresh Windows install. In one case only the updates were installed, with NO third-party apps or drivers or anything else, and the system still failed to boot from its consumer SSD main disk. One case was reproduced with UEFI boot and the disk formatted GPT; in another the disk was formatted MBR and the BIOS was set for non-UEFI legacy boot.

Windows 10 versions 1709 and prior have had no boot up problems on my laptop, but starting with build 1803, continuing into build 1903 and 1909, Microsoft Updates have routinely (more than 10 times) bricked my laptop. The first failure to boot has always occurred after a windows update screen tells me I need to reboot after updates.

The system has been bricked by updates. By bricked, I mean, you get to the blue windows logo and spinning dots and the system will never complete boot. Unlike Windows XP or even Windows 7 there is relatively little logging or on screen messaging to show where and when the boot has failed.

I have a working Win7 sata disk I can reinsert when I get tired of Windows 10 doing this to me.

Periodically, when a new Windows build becomes available, I download the Windows Media Creation Tool and try again, and within two days, and within 10 boot-ups, Windows 10 always becomes unbootable again. This has happened ever since build 1709. The most recent 8 fresh reinstallation attempts were on build 1903 and build 1909.

Answers to this question in the form "reinstall everything" are not helpful as I am already well aware of how to do it and have done it ten times.

I have determined that while UEFI bios settings can prevent windows from booting, none of the options available in my bios will make this machine boot again, and none of them changed prior to the boot problems, so let's call the BIOS settings a non-issue.

I have determined that you can get some information from the system32\logfiles\srt folder and from some folders whose names start with $ in the root of the main C: drive, if you boot to a command prompt or a recovery boot CD (even a bootable Linux distro, so you can browse files on the disk). I have determined that the file SrtTrail.txt says a recent driver update may be responsible, but I don't know what to do with that information.

I have determined that command line from the recovery environment will not let me get anywhere because "bootrec /rebuildbcd" reports 0 windows installations found, which is often a sign of BCD problems. The tool will not do anything.

I have determined that MBR and GPT must match the boot setup of the laptop. If the disk is MBR and the bios is legacy, all good. If the disk is GPT and the bios is set UEFI, all good. These are checked out and are correct.
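The matching rule above can be written down as a tiny check. This is only a sketch of the rule as stated here (MBR with legacy, GPT with UEFI); the function name `boot_setup_matches` is made up for illustration, and it does not query the disk or the firmware, so both values have to be read off by hand.

```python
# The two pairings described above; anything else counts as a mismatch.
VALID_PAIRS = {("MBR", "LEGACY"), ("GPT", "UEFI")}


def boot_setup_matches(partition_style: str, firmware_mode: str) -> bool:
    """Return True if the disk's partition style matches the firmware boot mode."""
    pair = (partition_style.strip().upper(), firmware_mode.strip().upper())
    return pair in VALID_PAIRS


print(boot_setup_matches("GPT", "UEFI"))    # matching pair
print(boot_setup_matches("MBR", "UEFI"))    # mismatched pair
```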

There is a feature in Windows introduced over 1 year ago to automatically remove updates. Screenshot of that advanced mode is shown here for people who have never seen it:

enter image description here

That feature refuses to uninstall anything, and the recovery features that can go back to a previous saved snapshot (System Restore) are not accessible.

Question: Is there anything I can do to repair this? (Probably in the form of some commands you can run from the recovery winre command prompt)

Warren P
  • Windows updates has always been full of bugs, especially in windows 10. So, I have stuck myself at v1709 and I am never going to update it soon. I have experienced such bootloop many times after updates, and in all those cases ended up giving a fresh install of windows 10 v1709. –  Apr 26 '20 at 17:28
  • Lenovo systems (my own and my clients) have survived updates through V1909 (and V2004 on one of my own systems) and continued to boot. Why your system might not boot: out of date BIOS (update it) and possibly legacy software (update to newest versions if you can). If you get started, use Software Updater to update all drivers. – John Apr 26 '20 at 17:35
  • I rather suspect that Lenovo's bios updates are the CAUSE of the problem. I should state in the question the actual hardware and that the bios is up to date. I'm also wondering about the dozens of bios CONFIGURATION state combinations that could be untested and thus non working, by Lenovo or Microsoft. I have now tested both legacy and UEFI boot system builds (from a fresh install of 1903 or 1909) and both have gone this way, one even without any third party software install, build 1909 on a MBR partition table, fresh install died on its FIRST reboot after installing windows. – Warren P Apr 26 '20 at 17:36
  • Also notable is that KB4549951 seems to have caused the boot loop on my system. It was installed today and the next boot up afterwards caused boot failures. – Warren P Apr 26 '20 at 18:02
  • I have learned a lot in the last 24 hours from fighting this problem but if anything the whole thing has just gotten more and more bizarre and hopeless. The boot time landscape of PC and UEFI and EFI, and SGX and everything is hairy as hell. And these poorly documented areas of Windows appear to have changed in Windows 10 v19xx. For example the BCD files are not in \boot\bcd, and the commandline tools to rebuild the BCD no longer work in Windows 10 version 1909 build 18363.778 – Warren P Apr 27 '20 at 22:25
  • This has nothing to do with a boot loop due to a faulty driver or bad update, but when you must repair your boot sector, you can try "bootsect /nt60 : /mbr" from a command line on an install medium ( must be replaced by your system drive, e.g. "C:\"). In my case, on an aging Lenovo laptop, the latest KB started to install, then I got the screen saying that the update had changed its mind and would I please wait for Windows to put things back the way they were. Don't remember the exact KB number, ended with "45". No wonder... –  Apr 28 '20 at 17:03
  • Re-install. Windows has a lot of problems. Or try to repair the MBR from a live USB or Windows CD. Here are other options: https://www.dell.com/support/article/cs-cz/sln300987/how-to-repair-the-efi-bootloader-on-a-gpt-hdd-for-windows-7-8-8-1-and-10-on-your-dell-pc?lang=en – CFCBazar Apr 28 '20 at 18:38
  • BCD doesn’t use or contain a “registry”; most updates delivered through Windows Update would never update your boot configuration data. Microsoft has confirmed there might be a possible problem with [KB4549951](https://docs.microsoft.com/en-us/windows/release-information/status-windows-10-1909#400msgdesc). I would configure Windows to postpone Cumulative updates for 14 days to avoid issues like you describe. However, Safe Mode most certainly does exist. While it’s understandable you’re upset, your question has way too many questions and is written in a way where your frustration is clearly shown. – Ramhound Apr 28 '20 at 22:35
  • You will have to be more precise in your description of this unknown mechanism to remove updates. – Ramhound Apr 28 '20 at 22:40
  • The mechanisms within WinRE for removing updates are well documented on the internet: https://www.winhelponline.com/blog/uninstall-windows-10-update-offline-windows-recovery/ – Warren P Apr 28 '20 at 23:38
  • I have removed all the "multipart" questions from this question. I have updated the rest in the form of variables that have been isolated and discarded, leaving no further options. The BCD may not be called a registry but it also may be called that. https://neosmart.net/forums/threads/every-time-i-run-easybcd-it-tells-me-the-opening-bcd-registry-error.9058/ – Warren P Apr 28 '20 at 23:45

1 Answer


Unfortunately, after dozens of hours of study of this problem, I can say for sure that:

  1. Some third party commercial tools claim to help in this case, but none of them help with this issue. Nor does Microsoft's own recovery toolset help with this issue.

  2. This issue is not a mere issue of BCD Store corruption, nor something that can be fixed by the boot repair features in windows 10.

  3. This issue reproduces on specific hardware, repeatably, but not on other hardware.

Thus:

A. A reinstall is inevitable.

B. A reinstall of the same build 1903 or 1909 on hardware that corrupted itself once will do it again, probably due to bugs in windows.

C. If you do get it to boot again after a week of not trying, run sfc /scannow; you may see crazy things in there, including but not limited to this in CBS.log:

2020-06-18 15:12:57, Info                  CBS    Failed to load persisted information for session: 30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Info                  CBS    Failed to load session: 30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Error                 CBS    Failed to load Session:30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Info                  CBS    Failed to initialize session, szSessionID: 30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Info                  CBS    Failed to call QuerySessionStatus on TiWorker session [HRESULT = 0x800f0840]

Maybe windows 20xx builds will be less rubbish. Hope these tips help someone.

I do not know what the "session corrupt" thing is, but I expect that whatever DISM stores its state into, is itself corrupted.
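If you want to see how often a given error shows up in CBS.log, a small script can pull out the HRESULT codes. The sketch below is an assumption-laden example: the regular expressions are inferred only from the five lines quoted above, not from any documented CBS.log format, and `parse_cbs_lines` is a made-up helper name.

```python
import re
from collections import Counter

# Patterns inferred from the quoted excerpt only; real logs may contain other shapes.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\s+"
    r"(?P<level>\w+)\s+(?P<component>\w+)\s+(?P<message>.*)$"
)
HRESULT_RE = re.compile(r"HRESULT = (0x[0-9a-fA-F]{8})(?: - (\w+))?")

SAMPLE = """\
2020-06-18 15:12:57, Info                  CBS    Failed to load persisted information for session: 30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Info                  CBS    Failed to load session: 30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Error                 CBS    Failed to load Session:30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Info                  CBS    Failed to initialize session, szSessionID: 30809196_732444271 [HRESULT = 0x800f0840 - CBS_E_SESSION_CORRUPT]
2020-06-18 15:12:57, Info                  CBS    Failed to call QuerySessionStatus on TiWorker session [HRESULT = 0x800f0840]
"""


def parse_cbs_lines(text):
    """Parse log text into dicts with timestamp, level, message, and HRESULT."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not look like the quoted format
        entry = match.groupdict()
        hresult = HRESULT_RE.search(entry["message"])
        entry["hresult"] = hresult.group(1) if hresult else None
        entry["symbol"] = hresult.group(2) if hresult else None
        entries.append(entry)
    return entries


entries = parse_cbs_lines(SAMPLE)
counts = Counter(e["hresult"] for e in entries)
errors = [e for e in entries if e["level"] == "Error"]
print(counts)
print(len(errors), "line(s) logged at Error level")
```

To run it against a real log, replace `SAMPLE` with the contents of the file copied off the affected disk; the encoding of that file is something you would need to check yourself.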

Warren P
  • _(Perhaps this can help someone else)_ Since it occurs on a clean install, it's a hardware issue, either with a component itself or a driver for a component - run the built-in UEFI firmware hardware diagnostics, first a short test of all components, then a long one. Did you check the RAM modules and S.M.A.R.T. output for the drive _(for SSDs, the wear leveling value is important)_: `smartctl -a /dev/`? The `CBS` session error likely refers to `%WinDir%`||`%WinDir%\WinSxS` [corruption](https://superuser.com/a/1579031/529800). For `BootRec`, always issue: `BootRec /FixMBR && BootRec /RebuildBCD` – JW0914 Mar 27 '22 at 13:48