iMX6 sometimes fails to boot the OS.
jino_a , 07-12-2021, 02:22 AM
Hello everyone,
We are using an iMX6 SOM similar to TinyRex with WinCE (Windows Embedded Compact) 2013 as the OS and eboot (Ethernet Bootloader) as the bootloader. The bootloader is stored in SPI flash. On start-up, the bootloader reads an OS image from a predefined location in eMMC and loads it into RAM to boot. Recently we've been facing an issue where the bootloader sometimes fails to load the OS during boot, i.e. the bootloader gets stuck or crashes while reading the OS image stored in eMMC, so the OS doesn't boot. The issue is rare, but once it happens we have to wait a couple of minutes before starting again, or else the bootloader gets stuck again in the same way. After waiting for some time and switching the machine on, the OS boots normally.
This issue doesn't happen on every SOM. We suspected that the bootloader on the affected SOMs had been corrupted in some way, so on a couple of SOMs with this issue we took a dump of the boot partition (SPI flash) with MFGTool using the dd command. On analyzing the dumps, we didn't find anything unusual in them.
We also discovered that reflashing the bootloader using MFGTool solves this issue (not sure whether permanently or temporarily). Basically, MFGTool first erases the boot partition using the flash_erase command (flash_erase /dev/mtd0 0 0) and then flashes the bootloader image using the dd command (dd if=$FILE of=/dev/mtd0 bs=512). After reflashing we haven't faced this issue yet. But we don't support bootloader reflashing in the field since it's a complicated procedure.
We don't have many debugging facilities for our platform, or aren't aware of them. The memory dumps were the best we could do. We're perplexed because we're unsure whether it's a hardware or a software issue. What makes it even worse is that this issue doesn't have any pattern and is totally random (sometimes it happens 2 in 5 times, sometimes 1 in 20 times, or maybe not at all), so it's tough to pinpoint the root cause.
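With failure rates this irregular, a run of good boots after reflashing says little unless the run is long enough. A minimal sketch of that arithmetic, assuming each boot fails independently with a fixed probability (a simplification, since the failure described above tends to repeat until the board has been off for a while). The function name and the 95% confidence default are illustrative choices, not something from our setup:

```python
import math


def clean_boots_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Consecutive clean boots needed before "no failures seen" is meaningful.

    Assumes every boot fails independently with probability `failure_rate`.
    If the fault were still present, the chance of seeing n clean boots in a
    row is (1 - failure_rate) ** n; we pick the smallest n that pushes this
    chance below (1 - confidence).
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be between 0 and 1 (exclusive)")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1 (exclusive)")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# The two rates mentioned above: 1 in 20 and 2 in 5.
boots_for_rare_case = clean_boots_needed(1 / 20)
boots_for_frequent_case = clean_boots_needed(2 / 5)
```

Under that assumption, a 1-in-20 failure rate needs 59 consecutive clean boots before the fault can be called gone with 95% confidence, while a 2-in-5 rate needs only 6.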
Has anyone else faced a similar issue? Any advice, suggestion or help will be much appreciated. Thanks.
robertferanec , 07-12-2021, 02:33 AM
Have you tried booting your boards in an environmental chamber at low / high temperature? In my experience, these kinds of random problems are more visible and happen more often when the boards are running at low temperatures (e.g. -20 or -40 °C).
What you are describing looks to me more like a hardware issue, especially because of:
- "we've to wait for a couple of minutes to start again or else the bootloader will again get stuck as mentioned above."
- and because it is random and happens only occasionally
Of course, it can still be a software issue, e.g. some register settings etc., but usually a software issue doesn't depend on waiting after the board is switched off (that usually means something needs to be completely discharged).
jino_a , 07-14-2021, 02:31 AM
Hi Robert,
Thank you very much for your time and reply.
Regarding temperature, in our case we have seen this issue in a normal-temperature environment.
Could you please help us pinpoint which hardware component can cause this kind of issue? How can we distinguish whether it's a hardware or a software issue?
Also I've mentioned in my post that reflashing the bootloader does solve the issue. Any suggestion for this particular scenario?
Thanks.
qdrives , 07-15-2021, 08:00 PM
If you want to check the connection in a cold environment, you could also try something like freezer spray.
And for high temperature, a normal heat gun.
Both let you pinpoint the component more precisely.
robertferanec , 07-19-2021, 02:52 AM
- The point of the environmental chamber test is not to see the issue, but to find out whether the problem is in hardware. For example, if the problem appears more often or consistently when booting at low temperatures, that may mean a hardware problem.
- Yes, you can try the freezer spray.
- It can be anything, e.g. during boot-up I have seen problems with crystals, power sequencing, current leakage (not only between components on the board but also between the tested boards and a connected computer), ...
- Reflashing helps: it may, it may not... Did you try to read the flash and see whether the bootloader was really corrupted?
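One way to make the temperature comparison concrete is to tally boot attempts per test condition and compare the failure rates. A minimal sketch, assuming each attempt is recorded by hand as a (condition label, booted OK) pair; the function name and the labels are hypothetical:

```python
from collections import defaultdict


def failure_rate_by_condition(trials):
    """Group boot attempts by condition and report failures per condition.

    `trials` is an iterable of (condition_label, booted_ok) pairs.
    Returns {condition_label: (failures, attempts, failure_rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [failures, attempts]
    for label, booted_ok in trials:
        counts[label][1] += 1
        if not booted_ok:
            counts[label][0] += 1
    return {
        label: (failures, attempts, failures / attempts)
        for label, (failures, attempts) in counts.items()
    }
```

A clearly higher failure rate in the cold (or hot) group than at room temperature would point toward hardware; similar rates across groups leave the question open.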
jino_a , 07-20-2021, 04:51 AM
Thank you Robert & qdrives for your opinions.
- We will look into the possibility of testing our setup at low temperature.
- We've dumped the contents of the SPI flash where the bootloader resides, using the MFGTool software and the command
dd if=/dev/mtd0 of=/img.dump bs=512
then compared the dump with our original bootloader image, and we didn't find any anomaly in the dump.
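For anyone repeating this check, the comparison can be scripted on the host PC. A minimal sketch, assuming the dump covers the whole mtd0 partition and is therefore longer than the bootloader image, and that the space after the image should read back as erased flash (0xFF). That second point is an assumption: a bootloader may keep its own settings in that region, so a non-0xFF tail is not proof of corruption. The function name is illustrative:

```python
ERASED = 0xFF  # value an erased NOR flash byte reads back as


def compare_dump(image: bytes, dump: bytes, max_report: int = 8):
    """Compare a flash dump against the reference bootloader image.

    Returns (mismatches, tail_erased):
      mismatches  - the first few byte offsets where the dump differs from
                    the image (offsets missing from a too-short dump count
                    as mismatches)
      tail_erased - True if every dump byte past the end of the image is 0xFF
    """
    mismatches = []
    for offset in range(len(image)):
        if offset >= len(dump) or dump[offset] != image[offset]:
            mismatches.append(offset)
            if len(mismatches) >= max_report:
                break
    tail_erased = all(b == ERASED for b in dump[len(image):])
    return mismatches, tail_erased


# Example use on the host PC (file names are placeholders):
#   from pathlib import Path
#   diffs, tail_ok = compare_dump(Path("eboot.bin").read_bytes(),
#                                 Path("img.dump").read_bytes())
```

Taking the dump more than once from the same board and comparing the dumps with each other may also be worthwhile, since a marginal read would show up as differences between reads rather than against the image.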