Jump to content

Pali

Members
  • Posts

    144
  • Joined

Everything posted by Pali

  1. cc: @ManoftheSea @y52 you can be interested in this.
  2. Just in case if somebody want to test or debug this issue please let me know...
  3. Ideally check if this issue with that disk happens also on some x86 computer. In any case, you can report this issue to the linux-nvme at lists.infradead.org address and ask for help there.
  4. Upstream u-boot (in its next branch) has now lot of 32-bit armada / mvebu patches related to SD/eMMC, SATA and UART booting. Also on mailing list are waiting some Clearfog eMMC patches. Likely you should take Helios patches and put then on top of the u-boot next.
  5. There were problems with PCIe bus on A3720 and with lot of patches to pci-aardvark.c PCIe controller driver it could be more stable. About year or two ago I tested in Espressobin 3 or 4 randomly chosen PCIe-based NVMe M.2 disks which I bought in local shop and all worked fine. From your log it looks like an NVMe error, not the PCIe. So I guess it would not be related to PCIe bus (and its wiring or adapter). So try to choose different NVMe disk. Maybe you hit some NVMe related issue...
  6. It is fine. I sent you private forum message with updated commit description (with tags) and comments.
  7. I see that I used _wrong_ compiler - 32-bit ARM and so all my pastes are in 32-bit ARM assembler. Not in 64-bit ARM which is Espressobin. But I think you have figured it out.
  8. I have two reasons for extra terminating nul byte: 1. If it had set then this bug would have appeared because it would have immediatelly affected all espressobin boards just by env default -a comment. 2. In case in future env allocation changes from static to dynamic (e.g. malloc-ed) then code without extra terminating nul byte cause problems because dynamic malloced memory does not have to be zeroed. Espressobin is currently the only one board which set/changes default env variables, so any such such would affect only espressobin board. And U-Boot developers do not test every core U-Boot change on Espressobin. So adding extra terminating nul byte is only for future changes in (core) u-boot code to catch possible bugs or regressions. In my opinion it is better to not expect that undefined/uninitialized data would be always zero bytes. And if other code _requires_ that after the last variable must be extra nul byte then it is better to always set it. But you are right that currently it is not needed at all and code would be correct even without it (until somebody changes it). Another option to prevent future regression in this area is to write (unit) tests which can be run during u-boot CI testing without need to have any ARM/espressobin HW. Personally I would really love to see this test but as I looked at the code it is written in such way that unit test is currently impossible without refactoring whole code which sets those default env variables. And I think it does not make sense to do this big refactoring as there was proposed better API for setting default env variables at runtime (I pointed it in u-boot email discussion as link to other patch series). In my opinion it would take same time to rewrite to the that new better API and refactoring code with ability to write unit tests. That is why I'm not going to that refactor -- but anybody else can do it and I think nobody would be against it. How to compile test.c source code to assembly file test.s: Just call gcc with -S argument, it will create file with extension small .s instead of binary a.out. You can also specify -o. So for example: arm-linux-gnueabi-gcc -Os -S test.c -o test.s And which version to choose? If we do not have an agreement what is better then the best is to ask developer responsible for u-boot platform code (which espressobin is part of), who would take and merge that patch and who is maintainer, take care to have code in a good shape and can make decision. Maintainer for mvebu is Stefan. I will send an email and ask for advice.
  9. So... if you are not against, I would propose to combine your last change with my change and send it as patch v2 with updated description.
  10. Should not happen, but ok, some check could be there. Same as for adding trailing '\0' after filling all env variables.
  11. I checked my version: char *test1(char *ptr) { while (*ptr != '\0' || *(ptr+1) != '\0') ptr++; ptr++; return ptr; } And your version: #define strlen(p) __builtin_strlen(p) char *test2(char *ptr) { int i; if (*ptr != '\0') { while (i = strlen(ptr)) { ptr += i + 1; } } return ptr; } And for my version gcc generates more efficient code: test1: .L2: mov r3, r0 ldrb r2, [r3] @ zero_extendqisi2 add r0, r0, #1 cmp r2, #0 bne .L2 ldrb r2, [r0] @ zero_extendqisi2 cmp r2, #0 bne .L2 bx lr For your version gcc generates: test2: push {r4, lr} ldrb r3, [r0] @ zero_extendqisi2 mov r4, r0 cmp r3, #0 bne .L6 .L7: mov r0, r4 pop {r4, pc} .L8: add r0, r0, #1 add r4, r4, r0 .L6: mov r0, r4 bl strlen(PLT) cmp r0, #0 bne .L8 b .L7 I compiled it with arm-linux-gnueabi-gcc -Os -S test.c I specially put there __builtin_strlen so gcc would have a chance to optimize that loop. But neither with -O2 nor with -O3 it was able to eliminate it. On the other hand for my code, gcc was able to figure out that every byte is reading two times and it is not neceserry and it can be optimized to read every byte just once. So it even generate shorter code when you compare number of instructions (important for u-boot which has space limitation). So it is interesting to know this kind of information. And I'm surprised that it also eliminated the final ptr++ statement.
  12. If you need help then please let me know. And ideally CC to my (kernel) address.
  13. No, because printenv sort variables by name. But in default_environment[] they are not sorted. For example distro_bootcmd= is after the scriptaddr= and preboot= is before arch=
  14. CONFIG_EXTRA_ENV_SETTINGS is put at the end of the default_environment[] array. So if you start overwritting default_environment[] in the middle it could result that you overwrite also first byte of the "scriptaddr" string as it is at the beginning of the CONFIG_EXTRA_ENV_SETTINGS, which means it is near the end of the default_environment[]. And as I described in some previous post, due to wrong logic of finding free space in default_environment[], it can really happen.
  15. For every eth{,1,2,3}addr (totally 4) is reserved 30 characters. Hence totally there is 4*30+60 = 180 bytes of free space in buffer. sprintf is more complicated function, strcpy is simple and compiler can optimize it (= eliminate its usage) during LTO. Hm.. looking at the code and I'm not sure right now... So pointer must point to the location where new variable starts, so one byte after the nul byte separator. Static r/w global array. This can prove my above hypothesis how criptaddr was generated. With writing this new nul byte there should not be any criptaddr anymore and also no other variable.
  16. Something like this patch: diff --git a/board/Marvell/mvebu_armada-37xx/board.c b/board/Marvell/mvebu_armada-37xx/board.c index c6ecc323bb99..0c6ac126d0e7 100644 --- a/board/Marvell/mvebu_armada-37xx/board.c +++ b/board/Marvell/mvebu_armada-37xx/board.c @@ -145,6 +145,8 @@ int board_late_init(void) strcpy(ptr, "fdtfile=marvell/armada-3720-espressobin-emmc.dtb"); else strcpy(ptr, "fdtfile=marvell/armada-3720-espressobin.dtb"); + ptr += strlen(ptr) + 1; + *ptr = '\0'; return 0; }
  17. You are right that my patch checks every byte twice, but it should not be an issue as it is still linear. My patch is simple and I prefer simple solution. Your solution is a bit complicated at the cost to read every byte just once. But because it is complicated there is issue with it -- it took me some time to find it -- you are missing ptr--; after the outer while loop otherwise you would have two nul bytes. Anyway, I would really like to hear from somebody if my patch above fixes this criptaddr issue. Or we are still dealing with another hidden problem. I think that I should send this patch to upstream because currenly, if I understand (previous) code correctly, it drops all variables after the first valid variable. And if I'm looking at it, at the end of function is missing code which will add two nul bytes into *ptr as final terminal of variables. Hm... and maybe this could be a reason for criptaddr? First issue that code dropped all variables after the first one and then second issue that final two nul byte terminal is missing which leaded to read garbage of previous bytes in storage (which probably started with criptaddr and continued with valid env structure of remaining variables).
  18. In file include/configs/mvebu_armada-37xx.h is reserved 60 bytes for fdtfile= variable. The longest value has v7-emmc variant with 51 bytes. This should be really enough. Anyway, now I traced the whole init sequence again and I think I found issue in part which tries to find free buffer in environment array. Would you be able to test this change if it helps? Variables are separated by one nul byte and the last variable is termined by two nul bytes. diff --git a/board/Marvell/mvebu_armada-37xx/board.c b/board/Marvell/mvebu_armada-37xx/board.c index c6ecc323bb99..73de2a3f002e 100644 --- a/board/Marvell/mvebu_armada-37xx/board.c +++ b/board/Marvell/mvebu_armada-37xx/board.c @@ -100,8 +100,8 @@ int board_late_init(void) return 0; /* Find free buffer in default_environment[] for new variables */ - while (*ptr != '\0' && *(ptr+1) != '\0') ptr++; - ptr += 2; + while (*ptr != '\0' || *(ptr+1) != '\0') ptr++; + ptr++; /* * Ensure that 'env default -a' does not erase permanent MAC addresses
  19. I have no reason to not trust you. But also I have absolutely no idea how it can happen. And because I was not able to trigger this issue on two different Espressobin boards, I cannot debug it.
  20. You need to debug it why it cannot load required file. Probably file does not exist or there is an error on that load line in that script.
  21. Seems that you have broken /boot/boot.scr script on mmc 0:1 device (SD card?).
  22. I do not know that excel file. Anyway all those questions are really HW specific and you should discuss it with Marvell FAE or other Marvell contact / supplier. Basically Marvell FAE is who has access to the confidential registers and who can share it to Marvell customers. DDR training is done in software - firmware which you transfer over UART (in your case). I have not seen any (useful) documantation about DDR training for A3720. Source code is open source (but it is for me just a magic, I do not understand it) and available at: https://github.com/MarvellEmbeddedProcessors/A3700-utils-marvell/tree/master/wtmi/sys_init/ddr Before DDR training (and even before transfering/loading CM3 firmware) is called TIM bytecode, part of which is static DDR configuration. This bytecode is simple write values into registers and configured DDR for static low speed mode. This code is generated during building from DDR_TOPOLOGY. So for debugging DDR, it is needed to check if DDR is working fine after executing TIM bytecode (just is very slow) and it is DDR training who break RAM access. Or if even static low speed configuration of DDR via TIM bytecode makes RAM access not possible.
  23. DDR4 is really problematic on A3720. More people here on the forum have issues with Espressobin v7 which has 1.2 GHz CPU and DDR4 memory. There are more undocumented HW issues with DDR4 which are not mentioned in errata documents. You can find something mentioned in commit messages of public marvell git repositories. Last time when I worked with DDR4 on A3720 I saw that DDR training was extremly slow. Anyway when building A3720 firmware code, you should use upstream instruction and not some from espressobin website which use ten years old code. See: https://trustedfirmware-a.readthedocs.io/en/latest/plat/marvell/armada/build.html But the best what you can do with DDR4 is to contact your Marvell FAE. Or use DDR3, I had less problems with DDR3 on A3720.
×
×
  • Create New...

Important Information

Terms of Use - Privacy Policy - Guidelines