
tkaiser

Members
  • Posts

    5462
  • Joined

Everything posted by tkaiser

  1. I don't know how to interpret that correctly since your ping test is some sort of flooding, isn't it?

     Regarding performance: as already mentioned yesterday evening, I also ran the tests on my Banana Pi. The best settings for TX/RX delay were 3/0 (the defaults):

     TX 3, RX 0:
       TX: 700 Mbits/sec, 5000 packets transmitted, 3593 received, 28% packet loss, time 27440ms
           rtt min/avg/max/mdev = 0.364/0.534/0.834/0.051 ms
       RX: 800 Mbits/sec, 5000 packets transmitted, 5000 packets received, 0.0% packet loss
           round-trip min/avg/max/stddev = 0.363/0.568/0.753/0.045 ms

     TX 3, RX 2:
       TX: 650 Mbits/sec, 5000 packets transmitted, 3583 received, 28% packet loss, time 27449ms
           rtt min/avg/max/mdev = 0.369/0.520/0.972/0.062 ms
       RX: 700 Mbits/sec, 5000 packets transmitted, 4999 packets received, 0.0% packet loss
           round-trip min/avg/max/stddev = 0.362/0.542/0.687/0.067 ms

     TX 1, RX 0:
       TX: 680 Mbits/sec, 5000 packets transmitted, 3596 received, 28% packet loss, time 27386ms
           rtt min/avg/max/mdev = 0.374/0.530/0.892/0.058 ms
       RX: 830 Mbits/sec, 5000 packets transmitted, 5000 packets received, 0.0% packet loss
           round-trip min/avg/max/stddev = 0.362/0.557/1.713/0.057 ms

     TX 1, RX 4:
       TX: 650 Mbits/sec, 5000 packets transmitted, 0 received, 100% packet loss, time 50824ms
       RX: 30.8 Kbits/sec, 5000 packets transmitted, 0 packets received, 100.0% packet loss

     TX 2, RX 2:
       TX: 640 Mbits/sec, 5000 packets transmitted, 3592 received, 28% packet loss, time 27469ms
           rtt min/avg/max/mdev = 0.366/0.536/0.878/0.059 ms
       RX: 800 Mbits/sec, 5000 packets transmitted, 4999 packets received, 0.0% packet loss
           round-trip min/avg/max/stddev = 0.364/0.569/0.710/0.037 ms

     The RX ping reports are from OS X 10.9.5 on a MacBook Pro. In TX direction there are more lost packets but also more throughput compared to the Lime2.
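The packet-loss percentages in the ping summaries above follow from the transmitted and received counts. A minimal sketch for recomputing them; `loss_pct` is a hypothetical helper name, and the integer truncation is an assumption about how Linux ping rounds its summary line:

```shell
#!/bin/bash
# loss_pct TRANSMITTED RECEIVED
# Prints the packet loss in percent, truncated to an integer
# (assumption: this matches the rounding of ping's summary line).
loss_pct() {
    local sent=$1 received=$2
    echo $(( (sent - received) * 100 / sent ))
}

# Counts from the "TX 3, RX 0" run quoted in the post
loss_pct 5000 3593
```

With the counts of the TX 3/RX 0 run this prints 28, which agrees with the reported 28% packet loss.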
  2. I also ran the tests in the meantime. Just 2 combinations work somewhat reliably with my Lime2:

     0/0: 600 Mbits/sec TX, 370 Mbits/sec RX, packet losses: 41% 27.0% (TX/RX)
     3/0: 580 Mbits/sec TX, 355 Mbits/sec RX, packet losses: 39% 25.5% (TX/RX)

     The results vary a lot, which points to some sort of mismatch. Right now I'm testing a Banana Pi (the 'old' model, not the Pro/M1+), and with its default delay setting (3/0, performance governor at 960 MHz, DRAM clocked at 480 MHz) it looks like this:

     TX: 700 Mbits/sec, 5000 packets transmitted, 3593 received, 28% packet loss, time 27440ms
         rtt min/avg/max/mdev = 0.364/0.534/0.834/0.051 ms
     RX: 800 Mbits/sec, 5000 packets transmitted, 5000 packets received, 0.0% packet loss
         round-trip min/avg/max/stddev = 0.363/0.568/0.753/0.045 ms
  3. Interesting. But TX4/RX0 seems to be even better? I will test with my Lime2 maybe this evening or tomorrow. I just ran another automated test on the Lamobo R1 (three 10 sec iperf runs per setting, using the performance CPU governor): http://pastebin.com/VzWxpepX

     The results on the Lamobo R1 seem to be:

       • TX delays 0, 1 and 2 don't work at all.
       • TX throughput maxes out at 370 Mbits/sec (100% CPU utilisation @ 960 MHz)
       • RX throughput maxes out at 460 Mbits/sec (100% CPU utilisation @ 960 MHz)
       • Manipulating RX delay does matter regarding TX throughput
       • Clock speed as well as cpufreq settings (especially governor + scaling_min_freq) directly influence performance/benchmarks

     Really looking forward to testing with the Lime2 or Banana Pi. Maybe I'll increase the possible clock speeds from 960 MHz to 1008 or even 1200 MHz in arch/arm/boot/dts/sun7i-a20.dtsi to get closer to the limits if the results of the GMAC+RTL8211 combination look promising.
  4. I'm done testing with the Lamobo R1 since it's useless. I tried a few different RX delays (all with TX_DELAY 4):

     RX 0: 458 Mbits/sec, 25% packet loss
     RX 1: 458 Mbits/sec, 25% packet loss
     RX 2: 456 Mbits/sec, 25% packet loss
     RX 4: 457 Mbits/sec, 25% packet loss
     RX 6: 455 Mbits/sec, 25% packet loss

     The results are identical, and the obvious reason is that iperf is CPU bound: one core always spent 100% utilisation on the iperf server thread in question. So while different RX delay settings still might make a difference, they won't show any practical difference since the CPU is the bottleneck (or the stuff I patched does not work). Time to test again with a Banana Pi or Lime2, which use a somewhat different driver framework.

     I always experienced 25% packet loss when pinging the directly connected MacBook using

     ping -s 9000 -i 0.0001 -c 5000 macbook.local

     BTW: This is the test script I used. Set up the prerequisites as outlined in the comment and then call it from /etc/rc.local. It exchanges u-boot 64 times, reboots after each install and tests with the newly applied settings:

     root@lamobo:~# cat /usr/local/bin/gmac-delay-test.sh
     #!/bin/bash
     #
     # gmac-delay-test.sh
     #
     # to revert:
     # rm /root/stop ; echo -n 0 >/root/rx ; echo -n 0 >/root/tx ; dpkg -i /root/uboot/linux-u-boot-lamobo-r1_3.1_0_0_armhf.deb ; shutdown -r now

     # /root/stop marks a finished test series
     if [ -f /root/stop ]; then
         exit 0
     fi

     Main() {
         # the values the currently installed u-boot was built with
         read rx </root/rx
         read tx </root/tx
         echo "testing tx: ${tx}, rx: ${rx}"
         if [ $rx -eq 8 -a $tx -eq 0 ]; then
             # all 64 combinations done
             touch /root/stop
             exit 0
         else
             DoTest
             # advance tx from 0 to 7, then move on to the next rx value
             if [ ${tx} -eq 7 ]; then
                 tx=0
                 rx=$(( ${rx} + 1 ))
             else
                 tx=$(( ${tx} + 1 ))
             fi
             echo -n ${tx} >/root/tx
             echo -n ${rx} >/root/rx
             dpkg -i /root/uboot/linux-u-boot-lamobo-r1_3.1_${tx}_${rx}_armhf.deb
             sync
             shutdown -r now
         fi
     } # Main

     DoTest() {
         Testlog=/root/u-boot-test.txt
         # only run iperf when the peer answers a single ping
         ping -c 1 -W 1 MacBook.local >/dev/null 2>/dev/null
         case $? in
             0)
                 MbitTX=$(iperf -c 192.168.83.75 -t 30 | awk -F" " '/Mbits/ {print $7}')
                 echo -e "$(date)\t$tx\t$rx\t${MbitTX}" >>"${Testlog}"
                 ;;
             *)
                 echo -e "$(date)\t$tx\t$rx\t-" >>"${Testlog}"
                 ;;
         esac
     } # DoTest

     Main
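Before letting a board reinstall its bootloader 64 times, the counter logic of the script can be checked without touching anything. The following is a dry-run sketch, not part of the original script: `next_pair` and `list_pairs` are hypothetical names, and nothing is installed or rebooted.

```shell
#!/bin/bash
# Dry run of the tx/rx counter from gmac-delay-test.sh.
# Only prints the pairs; no dpkg call, no reboot.

# next_pair TX RX: print "tx rx" for the run that follows the given one.
next_pair() {
    local tx=$1 rx=$2
    if [ "$tx" -eq 7 ]; then
        tx=0
        rx=$(( rx + 1 ))
    else
        tx=$(( tx + 1 ))
    fi
    echo "$tx $rx"
}

# list_pairs: walk from 0/0 until rx reaches 8, the script's stop condition.
list_pairs() {
    local tx=0 rx=0
    while [ "$rx" -lt 8 ]; do
        echo "tx=${tx} rx=${rx}"
        read -r tx rx <<<"$(next_pair "$tx" "$rx")"
    done
}

list_pairs | wc -l
```

The line count should be 64, running from tx=0 rx=0 to tx=7 rx=7, so every delay pair is visited exactly once before the stop file would be written.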
  5. Next observation: Without adjusting the cpufreq stuff, measuring anything related to performance is just crap. The default settings with 4.x scale the CPU frequency between 144 and 960 MHz using the ondemand governor. So if you start a test when the CPU has been idle for a while you will get completely different results, since the CPU will stay in its lowest speed state for a few seconds before clocking up:

     root@lamobo:/sys/devices/system/cpu/cpu0/cpufreq# cat stats/time_in_state
     144000 22814
     312000 1807
     528000 612
     720000 784
     864000 302
     912000 76
     960000 21377

     Without something like

     echo performance >/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

     applied prior to testing you can expect random results. So I now put that into /etc/rc.local for the tests, since they showed exactly what's to be expected: the first iperf run after the board has been idle is 100 Mbits/sec slower than the subsequent runs, when the CPU is already clocked at 960 MHz.
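To see at a glance how much time the CPU spent at each clock, the time_in_state table can be turned into percentages. A small sketch under stated assumptions: `state_share` and `set_governor` are hypothetical helper names, and `set_governor` writes the governor for every CPU it finds rather than only cpu0 as in the post.

```shell
#!/bin/bash
# state_share: read "frequency_kHz ticks" pairs (the time_in_state format)
# from stdin and print each frequency in MHz with its share of total time.
state_share() {
    awk '{ freq[NR] = $1; t[NR] = $2; total += $2 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%d MHz: %.1f%%\n", freq[i] / 1000, total ? 100 * t[i] / total : 0
         }'
}

# set_governor NAME: write the governor for every CPU (needs root).
# Files that are missing or not writable are skipped silently.
set_governor() {
    local f
    for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -w "$f" ] && echo "$1" >"$f"
    done
    return 0
}

# Usage on the board (not executed here):
#   set_governor performance
#   state_share </sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
```

Fed with the table from the post, the lowest and highest states dominate, which fits the observation that an idle board starts a benchmark at 144 MHz.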
  6. Ok, I tried all 64 combinations using the 64 u-boot Debian packages created with different TX/RX delay settings. Results of a single iperf TX run to a directly connected MacBook Pro are here: http://pastebin.com/xqJN5Kpp (Igor's default settings with eth0 IRQs dedicated to cpu1 -- no further tuning applied).

     What seems to be obvious:

       • When TX delay is set to 0, 1 or 2, no network connection can be established at all.
       • The results measured might depend on other stuff like load and do not show any real difference between different TX or RX delay settings (one probable exception: RX delay set to 5).
       • When running the tests, iperf utilized one single CPU core to 100%, so there is a chance that different TX/RX delay settings make a real difference but you won't measure this due to driver problems:

     root@lamobo:~# time iperf -c 192.168.83.75 -t 30
     ------------------------------------------------------------
     Client connecting to 192.168.83.75, TCP port 5001
     TCP window size: 43.8 KByte (default)
     ------------------------------------------------------------
     [  3] local 192.168.83.82 port 45384 connected with 192.168.83.75 port 5001
     [ ID] Interval       Transfer     Bandwidth
     [  3]  0.0-30.0 sec  1.18 GBytes   336 Mbits/sec

     real 0m30.067s
     user 0m0.150s
     sys 0m28.850s

     (nearly all time spent in sys). The overall CPU utilisation when running the test was around 130%-150%, so there's not that much room for improvement. Next I will try an Olimex Lime2 and in the meantime do some iperf tests in the other direction with TX delay 4 and different RX delays.

     BTW: clock speeds matter. I used the default settings (the maximum operating point in 4.0.4 is 960 MHz by default) and 'verified' using

     sysbench --test=cpu --cpu-max-prime=5000 run --num-threads=2

     Since execution time was around 54.5 secs, the kernel might have clocked up to 960 MHz after a short period of time (on kernel 3.4 with 912 MHz and CPU governor performance the very same test finishes in 55-56 secs).
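Two numbers from that run are worth extracting mechanically: the bandwidth and the share of wall time spent in the kernel. A minimal sketch, assuming iperf version 2 output as quoted in the post and a single summary line (no `-i` interval reports); `mbits_from_iperf` and `sys_share` are hypothetical helper names.

```shell
#!/bin/bash
# mbits_from_iperf: read iperf client output from stdin and print the
# number that directly precedes "Mbits/sec" on the first matching line.
# Looking for the unit instead of a fixed column keeps this independent
# of how the "[  3]" prefix is split into fields.
mbits_from_iperf() {
    awk '{ for (i = 2; i <= NF; i++)
               if ($i == "Mbits/sec") { print $(i - 1); exit } }'
}

# sys_share SYS_SECONDS REAL_SECONDS: percentage of wall time spent in sys.
sys_share() {
    awk -v s="$1" -v r="$2" 'BEGIN { printf "%.0f\n", 100 * s / r }'
}

echo "[  3]  0.0-30.0 sec  1.18 GBytes   336 Mbits/sec" | mbits_from_iperf
sys_share 28.850 30.067
```

For the run quoted in the post this gives 336 and 96, i.e. about 96% of the 30 seconds went into kernel time on the iperf client.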
  7. Nearly all the stuff around the horrible power situation of the Lamobo R1 can be found in this otherwise pretty useless and crappy forum: http://bananapi.com/index.php/forum/general/391-why-the-sata-disk-doesnt-work-on-bpi-r1?start=12

     I would look for undervoltage issues (very likely). @Patcher: If you power the board using the LiPo socket, does a connected HDD/SSD still work? And how did you solve the mechanical challenge of inserting a plug into the LiPo connector while also using a disk (bending the connector?).

     JFR: I used the board with an older image (3.4.106 or even older) and both a connected 2.5" HDD and an HDMI display. Since the AXP209 also has to power the disk on the Lamobo R1, you can simply read out the power requirements using sysfs. And due to the crappy Micro USB connector it was not possible to boot the board when a power hungry USB keyboard and mouse were also connected (peak consumption when trying to spin up the disk exceeded the overall power maximum). Without the USB peripherals it worked even with unpatched u-boot (rootfs on SATA). And when using the stress utility to produce some load, the consumption of the board sometimes reached 9 W (the maximum, since Micro USB allows 5V/1.8A max.).

     And never ever use the original acrylic enclosure, especially lying flat. Both the disk and the AXP209 power management unit might overheat easily due to bad placement and no possible airflow.
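The Micro USB limit of 5V/1.8A quoted above works out to 9 W. A minimal sketch of that arithmetic, assuming the values come in microvolts and microamps as files of the Linux power_supply class usually report them; `power_mw` is a hypothetical helper, and the exact sysfs paths of the AXP209 driver differ between kernels, so none are hard-coded here.

```shell
#!/bin/bash
# power_mw MICROVOLTS MICROAMPS: print the power in milliwatts.
# uV * uA gives picowatts, hence the division by 10^9.
power_mw() {
    echo $(( $1 * $2 / 1000000000 ))
}

# 5 V at 1.8 A, the Micro USB maximum named in the post
power_mw 5000000 1800000
```

This prints 9000 (mW). On a board the two arguments would be read from the voltage and current files the PMU driver exposes.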
  8. Sorry, no time to test (been busy in the kitchen). But if you want to give different TX/RX delay parameters a try, you could play with the 64 different u-boot packages my script created: http://kaiser-edv.de/tmp/lime2-u-boot.tgz

     It contains 64 debs with the following name scheme: linux-u-boot-lime2_1.9_$TX_$RX_armhf.deb. To use an unmodified RX delay and e.g. 4 as TX delay, simply install

     dpkg -i linux-u-boot-lime2_1.9_4_0_armhf.deb

     (I completely rely on Igor's compile_uboot function and no testing has been done regarding RX delay! Be warned: you might end up with a corrupted bootloader!)

     JFR: Another u-boot patch was necessary:

     diff --git a/board/sunxi/Kconfig b/board/sunxi/Kconfig
     index 2fcab60..4623de6 100644
     --- a/board/sunxi/Kconfig
     +++ b/board/sunxi/Kconfig
     @@ -451,4 +451,10 @@ config GMAC_TX_DELAY
              ---help---
                Set the GMAC Transmit Clock Delay Chain value.
      
     +config GMAC_RX_DELAY
     +        int "GMAC Receive Clock Delay Chain"
     +        default 0
     +        ---help---
     +          Set the GMAC Receive Clock Delay Chain value.
     +
      endif

     Both u-boot patches combined for GMAC RX DELAY: http://pastebin.com/adiWjzya

     Regarding the problems you experience: Do the Lime2 boards showing problems have a different board revision? I cannot remember having packet losses with my Lime2. Unfortunately I cannot test immediately since I have to give priority to the Lamobo-R1, which is dedicated to a customer and shows really bad performance right now. Maybe on Thursday I'll have a look.
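One pitfall when scripting against that name scheme: written literally as `$TX_$RX` in a shell, the first variable would be read as `TX_`, so braces are needed. A small sketch that builds the file name and rejects values outside the 3-bit range; `uboot_deb` is a hypothetical helper, not something shipped in the archive.

```shell
#!/bin/bash
# uboot_deb TX RX: print the package file name for one delay pair.
# Returns 1 if either value is not a single digit between 0 and 7.
uboot_deb() {
    local tx=$1 rx=$2 v
    for v in "$tx" "$rx"; do
        case $v in
            [0-7]) ;;
            *) echo "delay must be 0..7, got '${v}'" >&2; return 1 ;;
        esac
    done
    # ${tx}_ with braces: a bare $tx_ would expand the variable "tx_"
    echo "linux-u-boot-lime2_1.9_${tx}_${rx}_armhf.deb"
}

uboot_deb 4 0
```

This prints linux-u-boot-lime2_1.9_4_0_armhf.deb, the package from the dpkg example in the post.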
  9. Well, I doubt that TX DELAY is related to the problems you experience (these delay settings should only matter when you compare different boards or board revisions). But anyway. Now I have a patchset to create 64 u-boot variants with all possible TX/RX delay variations and will give them a try later (with the Lamobo-R1 first; I will try the very same stuff with my Lime2 if I succeed with testing and this brute-force approach shows good results).

     I just added two more lines to gmac.c and hope that they will work:

     diff --git a/board/sunxi/gmac.c b/board/sunxi/gmac.c
     index 8849132..1bce3ce 100644
     --- a/board/sunxi/gmac.c
     +++ b/board/sunxi/gmac.c
     @@ -26,6 +26,8 @@ int sunxi_gmac_initialize(bd_t *bis)
                      CCM_GMAC_CTRL_GPIT_RGMII);
              setbits_le32(&ccm->gmac_clk_cfg,
                           CCM_GMAC_CTRL_TX_CLK_DELAY(CONFIG_GMAC_TX_DELAY));
     +        setbits_le32(&ccm->gmac_clk_cfg,
     +                     CCM_GMAC_CTRL_RX_CLK_DELAY(CONFIG_GMAC_RX_DELAY));
      #else
              setbits_le32(&ccm->gmac_clk_cfg, CCM_GMAC_CTRL_TX_CLK_SRC_MII |
                      CCM_GMAC_CTRL_GPIT_MII);

     The diff for the modifications of Igor's scripts is here: http://pastebin.com/ZMF89Y57 and you would still need to define 'GMAC_DELAY_TEST="yes"' in compile.sh
  10. BTW: Since I want to play around with both TX and RX delay parameters, I started to modify Igor's build scripts for this purpose. Since the build system creates a .deb package for u-boot that overwrites the SPL/u-boot on the SD card, my idea was to create all 64 variants of possible TX/RX delay values and then let a script automatically install one u-boot variant, reboot, test network performance using ping/iperf, and try the next u-boot .deb until all combinations are tested.

      Currently I have the first step finished: automatically creating 8 different u-boot .deb packages by adjusting Igor's script.

      1) Add GMAC_DELAY_TEST="yes" to compile.sh

      2) Modify one line in lib/common.sh. Exchange

      CHOOSEN_UBOOT="linux-u-boot-$VER-"$BOARD"_"$REVISION"_armhf"

      with

      CHOOSEN_UBOOT="linux-u-boot-${VER}-${BOARD}_${REVISION}_${TX}_${RX}_armhf"

      3) Add the following lines in lib/main.sh after "grab_kernel_version"

      # check whether we should just build a bunch of u-boot versions to
      # brute-force all available GMAC TX/RX delay variations.
      if [ "X${GMAC_DELAY_TEST}" == "Xyes" ]; then
          for TX in 0 1 2 3 4 5 6 7 ; do
              for RX in 0 ; do
                  CHOOSEN_UBOOT="linux-u-boot-${VER}-${BOARD}_${REVISION}_${TX}_${RX}_armhf"
                  UBOOT_PCK="linux-u-boot-$VER-"$BOARD
                  # search defconfig file for $BOARD
                  Defconfig="$(grep -i -- "-${BOARD}.dtb" ${DEST}/u-boot/configs/*_defconfig | cut -d: -f1)"
                  if [ ! -f "${Defconfig}" ]; then
                      case ${BOARD} in
                          aw-som-a20)
                              Defconfig="${DEST}/u-boot/configs/Awsom_defconfig"
                              ;;
                          cubieboard2)
                              Defconfig="${DEST}/u-boot/configs/Cubieboard2_defconfig"
                              ;;
                          lime)
                              Defconfig="${DEST}/u-boot/configs/A20-OLinuXino-Lime_defconfig"
                              ;;
                          micro)
                              Defconfig="${DEST}/u-boot/configs/A20-OLinuXino_MICRO_defconfig"
                              ;;
                          bananapipro)
                              Defconfig="${DEST}/u-boot/configs/Bananapro_defconfig"
                              ;;
                          udoo*)
                              Defconfig="${DEST}/u-boot/configs/udoo_quad_defconfig"
                              ;;
                      esac
                  fi
                  # patch defconfig with appropriate tx/rx values
                  # ("|| exit 1" sits outside the substitution so a mktemp failure really aborts)
                  MyTmpFile="$(mktemp /tmp/gmac_test.XXXXXX)" || exit 1
                  trap "cd /tmp; rm -f \"${MyTmpFile}\"; exit 0" 0 1 2 3 15
                  grep -v CONFIG_GMAC_TX_DELAY "${Defconfig}" | grep -v CONFIG_GMAC_RX_DELAY >"${MyTmpFile}"
                  cat "${MyTmpFile}" >"${Defconfig}"
                  echo -e "CONFIG_GMAC_TX_DELAY=${TX}\nCONFIG_GMAC_RX_DELAY=${RX}" >>"${Defconfig}"
                  compile_uboot
              done
          done
          exit 0
      fi

      Step 2) and 3) as patch: http://pastebin.com/RC5WptYB

      After execution of compile.sh you will end up with 8 different u-boot .debs in output/output/u-boot containing different GMAC TX DELAY settings that can be installed using "dpkg -i". Execution will stop afterwards.

      The next step is to patch gmac.c so that different definitions of "CONFIG_GMAC_RX_DELAY" might lead to different results.
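The defconfig patching step in the middle of that loop can be tried in isolation on a throwaway file. The following is a sketch of the same idea (drop existing delay lines, append the new ones) wrapped in a hypothetical function `set_gmac_delays`; it is not part of Igor's build scripts, and the demo file content is made up for illustration.

```shell
#!/bin/bash
# set_gmac_delays DEFCONFIG TX RX
# Remove any existing GMAC delay lines from DEFCONFIG, then append the
# requested values. The file is rewritten in place via a temporary copy.
set_gmac_delays() {
    local defconfig=$1 tx=$2 rx=$3 tmp
    tmp=$(mktemp /tmp/gmac_test.XXXXXX) || return 1
    grep -v -e '^CONFIG_GMAC_TX_DELAY=' -e '^CONFIG_GMAC_RX_DELAY=' "$defconfig" >"$tmp"
    printf 'CONFIG_GMAC_TX_DELAY=%s\nCONFIG_GMAC_RX_DELAY=%s\n' "$tx" "$rx" >>"$tmp"
    cat "$tmp" >"$defconfig"
    rm -f "$tmp"
}

# Demo on a made-up two-line defconfig
demo=$(mktemp /tmp/demo_defconfig.XXXXXX)
printf 'CONFIG_ARM=y\nCONFIG_GMAC_TX_DELAY=3\n' >"$demo"
set_gmac_delays "$demo" 4 2
cat "$demo"
rm -f "$demo"
```

Running the function twice on the same file leaves exactly one TX and one RX line, which is what makes it safe to call once per loop iteration.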
  11. You can set the register values from within u-boot directly if you've a serial connection and stop autoboot: https://groups.google.com/d/msg/linux-sunxi/Y_Zh5juEJG4/egjJojeTgS8J (but I've no clue why he just tried 0x006, 0x406, 0x806 and 0xc06 and how this should 'translate' to 0, 1, 2 and 3? If we're speaking about 3 bits then 8 -- 0 to 7 -- different values should be possible?)
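One possible reading of those four values, offered as an assumption rather than an answer: if the register layout matches the macros in U-Boot's sunxi clock header as far as I recall them (TX clock source in bits 0-1, RGMII select in bit 2, RX delay in bits 5-7, TX delay in bits 10-12), then the constant low part 0x6 selects RGMII and the upper digit steps the TX delay. The sketch below decodes a value under that assumed layout; `decode_gmac` is a hypothetical helper.

```shell
#!/bin/bash
# decode_gmac REG: split a GMAC clock register value into its fields.
# ASSUMED bit layout (not confirmed by the post):
#   bits 0-2   clock source + interface type (0x6 = RGMII setup)
#   bits 5-7   RX delay
#   bits 10-12 TX delay
decode_gmac() {
    local reg=$(( $1 ))
    echo "tx_delay=$(( (reg >> 10) & 7 )) rx_delay=$(( (reg >> 5) & 7 )) low_bits=$(( reg & 7 ))"
}

for reg in 0x006 0x406 0x806 0xc06; do
    decode_gmac "$reg"
done
```

Under that layout the four values decode to TX delays 0, 1, 2 and 3 with RX delay 0, and a 3-bit field would indeed allow 0 to 7 (up to 0x1c06). Why only the first four were tried is not something the decoding explains.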