katonar

  1. I am trying to build a compute cluster from Odroid N2 boards, but cores 5 and 6 do not seem to speed up the application at all, and the speedup does not scale linearly with the number of cores. Here are some measurements from my MPI-based application; each value is the total calculation time for one function.

     1 core: 0.838052 sec
     2 cores: 0.438483 sec
     3 cores: 0.405501 sec
     4 cores: 0.416391 sec
     5 cores: 0.514472 sec
     6 cores: 0.435128 sec
     12 cores (4 cores from each of 3 N2 boards): 0.06867 sec
     18 cores (6 cores from each of 3 N2 boards): 0.152759 sec

     I am using the Armbian Focal image with the mainline 5.6.y kernel. Does it need any special configuration to use the dual-core Cortex-A53 CPU? Is there any response time or synchronization time that should be taken into account when using the dual-core Cortex-A53 CPU with MPI? nproc reports 6 available cores.

     int MyFun(int *array, int num_elements, int j)
     {
         int result_overall = 0;
         for (int i = 0; i < num_elements; i++) {
             result_overall += array[i] / 1000;
         }
         return result_overall;
     }

     int compute_sum(int *sub_sums, int num_of_cpu)
     {
         int sum = 0;
         for (int i = 0; i < num_of_cpu; i++) {
             sum += sub_sums[i];
         }
         return sum;
     }

     // measuring performance from main():
     if (world_rank == 0) {
         startTime = std::chrono::high_resolution_clock::now();
     }

     // Compute the sum of your subset
     int sub_sum = 0;
     for (int j = 0; j < 1000; j++) {
         sub_sum += MyFun(sub_intArray, num_elements_per_proc, world_rank);
     }

     // Collect every rank's partial sum on all ranks, then add them up
     MPI_Allgather(&sub_sum, 1, MPI_INT, sub_sums, 1, MPI_INT, MPI_COMM_WORLD);
     int total_sum = compute_sum(sub_sums, num_of_cpu);

     if (world_rank == 0) {
         elapsedTime = std::chrono::high_resolution_clock::now() - startTime;
         timer = elapsedTime.count();
     }