mchandra (Contributor) commented Oct 9, 2017

Transfer data between ArrayFire, NumPy and PETSc (af <-> np <-> petsc) without reordering.

Initial testing:

$ AF_OPENCL_DEFAULT_DEVICE_TYPE=CPU python test_af_np_petsc_data_transfer.py 
ArrayFire v3.6.0 (OpenCL, 64-bit Linux, build d9bc8d7)
-0- NVIDIA: Quadro M1000M, 2047 MB
[1] INTEL: Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz, 31993 MB
---------------------
N_q1 + 2*N_ghost = 70
N_q2 + 2*N_ghost = 134
dof              = 4096
---------------------

af_array_old.shape =  (70, 134, 4096)
af_array_new.shape =  (4096, 70, 134)
 
comm_old =  2.8967 secs/iter
comm_new =  0.8536 secs/iter
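
A minimal NumPy-only sketch of why putting dof on the first axis can remove the reorder step. It rests on assumptions not stated in this thread: that the flat buffer handed to PETSc wants dof varying fastest (then q1, then q2), and that the array is stored column-major, as ArrayFire stores its arrays. The small sizes and the helper `flat_index` are hypothetical stand-ins, not code from this PR.

```python
import numpy as np

# Small stand-in sizes; the test above uses 70 x 134 x 4096.
n_q1, n_q2, dof = 5, 7, 4

def flat_index(d, i, j):
    """Assumed target layout: dof fastest, then q1, then q2."""
    return d + dof * (i + n_q1 * j)

# Reference data with a unique value per (d, i, j).
d, i, j = np.meshgrid(
    np.arange(dof), np.arange(n_q1), np.arange(n_q2), indexing="ij"
)
values = 10000 * d + 100 * i + j  # shape (dof, n_q1, n_q2)

# New layout: (dof, n_q1, n_q2), column-major.
# Flattening in Fortran order is a view, so no data is moved.
new = np.asfortranarray(values)
flat_new = new.ravel(order="F")

# Old layout: (n_q1, n_q2, dof), column-major.
# Reaching the same flat buffer needs a transpose plus a copy.
old = np.asfortranarray(values.transpose(1, 2, 0))
flat_old = np.asfortranarray(old.transpose(2, 0, 1)).ravel(order="F")

print("same flat buffer:    ", np.array_equal(flat_new, flat_old))
print("new layout is a view:", np.shares_memory(flat_new, new))
print("old layout is a view:", np.shares_memory(flat_old, old))
```

In this sketch the old layout pays for one full copy of the array per transfer, which is the kind of cost the comm_old timing would include if the assumptions hold.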
 

shyams2 and others added 7 commits September 22, 2017 11:08
…add additional

headers in test files for petsc to read command line arguments.
2) test_compute_electrostatic_fields.py is in flux

Run with : python test_compute_electrostatic_fields.py -ksp_monitor
* First attempt: managed to solve x^2 - 2 = 0 over [63, 63] grid using SNES
  * Periodic BCs work once background density has been subtracted correctly
mchandra changed the title from "Perf boost" to "Perf boost [WIP]" on Oct 9, 2017
shyams2 (Contributor) commented Oct 10, 2017

Results when run on Savio:

ArrayFire v3.6.0 (OpenCL, 64-bit Linux, build 4a60571)
[0] NVIDIA: Tesla K80, 11439 MB
-1- INTEL: Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz, 64449 MB
-2- AMD: Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz, 64449 MB
---------------------
N_q1 + 2*N_ghost = 70
N_q2 + 2*N_ghost = 134
dof              = 4096
---------------------

af_array_old.shape =  (70, 134, 4096)
af_array_new.shape =  (4096, 70, 134)
 
comm_old =  0.7544 secs/iter
comm_new =  0.2407 secs/iter
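
For reference, the speedups implied by the two sets of timings in this thread can be worked out as below. The dictionary labels are only descriptive names chosen here. Note that the two runs show different active devices (the Intel CPU in the Oct 9 run, the Tesla K80 in the Savio run), so only the old/new ratio within each run is meaningful, not a comparison across machines.

```python
# (comm_old, comm_new) in secs/iter, copied from the outputs in this thread.
timings = {
    "Oct 9 run (Intel CPU device)": (2.8967, 0.8536),
    "Oct 10 run on Savio (Tesla K80 device)": (0.7544, 0.2407),
}

# Speedup = old time per iteration / new time per iteration.
speedups = {name: old / new for name, (old, new) in timings.items()}

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.1f}x faster")
```

Both runs come out a little over 3x.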
