-
Notifications
You must be signed in to change notification settings - Fork 148
Fix memory leak in efield.cpp and optimize surchem module performance #6832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
mohanchen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General speaking, you should divide 'bug fix' and 'refactor' into at least two separate PRs
| std::complex<double> *aux_G = new std::complex<double>[rho_basis->npw]; | ||
| double *aux_R = new double[rho_basis->nrxx]; | ||
|
|
||
| local_grad_rho(ucell, rho_basis, ps_totn, nablan, aux_G, aux_R); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not a good idea to invent a new function called 'local_grad_rho', for example, you don't add OPENMP acceleration code in the new 'local_grad_rho' function, and the developers need to maintain two sets of codes that share the similar functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void XC_Functional::grad_rho(const std::complex* rhog,
ModuleBase::Vector3* gdr,
const ModulePW::PW_Basis* rho_basis,
const double tpiba)
{
std::complex *gdrtmp = new std::complex[rho_basis->nmaxgr];
// the formula is : rho(r)^prime = \int iG * rho(G)e^{iGr} dG
for(int i = 0 ; i < 3 ; ++i)
{
// calculate the charge density gradient in reciprocal space.
#ifdef _OPENMP
#pragma omp parallel for schedule(static, 1024)
#endif
for(int ig=0; ig<rho_basis->npw; ig++) {
gdrtmp[ig] = ModuleBase::IMAG_UNIT * rhog[ig] * rho_basis->gcar[ig][i];
}
// bring the gdr from G --> R
rho_basis->recip2real(gdrtmp, gdrtmp);
// remember to multily 2pi/a0, which belongs to G vectors.
#ifdef _OPENMP
#pragma omp parallel for schedule(static, 1024)
#endif
for(int ir=0; ir<rho_basis->nrxx; ir++) {
gdr[ir][i] = gdrtmp[ir].real() * tpiba;
}
}
delete[] gdrtmp;
return;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above the original grad_rho code, which has OPENMP acceleration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment. I agree that maintaining a duplicated local_grad_rho is not a good idea.
Although benchmarking showed that the shared XC_Functional::grad_rho is slightly slower (~3%) than the specialized local version, this performance cost is negligible compared to the benefits of code maintainability and avoiding redundancy. I will proceed to delete the local functions and use the unified interface.
The previous memory leak was caused by induced_rho not being deleted in source/source_estate/module_pot/efield.cpp.
While fixing this issue, I also refactored several function calls in the surchem module and minimized matrix allocations inside loops.
Now, it seems that the efficiency of the surchem module has been improved by approximately 100%, and the memory leak has been resolved.
I merely test two examples. One is the optimization of ethanol in water, namely the example in #6794, where the time for cal_vel decreased from 1607.60s to 772.56s. The other one is the scf calculation of a Ag slab with cooh (adapted from http://bbs.keinsci.com/thread-57694-1-1.html), where the time for cal_vel decreased from 749.44s to 375.28s.
test.zip
I am not a specialist in quantum mechanics; I am merely trying to explore possible solutions with the assistance of AI. Therefore, I would greatly appreciate it if the reviewer could take the time to review my PR. If my PR does indeed resolve this issue, it would be a great honor for me.
Reminder
Linked Issue
Fix #...#6794
Unit Tests and/or Case Tests for my changes
What's changed?
Any changes of core modules? (ignore if not applicable)