Feature/qgpu float #48

goodstudyqaq · 2025-11-12T08:29:09Z

No description provided.

goodstudyqaq · 2025-11-12T08:32:57Z

Benchmark

Copilot

Pull Request Overview

This PR converts floating-point precision from double to float throughout the QGPU codebase to optimize GPU performance, as GPUs typically perform better with single-precision arithmetic.

Key Changes:

Converted function signatures, struct members, and variable declarations from double to float across all core modules
Updated CUDA kernel functions and device code to use float precision
Modified memory allocation calls to use sizeof(float) instead of sizeof(double)

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
src/core/utils.h	Updated utility function signatures to use `float` parameters
src/core/utils.cu	Converted utility function implementations to `float`
src/core/system.h	Changed struct definitions and global variable declarations to `float`
src/core/system.cu	Updated system variables, functions, and added debug output for hot atoms
src/core/solvent.h	Modified solvent interaction function signatures to use `float`
src/core/solvent.cu	Converted solvent force calculations and GPU kernels to `float` precision
src/core/shake.cu	Updated SHAKE constraint algorithm to use `float`
src/core/restraints.cu	Converted restraint force calculations to `float`, changed numerical threshold
src/core/qatoms.h	Updated Q-atom function signatures to `float`
src/core/qatoms.cu	Converted Q-atom force calculations and GPU kernels to `float`
src/core/parse.cu	Changed lambda array allocation to `float`, refactored excluded atoms file reading
src/core/nonbonded.h	Updated non-bonded interaction function signatures to `float`
src/core/nonbonded.cu	Converted P-P interaction calculations to `float` precision
src/core/bonded.h	Modified bonded force function return types to `float`
src/core/bonded.cu	Converted bonded force calculations to `float`

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-12T08:33:21Z

src/core/utils.cu


-double to_radians(double degrees) {
+float to_radians(float degrees) {
    return degrees * (M_PI / 180.0);


Same issue as line 18: literals M_PI and 180.0 should use float versions to avoid double arithmetic.

Copilot · 2025-11-12T08:33:22Z

src/core/restraints.cu

            if (cos_th < -1) cos_th = -1;
            f0 = sin(acos(cos_th));
-            if (abs(f0) < 1.0E-12) f0 = 1.0E-12;
+            if (abs(f0) < 1.0E-6) f0 = 1.0E-6;


The threshold has been changed from 1.0E-12 to 1.0E-6. This is a significant change in numerical precision (6 orders of magnitude) that goes beyond the double-to-float conversion. Since float typically has ~7 decimal digits of precision, a threshold of 1.0E-6 might be appropriate, but this change should be intentional and documented. If this is just meant to be a type conversion, consider using 1.0E-12f instead to maintain the original threshold value (though values this small may lose meaning with float precision).

Suggested change

if (abs(f0) < 1.0E-6) f0 = 1.0E-6;

if (abs(f0) < 1.0E-12f) f0 = 1.0E-12f;

Copilot · 2025-11-12T08:33:22Z

src/core/parse.cu

+// --- dynamically read a full line ---
+    char *line = NULL;
+    size_t len = 0;
+    ssize_t read_len = getline(&line, &len, fp);
+    if (read_len == -1) {
+        fprintf(stderr, "Error: failed to read line from %s\n", path);
+        free(line);
+        fclose(fp);
+        exit(EXIT_FAILURE);
+    }

-    if (fgets(line, 8192, fp)) {
-        for (int i = 0; i < n_atoms; i++) {
-            bool excl = (line[i] == '1');
-            excluded[i] = excl;
-            if (excl) {
-                n_excluded++;
-            }
-        }
+    // --- parse ---
+    n_excluded = 0;
+    for (int i = 0; i < n_atoms && i < read_len; i++) {
+        bool excl = (line[i] == '1');
+        excluded[i] = excl;
+        if (excl) n_excluded++;
    }

+    printf("Number of excluded atoms: %d\n", n_excluded);
+
+    free(line);
    fclose(fp);
 }


The init_excluded function has been significantly refactored to use dynamic line reading with getline() instead of a fixed buffer. While this is a good improvement for handling files with many atoms, this change goes beyond the scope of the float conversion PR. Consider:

The loop condition i < read_len may be incorrect - it should likely be i < n_atoms only, since read_len includes the newline character

This functional change should ideally be in a separate PR focused on buffer safety improvements

Added printf on line 543 for debugging should be removed or made conditional

Copilot · 2025-11-12T08:33:22Z

src/core/utils.cu

+    v1 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );
+    v2 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );


When converting from double to float, literal constants should use the f suffix to avoid implicit double-to-float conversions. For example, 1. and 2. on lines 9-10 should be 1.f and 2.f. This ensures the compiler performs float arithmetic throughout rather than converting to double for the arithmetic operations and then back to float.

Suggested change

v1 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );

v2 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );

v1 = ( (float)(rand()) + 1.f )/( (float)(RAND_MAX) + 1.f );

v2 = ( (float)(rand()) + 1.f )/( (float)(RAND_MAX) + 1.f );

Copilot · 2025-11-12T08:33:23Z

src/core/utils.cu

-    v2 = ( (double)(rand()) + 1. )/( (double)(RAND_MAX) + 1. );
+    v1 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );
+    v2 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );
    nd10 = cos(2 * M_PI * v2) * sqrt(-2. * log(v1));


The constant literals -2. should be -2.f to avoid implicit double-to-float conversion and ensure consistent float arithmetic.

Suggested change

nd10 = cos(2 * M_PI * v2) * sqrt(-2. * log(v1));

nd10 = cos(2 * M_PI * v2) * sqrt(-2.f * log(v1));

Copilot · 2025-11-12T08:33:23Z

src/core/utils.cu


-double to_degrees(double radians) {
+float to_degrees(float radians) {
    return radians * (180.0 / M_PI);


The literals 180.0 and M_PI used in the conversion should be 180.0f. Note that M_PI is typically a double constant, so you may want to use M_PI with an explicit float cast: (float)M_PI or define a float version.

Copilot · 2025-11-12T08:33:23Z

src/core/system.cu

-    double mass_i;
+    float Temp_solute = 0, Tfree_solute = 0, Texcl_solute = 0;
+    float Tfree_solvent = 0, Temp_solvent = 0, Texcl_solvent = 0;
+    float Ekinmax = 1000.0 * Ndegf * Boltz * md.temperature / 2.0 / n_atoms;


The literal constant 1000.0 should be 1000.0f for consistent float arithmetic and to avoid unnecessary double-to-float conversions.

Suggested change

float Ekinmax = 1000.0 * Ndegf * Boltz * md.temperature / 2.0 / n_atoms;

float Ekinmax = 1000.0f * Ndegf * Boltz * md.temperature / 2.0 / n_atoms;

Copilot · 2025-11-12T08:33:24Z

src/core/parse.cu


+    printf("Number of excluded atoms: %d\n", n_excluded);


This printf statement appears to be debug output that should be removed before merging, or wrapped in a conditional compilation flag (e.g., #ifdef VERBOSE or #ifdef DEBUG).

Suggested change

printf("Number of excluded atoms: %d\n", n_excluded);

#ifdef DEBUG

printf("Number of excluded atoms: %d\n", n_excluded);

#endif

Copilot · 2025-11-12T08:33:24Z

src/core/system.cu

+            printf(">>> WARNING: hot atom %d, %f %f %f %f %f %f: %f %f %f %f %f\n", i, velocities[i].x, velocities[i].y, velocities[i].z, 
+            coords[i].x, coords[i].y, coords[i].z, ener, ener/Boltz/3, Ekinmax, mass_i, Ekinmax / (0.5 * mass_i));
+            // printf(">>> WARNING: hot atom %d: %f\n", i, ener/Boltz/3);
        }


[nitpick] This appears to be debug code that was uncommented during development. The extensive printf with many parameters (lines 732-734) and the commented-out original warning (line 734) suggest this was temporary debugging. Consider cleaning this up: either commit to the new detailed warning format or revert to the simpler original message.

Copilot · 2025-11-12T08:33:24Z

src/core/system.cu

+            printf(">>> WARNING: hot atom %d, %f %f %f %f %f %f: %f %f %f %f %f\n", i, velocities[i].x, velocities[i].y, velocities[i].z, 
+            coords[i].x, coords[i].y, coords[i].z, ener, ener/Boltz/3, Ekinmax, mass_i, Ekinmax / (0.5 * mass_i));
+            // printf(">>> WARNING: hot atom %d: %f\n", i, ener/Boltz/3);
        }


[nitpick] Similar to the solute section above, this appears to be debug code with the extensive printf and commented-out original warning. The same cleanup recommendation applies.

goodstudyqaq · 2025-11-12T08:33:44Z

Average energy of the last 10000 steps for 30A water ball.

goodstudyqaq · 2025-11-12T08:40:09Z

Stable Test

`[Run 1/20] Setting random_seed=1
[Run 1/20] Launching qdyn … [Run 1/20] Tail average over last 10000 Utot entries: -23262.058859
[Run 2/20] Setting random_seed=2
[Run 2/20] Launching qdyn …
[Run 2/20] Tail average over last 10000 Utot entries: -23207.679402
[Run 3/20] Setting random_seed=3
[Run 3/20] Launching qdyn …
[Run 3/20] Tail average over last 10000 Utot entries: -23242.715745
[Run 4/20] Setting random_seed=4
[Run 4/20] Launching qdyn …
[Run 4/20] Tail average over last 10000 Utot entries: -23221.516130
[Run 5/20] Setting random_seed=5
[Run 5/20] Launching qdyn …
[Run 5/20] Tail average over last 10000 Utot entries: -23215.353644
[Run 6/20] Setting random_seed=6
[Run 6/20] Launching qdyn …
[Run 6/20] Tail average over last 10000 Utot entries: -23208.358715
[Run 7/20] Setting random_seed=7
[Run 7/20] Launching qdyn …
[Run 7/20] Tail average over last 10000 Utot entries: -23222.405878
[Run 8/20] Setting random_seed=8
[Run 8/20] Launching qdyn …
[Run 8/20] Tail average over last 10000 Utot entries: -23218.269533
[Run 9/20] Setting random_seed=9
[Run 9/20] Launching qdyn …
[Run 9/20] Tail average over last 10000 Utot entries: -23228.857007
[Run 10/20] Setting random_seed=10
[Run 10/20] Launching qdyn …
[Run 10/20] Tail average over last 10000 Utot entries: -23209.624296
[Run 11/20] Setting random_seed=11
[Run 11/20] Launching qdyn …
[Run 11/20] Tail average over last 10000 Utot entries: -23226.275011
[Run 12/20] Setting random_seed=12
[Run 12/20] Launching qdyn …
[Run 12/20] Tail average over last 10000 Utot entries: -23225.276748
[Run 13/20] Setting random_seed=13
[Run 13/20] Launching qdyn …
[Run 13/20] Tail average over last 10000 Utot entries: -23233.839956
[Run 14/20] Setting random_seed=14
[Run 14/20] Launching qdyn …
[Run 14/20] Tail average over last 10000 Utot entries: -23178.009075
[Run 15/20] Setting random_seed=15
[Run 15/20] Launching qdyn …
[Run 15/20] Tail average over last 10000 Utot entries: -23240.068374
[Run 16/20] Setting random_seed=16
[Run 16/20] Launching qdyn …
[Run 16/20] Tail average over last 10000 Utot entries: -23197.300882
[Run 17/20] Setting random_seed=17
[Run 17/20] Launching qdyn …
[Run 17/20] Tail average over last 10000 Utot entries: -23218.422119
[Run 18/20] Setting random_seed=18
[Run 18/20] Launching qdyn …
[Run 18/20] Tail average over last 10000 Utot entries: -23221.814738
[Run 19/20] Setting random_seed=19
[Run 19/20] Launching qdyn …
[Run 19/20] Tail average over last 10000 Utot entries: -23249.873881
[Run 20/20] Setting random_seed=20
[Run 20/20] Launching qdyn …
[Run 20/20] Tail average over last 10000 Utot entries: -23228.768666

Summary across runs

Runs completed: 20
Average of per-run tail averages: -23222.824433
Standard error: 4.150652
No '>>> WARNING' lines detected in any run. `

goodstudyqwq added 2 commits November 7, 2025 09:54

add log for debug

cb2672b

change double to float

7d1cc82

goodstudyqaq requested a review from Copilot November 12, 2025 08:29

Copilot started reviewing on behalf of goodstudyqaq November 12, 2025 08:30 View session

Copilot finished reviewing on behalf of goodstudyqaq November 12, 2025 08:31

Copilot AI reviewed Nov 12, 2025

View reviewed changes

add comment

9942e89

	if (abs(f0) < 1.0E-6) f0 = 1.0E-6;
	if (abs(f0) < 1.0E-12f) f0 = 1.0E-12f;

		v1 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );
		v2 = ( (float)(rand()) + 1. )/( (float)(RAND_MAX) + 1. );

	nd10 = cos(2 * M_PI * v2) * sqrt(-2. * log(v1));
	nd10 = cos(2 * M_PI * v2) * sqrt(-2.f * log(v1));

	float Ekinmax = 1000.0 * Ndegf * Boltz * md.temperature / 2.0 / n_atoms;
	float Ekinmax = 1000.0f * Ndegf * Boltz * md.temperature / 2.0 / n_atoms;

Feature/qgpu float #48

Are you sure you want to change the base?

Feature/qgpu float #48

Uh oh!

Conversation

goodstudyqaq commented Nov 12, 2025

Uh oh!

goodstudyqaq commented Nov 12, 2025

Benchmark

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

Uh oh!

goodstudyqaq commented Nov 12, 2025

Average energy of the last 10000 steps for 30A water ball.

Uh oh!

goodstudyqaq commented Nov 12, 2025

Stable Test

Summary across runs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants