This CLI preprocesses the simulation data into the format required by the backend. The pipeline downloads the data, preprocesses it, saves the required particle properties, and generates the corresponding octree.
Clone this repository. Python >= 3.10 is required.
make venv
source venv/bin/activate # to activate the virtual environment
export TNG_TOKEN="..."
tng-sv-cli web download --simulation-name TNG50-4 --snapshot-idx NR
tng-sv-cli web preprocess --simulation-name TNG50-4 --snapshot-idx NR
tng-sv-cli web batch-download --simulation-name TNG50-4 --snapshot-idx NR
tng-sv-cli web batch-preprocess --simulation-name TNG50-4 --snapshot-idx NR
PYTHONPATH=. fastapi dev webScripts/api/backend.py --host 0.0.0.0 --port 9999
Go into the frontend repository. If necessary, adjust the `const url` in src/index.ts:139 to the backend IP and port.
npm install
npm run start
In order to use the application the backend needs access to preprocessed data.
This data is generated by the preprocessing pipeline, which can be started via
the tng-sv-cli.
Since it relates to the web application, the preprocessing command lives under
the `web` subcommand. Additionally, the API token required to download the data
has to be set. If the data is already present on the machine, the
`--data-path` flag can be used instead.
To see which flags exist, use:
$ tng-sv-cli web preprocess --help # For preprocessing one snapshot
$ tng-sv-cli web batch-preprocess --help # For preprocessing multiple snapshots
Less straightforward parameters are:
- `--filter-out-percentage`: pre-process only a certain percentage of the data, sorted by maximum value
- `--data-path`: use data already present on the machine, e.g. if the CLI is executed on a host of the IllustrisTNG project
- Download snapshots n and n+1: To pre-process a snapshot, i.e. an instant in time, the tool downloads the current snapshot and the next one. Since we interpolate each particle's trajectory between two time frames, we require its positions at times n and n+1.
- Filter out a specific percentage: Removes a given percentage of data points, ordered by the value of a chosen property, e.g. density
- Check which particles are contained in both snapshots
- Generate an octree over the particles we can interpolate:
The octree stores offsets into the splines, which are kept in a separate
array. This split is made for performance reasons: the octree serializes
to JSON, while the separate array can be saved as a NumPy array, which
makes reading and accessing faster than dumping everything into a single
JSON file.
- We use the Python API of the C++ Open3D implementation of an Octree
- For the spline calculation we refactored SciPy's CubicHermiteSpline so that we can use Numba's JIT compilation to speed up the pipeline.
- Calculate further properties:
- Attribute quantiles: precomputed quantiles allow a better user experience when filtering out certain quantiles of an attribute in the frontend
- Voronoi diameter: an approximation of the size of each Voronoi cell, computed from the known density and mass of the cell
- Save data to the disk
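The interpolation step above can be sketched as follows. This is a minimal illustration, not the project's refactored SciPy/Numba code: it matches particle IDs between two snapshots and computes cubic Hermite coefficients (`a`, `b`, `c`, `d`, mirroring the `splines_a`..`splines_d` arrays mentioned later) on a normalized time interval [0, 1]; in the real pipeline the velocities would additionally be scaled by the time step between snapshots.

```python
import numpy as np

def match_particles(ids_n, ids_np1):
    """Find particles contained in both snapshots (illustrative helper).

    Returns the common IDs and the indices into each snapshot's arrays.
    """
    common, idx_n, idx_np1 = np.intersect1d(ids_n, ids_np1, return_indices=True)
    return common, idx_n, idx_np1

def hermite_coefficients(p0, p1, v0, v1):
    """Cubic Hermite coefficients for p(t) = a + b*t + c*t^2 + d*t^3, t in [0, 1].

    Chosen so that p(0) = p0, p(1) = p1, p'(0) = v0, p'(1) = v1.
    Velocities are assumed to be pre-scaled to the unit interval.
    """
    a = p0
    b = v0
    c = 3 * (p1 - p0) - 2 * v0 - v1
    d = 2 * (p0 - p1) + v0 + v1
    return a, b, c, d

def evaluate(a, b, c, d, t):
    """Evaluate the spline at normalized time t."""
    return a + b * t + c * t**2 + d * t**3
```

With zero end velocities the spline reduces to the classic smoothstep between the two positions.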
CameraInformation:
- contains coordinates and size of the client's camera
ClientState:
- stores the already loaded node indices, level of detail, batch size, and percentage of data per leaf node, so that data can be loaded dynamically in batches
DataCache:
- class responsible for loading data efficiently:
- checks whether a snapshot is already loaded; if not, it fetches it from the server's filesystem
- keeps loaded data in the server-side cache
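The responsibilities of these classes can be sketched as below. This is a hypothetical minimal version, with illustrative field and method names, not the backend's actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class ClientState:
    """Per-client loading state (illustrative field names)."""
    loaded_node_indices: set = field(default_factory=set)
    level_of_detail: dict = field(default_factory=dict)  # node index -> LOD
    batch_size: int = 1000
    percentage: float = 1.0  # fraction of data to load per leaf node

class DataCache:
    """Keeps loaded snapshot data in memory, reading from disk only once."""

    def __init__(self, loader):
        self._loader = loader  # callable: (simulation, snap_id) -> data
        self._cache = {}

    def get(self, simulation, snap_id):
        key = (simulation, snap_id)
        if key not in self._cache:  # not loaded yet -> fetch from filesystem
            self._cache[key] = self._loader(simulation, snap_id)
        return self._cache[key]
```

Repeated requests for the same `(simulation, snap_id)` pair hit the in-memory cache instead of the filesystem.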
This endpoint initializes the visualization by providing metadata (number of quantiles and their data, all available snapshots and BoxSize) and fetching the initial simulation data.
How does it work?
- scans directories to identify available snapshots (
all_possible_snaps) by matching folder names with the pattern snapdir_ - extracts box size (
BoxSize) metadata from files matching the patterngroups_using the illustris library - uses the
DataCacheclass to check if the requested simulation and snapshot data (simulation,snap_id) is already cached - if cached, it retrieves the data directly. Otherwise, it:
- loads several data files (
splines,velocities,densities, etc) and structures from the filesystem based on the simulation and snapshot - prepares a
ListOfLeafsobject fromleafsandleafs_scanarrays - calculates density quantiles using the densities data
- caches the loaded data
- loads several data files (
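The directory scan can be sketched as a simple pattern match. A minimal sketch, assuming the snapshot folders follow the `snapdir_<number>` convention; in the backend the names would come from listing the simulation's output directory:

```python
import re

SNAPDIR_RE = re.compile(r"^snapdir_(\d+)$")

def find_available_snaps(folder_names):
    """Extract snapshot numbers from folder names matching 'snapdir_<n>'."""
    snaps = []
    for name in folder_names:
        m = SNAPDIR_RE.match(name)
        if m:
            snaps.append(int(m.group(1)))  # keep only the numeric suffix
    return sorted(snaps)
```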
What is the response? The response is a JSON that includes:
- `density_quantiles`: a list of quantile values derived from the density data
- `n_quantiles`: the number of quantiles available
- `available_snaps`: a list of all possible snapshot numbers for the simulation
- `BoxSize`: the size of the simulation box
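A payload with this shape might look as follows. The values below are made up for illustration; only the keys follow the description above:

```python
import json

# Illustrative init-endpoint payload; values are invented, keys match the docs.
init_response = {
    "density_quantiles": [0.1, 0.5, 2.3, 10.7],
    "n_quantiles": 4,
    "available_snaps": [33, 50, 67, 99],
    "BoxSize": 35000.0,
}
payload = json.dumps(init_response)
```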
This endpoint processes and retrieves spline data along with related information for a specific simulation and snapshot, filtered based on the client's camera view.
How does it work?
- retrieves cached data for the specified simulation and snapshot (`simulation`, `snap_id`) using the `DataCache` class. This data includes:
  - `octree`: for spatial hierarchy and node traversal
  - `splines`: cubic spline parameters
  - `velocities`, `densities`, `coordinates`, `voronoi_diameter_extended`
  - `particle_list_of_leafs`: data structure that maps particles to octree leaf nodes
- traverses the octree:
  - uses the client's camera position (from `CameraInformation`) to create a `ViewBox`, representing the region of interest in 3D space
  - traverses the octree to find intersecting nodes containing relevant particles (`node_indices`)
- filters and loads particles for each intersecting node:
  - retrieves particle IDs from `particle_list_of_leafs` based on the percentage of data (`client_state.percentage`) and the level of detail (LOD)
  - adjusts the range of particles per node based on `batch_size_lod`
- increases the level of detail:
  - updates the LOD for each node in the client state, ensuring the detail increases with each call
- extracts the relevant data:
  - splines: extracts the spline parameters (`splines_a`, `splines_b`, `splines_c`, `splines_d`)
  - physical properties: coordinates, velocities, densities, Voronoi diameters
  - calculates the minimum and maximum densities for the selected particles
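The `ViewBox` intersection test at the heart of the traversal is a standard axis-aligned bounding-box overlap check. A minimal sketch with illustrative names; the backend descends the octree and prunes whole subtrees rather than scanning a flat list:

```python
from dataclasses import dataclass

@dataclass
class ViewBox:
    """Axis-aligned box around the client's region of interest."""
    min_corner: tuple
    max_corner: tuple

    def intersects(self, node_min, node_max):
        # Two axis-aligned boxes intersect iff their ranges overlap on every axis.
        return all(
            self.min_corner[i] <= node_max[i] and node_min[i] <= self.max_corner[i]
            for i in range(3)
        )

def intersecting_node_indices(viewbox, nodes):
    """Collect indices of nodes whose bounds intersect the ViewBox.

    `nodes` is a flat list of (min_corner, max_corner) bounds here,
    standing in for the leaf nodes reached during octree traversal.
    """
    return [i for i, (lo, hi) in enumerate(nodes) if viewbox.intersects(lo, hi)]
```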
What is the response? The response is a JSON that includes:
- Data:
  - relevant particle IDs (`relevant_ids`)
  - coordinates, velocities, densities, splines, and Voronoi diameters for the selected particles
- Metadata:
  - updated `level_of_detail` for the nodes
  - density range (`min_density`, `max_density`)
  - total number of particles (`nParticles`)
  - density quantiles and the snapshot ID (`snapnum`)
- Updated
Structure:
- the octree starts with a root node that represents the entire bounding box (the space of interest).
- each node is recursively subdivided into eight smaller cubical regions (children), dividing the space into octants.
- subdivision continues up to a maximum depth or until each node contains fewer than a specified number of points (or other criteria are met).
Storage of Data:
- particles are stored in the leaf nodes. If a node contains more particles than the allowed threshold (`size_per_leaf`), it is further subdivided.
- each leaf node stores data such as particle indices and the values of relevant fields
Traversal:
- queries or operations (e.g. finding neighbors or retrieving data) involve traversing the octree from the root, descending into relevant nodes based on the spatial location of interest.
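The structure described above can be sketched in a few lines. This is a minimal illustration of the subdivision rule (`size_per_leaf`); the project itself uses Open3D's octree, not this code:

```python
import numpy as np

def build_octree(points, indices, lo, hi, size_per_leaf=8, max_depth=8, depth=0):
    """Recursively split an axis-aligned box into octants until every
    leaf holds at most `size_per_leaf` points or max depth is reached."""
    if len(indices) <= size_per_leaf or depth == max_depth:
        return {"leaf": True, "indices": indices.tolist()}
    mid = (lo + hi) / 2.0
    children = []
    for octant in range(8):
        # Bit k of the octant number selects the lower/upper half on axis k.
        sel = np.ones(len(indices), dtype=bool)
        c_lo, c_hi = lo.copy(), hi.copy()
        for axis in range(3):
            if (octant >> axis) & 1:
                sel &= points[indices, axis] >= mid[axis]
                c_lo[axis] = mid[axis]
            else:
                sel &= points[indices, axis] < mid[axis]
                c_hi[axis] = mid[axis]
        children.append(build_octree(points, indices[sel], c_lo, c_hi,
                                     size_per_leaf, max_depth, depth + 1))
    return {"leaf": False, "children": children}
```

Every point lands in exactly one octant per level, so the leaves partition the input set.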
A download is triggered as soon as one of the following conditions is met:
- download of the current ViewBox is finished for current percentage and LOD
- a leaf inside the ViewBox still has particles left at a LOD higher than the current one
- the particles of the current LOD from every leaf are downloaded and then the LOD is increased
- the latest loaded LOD is saved per leaf, so that after the ViewBox changes, downloading continues at the first LOD that has not been loaded yet for each leaf
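The LOD progression above can be sketched as a small scheduling function. A hypothetical helper with illustrative names, not the frontend's actual code:

```python
def next_batch(leaf_lods, viewbox_leafs, max_lod):
    """Decide which (leaf, LOD) pairs to request next.

    `leaf_lods` maps leaf index -> last LOD fully loaded (absent = nothing yet).
    Each visible leaf advances by exactly one LOD per call, so changing the
    ViewBox naturally resumes at each leaf's first unloaded LOD.
    """
    requests = []
    for leaf in viewbox_leafs:
        lod = leaf_lods.get(leaf, -1) + 1
        if lod <= max_lod:  # leaf still has particles left to load
            requests.append((leaf, lod))
    return requests
```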
Originally written by Nicolas Bender, Marc Burg, and Johannes Maul as part of a research project at Heidelberg University.
Supervised by Dylan Nelson and Filip Sadlo.
The write-up of the project is available as a PDF in this repository.