Condenser hanging -- debugging options? #22

@SimonGoring

I've created a public gist with my config file and a link to a dump of the database I'm running condenser against. The issue is that condenser appears to hang (over 24 hours with no new output in verbose mode), and I'm not sure how to debug it or how to tell whether anything is actually happening.

I'm running condenser as part of a broader workflow through a bash script:

#!/bin/bash
#
# A bash script that uses `condenser` to export a database subset to a
# `localhost` database, then dumps that database and packs the dump into a tar
# file.
#
# Simon Goring - May 12, 2021
#

# First we check to see if the condenser files actually exist.
if [[ ! -f db_connect.py ]]
then
    echo "Condenser does not exist in the current directory."
    pip install toposort
    pip install psycopg2-binary
    pip install mysql-connector-python
    git clone --depth=1 git@github.com:TonicAI/condenser.git .
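    # `!$` is interactive history expansion and is not expanded inside a script;
    # the clone target is the current directory, so remove its .git directly.
    rm -rf ./.git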
fi

# Clone the repo
#
# Remove the .git directory
#rm -rf !$/.git

export PGPASSWORD='DATABASE PASSWORD'
psql -h localhost -U postgres -c "CREATE DATABASE export;"
echo "SELECT 'DROP SCHEMA '||nspname||' CASCADE; CREATE SCHEMA '||nspname||';' FROM pg_catalog.pg_namespace WHERE NOT nspname ~ '.*_.*'" | \
    psql -h localhost -d export -U postgres -t | \
    psql -h localhost -d export -U postgres
python3 direct_subset.py -v
echo "SELECT 'DROP SCHEMA '||nspname||' CASCADE;' FROM pg_catalog.pg_namespace WHERE nspname = ANY('{ap,da,doi,ecg,emb,gen,ti,ts,tmp}')" | \
    psql -h localhost -d export -U postgres -t | \
    psql -h localhost -d export -U postgres
now=$(date +"%Y-%m-%d")
mkdir -p dumps
mkdir -p archives
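# -h takes the host name as its argument. `-o` (--oids) is not passed:
# pg_dump dropped that option in PostgreSQL 12.
pg_dump -Fc -O -h localhost -U postgres -v -d export > "./dumps/${1}_dump_${now}.sql"
tar -cvf "./archives/${1}_dump_${now}.tar" -C ./dumps "${1}_dump_${now}.sql"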
# -----------------------------------
# |  Clean up files and databases   |
# -----------------------------------
psql -h localhost -U postgres -c "DROP DATABASE export;"
rm "./dumps/${1}_dump_${now}.sql"
rmdir ./dumps

That's more of an FYI about how we're trying to use it, though. The key point is that we're just calling condenser with python3 direct_subset.py -v, and the config file is linked above in the gist.
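
Since the whole subsetting step is that one call, one way to make a stall easier to notice is to run it under a time limit and record start and end times in a log. The sketch below is an illustration under assumptions, not something condenser provides: `run_logged`, the log file name, and the time limit are hypothetical, and `timeout` is assumed to be the GNU coreutils command.

```shell
#!/bin/sh
# Hypothetical helper, not part of condenser: run a command under a time
# limit and append its output, plus start/end markers, to a log file.
run_logged() {
    log="$1"      # path of the log file to append to
    limit="$2"    # duration understood by GNU `timeout`, e.g. 30s or 6h
    shift 2

    printf '%s START %s\n' "$(date +"%Y-%m-%dT%H:%M:%S")" "$*" >> "$log"

    # GNU `timeout` exits with status 124 when the limit is reached.
    status=0
    timeout "$limit" "$@" >> "$log" 2>&1 || status=$?

    printf '%s END status=%s\n' "$(date +"%Y-%m-%dT%H:%M:%S")" "$status" >> "$log"
    return "$status"
}
```

In the script above, the condenser line could then become `run_logged condenser.log 6h python3 -u direct_subset.py -v`, with `tail -f condenser.log` running in a second terminal. The `-u` flag asks Python not to buffer its output, so the last line in the log is closer to what condenser was doing when it stalled.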

The goal of this issue is to note that there seems to be a point at which condenser is hanging, and to figure out a way to debug it so I can fix it.
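
One possible way to tell a stalled run from a slow one is to look at what PostgreSQL itself reports for the databases condenser is connected to. The sketch below assumes PostgreSQL 9.6 or later (for `wait_event` and `pg_blocking_pids`) and the same `localhost`/`postgres` connection used in the script; `activity_sql` and `show_activity` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting server-side activity while condenser runs.

# Print a query listing the sessions connected to the database named in $1.
# The name is pasted into the SQL as-is, so only pass trusted values.
activity_sql() {
    cat <<EOF
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - query_start   AS running_for,
       pg_blocking_pids(pid) AS blocked_by,
       left(query, 80)       AS query
FROM pg_stat_activity
WHERE datname = '$1'
  AND pid <> pg_backend_pid()
ORDER BY query_start;
EOF
}

# Run the query; connection settings mirror the script (PGPASSWORD is
# expected to be exported already).
show_activity() {
    activity_sql "$1" | psql -h localhost -U postgres -d postgres
}
```

`show_activity export` covers the destination database; the source database named in the config can be checked the same way. A row that stays `active` with a growing `running_for` suggests a long-running query, while a non-empty `blocked_by` points to a lock wait.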