Staging Debugging Guide
Jorge Silva edited this page Mar 30, 2016 · 11 revisions
So you yell and you yell and no one is fixing staging. Well, now you can debug it yourself :)
- Go to runnable.io.
- Ensure rabbitMQ, redis, and mongo are up. If any of these are down, simply click start container (DO NOT REBUILD THESE CONTAINERS). But if you did ....
  - mongo flushed
  - redis flushed
- Ensure supporting services are up:
  - api
  - api-worker `WORKER-do-not-delete` branch (updated to latest code: `git fetch && git merge origin/master`) and restart container
  - runnable-angular
  - mavis
    - hit http://mavis-staging-codenow.runnableapp.com/docks to ensure docks are available
  - optimus
  - to get navi to work you have to do something special: using navi on staging

To ensure things are up do the following:
- Go to terminal of the service
- `curl` the respective port, example: `curl localhost:80` for api:
  1. If you do not see any output of curl (or it hangs) then start/stop or click latest commit.
- Run `ip addr | grep ethwe` and ensure the output contains `ethwe`. If not, rebuild container.
- Inspect the logs by opening a terminal and running:
  1. `npm install bunyan -g`
  2. `LOG_LEVEL_STDOUT=trace npm start | bunyan`
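The curl and ethwe checks above can be sketched as two small shell helpers. This is only a sketch: the function names `check_port` and `has_ethwe` and the 5-second timeout are assumptions for illustration, not part of the staging tooling.

```shell
# Sketch of the per-service checks; meant to be pasted into the service's terminal.

# check_port PORT: succeeds if anything answers on localhost:PORT within
# 5 seconds (the timeout is an assumed value). Any HTTP response counts;
# only a refused connection or a hang is treated as failure.
check_port() {
  curl --silent --show-error --max-time 5 --output /dev/null "http://localhost:$1"
}

# has_ethwe: reads `ip addr` output on stdin and succeeds if an ethwe
# interface is listed.
has_ethwe() {
  grep -q 'ethwe'
}
```

Possible usage inside the api container: `check_port 80 || echo "no response: start/stop or click latest commit"` and `ip addr | has_ethwe || echo "no ethwe: rebuild container"`.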
Things to check:
- Go to `runnable.io`
- Look for `WORKER-DO-NOT-DELETE` branch in api.
- Make sure it's green.
- Go into CMD logs and see if there are any errors.
- Go to `runnable.io`
- Look for following repos to be green:
  - Swarm
  - Sauron
  - docker-listener
  - Neo4j
- SSH into `delta-staging-data` (`ssh delta-staging-data`)
- Run `sudo docker ps`
- Check that all of the following are running
  - MongoDB
  - Redis
  - RabbitMQ
  - Consul
  - Vault
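Checking the `sudo docker ps` output by eye can be replaced with a small filter. The sketch below assumes each service shows up in the output with a recognizable substring in its image or container name (`mongo`, `redis`, `rabbit`, `consul`, `vault`); the actual names on `delta-staging-data` may differ, and the function name `missing_services` is hypothetical.

```shell
# missing_services: reads `docker ps` output on stdin and prints one line for
# each expected data service that does not appear anywhere in it.
# The substrings below are assumptions about how the containers are named.
missing_services() {
  ps_output=$(cat)
  for svc in mongo redis rabbit consul vault; do
    # Case-insensitive substring match against the whole listing.
    printf '%s\n' "$ps_output" | grep -qi "$svc" || printf '%s\n' "$svc"
  done
}
```

Possible usage: `sudo docker ps | missing_services`. No output means all five were found.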
[2016-03-30T18:47:10.179Z] WARN: api/12 on c7263d6ab6fa (/api/node_modules/ponos/lib/worker.js:231 in unknownErrRetry): Task failed, retrying (environment=staging, module=lib/models/rabbitmq/index.js, queue=on-image-builder-container-create, nextAttemptDelay=1048576)
Error: Container action start failed: connect EHOSTUNREACH
at Object.exports.create (/api/node_modules/dat-middleware/node_modules/boom/lib/index.js:21:17)
at Docker.<anonymous> (/api/lib/models/apis/docker.js:1112:17)
at Object.callback (/api/lib/models/apis/docker.js:1067:49)
at /api/node_modules/dockerode/lib/container.js:180:10
at done (/api/node_modules/dogerode/index.js:28:7)
at Modem.buildPayload (/api/node_modules/docker-modem/lib/modem.js:225:19)
at ClientRequest.<anonymous> (/api/node_modules/docker-modem/lib/modem.js:210:10)
at ClientRequest.EventEmitter.emit (events.js:95:17)
at CleartextStream.socketErrorListener (http.js:1547:9)
at CleartextStream.EventEmitter.emit (events.js:95:17)
at Socket.onerror (tls.js:1445:17)
at Socket.EventEmitter.emit (events.js:117:20)
at net.js:440:14
at process._tickDomainCallback (node.js:463:13)
If you see this error, it is probably the API worker failing to connect to docker. Either swarm is down, or there is some kind of DNS issue with the swarm container URL (swarm-staging-codenow.runnableapp.com).
- Go to the swarm container in runnable.io
- Go to Terminal
- Type `docker info`
- You should see a list similar to this with more than one container and more than one image:
root@7785a0fe6162:/sauron# docker info
Containers: 156
Images: 54
Storage Driver:
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 4
ip-10-8-166-119.2335750: 10.8.166.119:4242
└ Status: Healthy
└ Containers: 60
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 8.187 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.13.0-79-generic, operatingsystem=Ubuntu 14.04.4 LTS, org=2335750, storagedriver=aufs
└ Error: (none)
└ UpdatedAt: 2016-03-30T18:44:59Z
...
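The "more than one container and more than one image" check can be sketched as a filter over `docker info` output. It assumes the output format shown above, where the top-level `Containers:` and `Images:` lines start at column one; the function name `docker_info_ok` is hypothetical.

```shell
# docker_info_ok: reads `docker info` output on stdin and succeeds only if the
# top-level Containers and Images counts are both greater than one.
# Per-node lines such as "└ Containers: 60" are ignored because their first
# field is not exactly "Containers".
docker_info_ok() {
  awk -F': *' '
    $1 == "Containers" { containers = $2 }
    $1 == "Images"     { images = $2 }
    END { exit !(containers > 1 && images > 1) }
  '
}
```

Possible usage in the swarm terminal: `docker info | docker_info_ok || echo "swarm looks unhealthy"`. The same filter should also work on `sudo docker info` output from a dock, as long as the format matches.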
- Using docks CLI, list all the docks for staging: `docks list -e stage` or `docks aws list -e stage`
- SSH into one of those docks
- Type `sudo docker info`
- You should see a list similar to this with more than one container and more than one image:
root@7785a0fe6162:/sauron# docker info
Containers: 156
Images: 54
Things to check