Issue with Elasticsearch while building a testbed using compose-services

Hi !

I'm trying to build a testbed environment of Gen3 using docker-compose ('compose-services'). However, the following error appears in the guppy service, which suggests that esproxy-service is unavailable.

[00:04:24] INFO: [ES.initialize] getting mapping from elasticsearch...
guppy-service | (node:7) UnhandledPromiseRejectionWarning: Error: [ES.initialize] error getting mapping from ES index "file": connect ECONNREFUSED
guppy-service | at client.indices.getMapping.then.err (/guppy/dist/server/es/index.js:170:13)
guppy-service | at process._tickCallback (internal/process/next_tick.js:68:7)
guppy-service | (node:7) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
guppy-service | (node:7) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
guppy-service | [00:04:25] ERROR: [ES] elasticsearch cluster at http://esproxy-service:9200 is down!

However, esproxy-service is reachable when accessed standalone at http://esproxy-service:9200. I suspect a configuration error. Please suggest.
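For reference, this is roughly how I verified reachability (a sketch; the service names match my compose setup, and curl being installed inside the guppy image is an assumption):

```shell
# Check whether esproxy-service answers from inside the guppy container.
# -T: skip TTY allocation so this also works in scripts.
docker-compose exec -T guppy-service curl -s http://esproxy-service:9200/_cluster/health

# And from the host, against the published port (if 9200 is mapped):
curl -s http://localhost:9200/_cluster/health
```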

Hi @srrk! Welcome to the forum!

Could you share your configuration and logs? Here is a description of how to do it:

That would help us better understand what could be going wrong.

Hi @Viktorija,

Thanks for your quick response. I ran the command 'bash'. However, the command is taking a long time in the 'Dumping logs' stage. On exploration I found that the file 'logs-spark-service.txt' is being flooded incessantly with the following message.

spark-service | Re-format filesystem in Storage Directory root= /hadoop/hdfs/data/dfs/namenode; location= null ? (Y or N) Invalid input:

On further checking, I found that 'spark-service' itself is continuously logging the same message, and it is in fact eating up the disk space in the containers.
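As a possible stopgap (not yet tried), Docker Compose supports capping per-service log size with the `logging` key, which would at least keep the flood from filling the disk; the service name here mirrors my setup:

```yaml
# docker-compose.yml fragment: cap spark-service's json-file logs
# so runaway messages cannot exhaust disk space.
services:
  spark-service:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"   # rotate the log file after 10 MB
        max-file: "3"     # keep at most 3 rotated files
```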

Please check and suggest.

Hi @srrk,

I will consult with our developers on this, thank you for providing this info! I will update you when I have more information.

Hi @Viktorija,

I deleted all the container images using the 'rmi' option in docker-compose and started all services afresh. The spark-service error is no longer seen; however, the originally reported Elasticsearch issue has reappeared. I have the generated zip file of logs and config. Can you tell me how to share the zip file here on the forum?
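For reference, the purge-and-restart I ran was along these lines (flags per the docker-compose CLI; `--rmi all` removes the images the services use):

```shell
# Tear down the stack and remove its containers, networks, and images,
# then rebuild and start everything fresh in the background.
docker-compose down --rmi all
docker-compose up -d
```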


Oh, good to know, thanks for the update! You can upload the zip to any convenient resource of your choice, such as Dropbox or Google Drive, make it shareable, and share the link here.

Thanks. I'm sharing the Dropbox URL here.

Thank you for sharing your logs. Are you able to connect to https://localhost? (It may take several minutes for the portal to finally come up for the first time.)
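For example, from the host you can check the portal's HTTP status with curl (a sketch; `-k` skips verification of the local testbed's self-signed certificate):

```shell
# Print only the HTTP status code returned by the reverse proxy.
# 200 means the portal is up; connection refused means nginx is not ready yet.
curl -k -s -o /dev/null -w '%{http_code}\n' https://localhost/
```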

No luck. Elasticsearch is running on localhost:9200; however, nginx is still not active (after many hours). I dumped the logs again and am sharing them here. Please check.

Thank you for the update! I created a ticket to investigate this behavior, and I will keep you updated.

Hi @srrk!

For the nginx issue, our developers advise commenting out these lines in nginx.conf:

#        location /guppy/ {
#                proxy_pass http://guppy-service/;
#        }

The reason is that you don't have data in your commons yet, so guppy cannot be configured. Once you load data, you can configure guppy as described here.

For the spark-service issue you mentioned before:

spark-service | Re-format filesystem in Storage Directory root= /hadoop/hdfs/data/dfs/namenode; location= null ? (Y or N) Invalid input:

Sometimes this happens when spark is restarted without running docker-compose down. In this case, executing docker-compose down and then docker-compose up -d should help spark function normally.
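The clean restart sequence, spelled out (run from your compose-services checkout; the follow-up log check is a suggestion):

```shell
# Stop and remove the stack's containers and networks,
# then bring everything back up in the background.
docker-compose down
docker-compose up -d

# Tail spark-service afterwards to confirm the re-format prompt is gone.
docker-compose logs -f spark-service
```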

Hi !

Thanks for the update. The reverse proxy is now working; however, we are getting an 'Internal Server Error'. On investigation, we found the following error in the sheepdog service.

sheepdog-service | ERROR:root:Rolling back session [404] - program getschema not found
sheepdog-service | [2020-01-30 20:43:17,841][sheepdog.api][ ERROR] [404] - program getschema not found

Please suggest.

Hi @srrk!

I cannot reproduce this on my laptop. Could you send me your logs again?

Hi !

I restarted all the services after purging the containers. It's working now; I'm able to reach the data portal (windmill).



I'm glad it works now :smiley: