Local installation - Workspace says 'Authorization required'

I'm looking into setting up a data commons to host data for a collaborative project my lab's PI is involved in. Following the docker-compose instructions, I have a local installation running on my laptop, but the Workspace tab takes me to a page that reports '401 Authorization Required'.

I can see that the Jupyter service is running with healthy status, and everything else seems to be working as I expected for this fresh instance. Running docker logs jupyter-service returns:

++++
Container must be run with group "root" to update passwd file
Executing the command: jupyter notebook
[W 19:45:48.031 NotebookApp] base_project_url is deprecated, use base_url
[I 19:45:48.070 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
[W 19:45:49.897 NotebookApp] All authentication is disabled. Anyone who can connect to this server will be able to run code.
[I 19:45:50.227 NotebookApp] JupyterLab extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 19:45:50.227 NotebookApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 19:45:50.295 NotebookApp] Serving notebooks from local directory: /home/jovyan
[I 19:45:50.296 NotebookApp] The Jupyter Notebook is running at:
[I 19:45:50.296 NotebookApp] http://(9b10e38926d2 or 127.0.0.1):8888/lw-workspace/
[I 19:45:50.296 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
++++

Can anyone help me figure out what I've done wrong?

Thanks,
John Martin

Hi @jmartin!
Thank you for your question. I asked the Gen3 devs about this and will get back to you soon.

Could you copy-paste your user.yaml and docker-compose.yml files? Many thanks!

user.yaml

authz:
  # policies automatically given to anyone, even if they are not authenticated
  anonymous_policies:
  - open_data_reader

  # policies automatically given to authenticated users (in addition to their other policies)
  all_users_policies: []

  groups:
  # can CRUD programs and projects and upload data files
  - name: data_submitters
    policies:
    - services.sheepdog-admin
    - data_upload
    - MyFirstProject_submitter
    users:
    - jmartin77777@gmail.com

  # can create/update/delete indexd records
  - name: indexd_admins
    policies:
    - indexd_admin
    users:
    - jmartin77777@gmail.com

  resources:
  - name: workspace
  - name: data_file
  - name: services
    subresources:
    - name: sheepdog
      subresources:
      - name: submission
        subresources:
        - name: program
        - name: project
  - name: open
  - name: programs
    subresources:
    - name: TBRU
      subresources:
      - name: projects
        subresources:
        - name: Aim1

  policies:
  - id: workspace
    description: be able to use workspace
    resource_paths:
    - /workspace
    role_ids:
    - workspace_user
  - id: data_upload
    description: upload raw data files to S3
    role_ids:
    - file_uploader
    resource_paths:
    - /data_file
  - id: services.sheepdog-admin
    description: CRUD access to programs and projects
    role_ids:
      - sheepdog_admin
    resource_paths:
      - /services/sheepdog/submission/program
      - /services/sheepdog/submission/project
  - id: indexd_admin
    description: full access to indexd API
    role_ids:
      - indexd_admin
    resource_paths:
      - /programs
  - id: open_data_reader
    role_ids:
      - reader
      - storage_reader
    resource_paths:
    - /open
  - id: all_programs_reader
    role_ids:
    - reader
    - storage_reader
    resource_paths:
    - /programs
  - id: MyFirstProject_submitter
    role_ids:
    - reader
    - creator
    - updater
    - deleter
    - storage_reader
    - storage_writer
    resource_paths:
    - /programs/TBRU/projects/Aim1

  roles:
  - id: file_uploader
    permissions:
    - id: file_upload
      action:
        service: fence
        method: file_upload
  - id: workspace_user
    permissions:
    - id: workspace_access
      action:
        service: jupyterhub
        method: access
  - id: sheepdog_admin
    description: CRUD access to programs and projects
    permissions:
    - id: sheepdog_admin_action
      action:
        service: sheepdog
        method: '*'
  - id: indexd_admin
    description: full access to indexd API
    permissions:
    - id: indexd_admin
      action:
        service: indexd
        method: '*'
  - id: admin
    permissions:
      - id: admin
        action:
          service: '*'
          method: '*'
  - id: creator
    permissions:
      - id: creator
        action:
          service: '*'
          method: create
  - id: reader
    permissions:
      - id: reader
        action:
          service: '*'
          method: read
  - id: updater
    permissions:
      - id: updater
        action:
          service: '*'
          method: update
  - id: deleter
    permissions:
      - id: deleter
        action:
          service: '*'
          method: delete
  - id: storage_writer
    permissions:
      - id: storage_creator
        action:
          service: '*'
          method: write-storage
  - id: storage_reader
    permissions:
      - id: storage_reader
        action:
          service: '*'
          method: read-storage

clients:
  wts:
    policies:
    - all_programs_reader
    - open_data_reader

users:
  jmartin77777@gmail.com: {}
  username2:
    tags:
      name: John Doe
      email: johndoe@gmail.com
    policies:
    - MyFirstProject_submitter

cloud_providers: {}
groups: {}

docker-compose.yml

version: '3'
services:
  postgres:
    image: postgres:9.6
    networks:
      - devnet
    volumes:
      - "psqldata:/var/lib/postgresql/data"
      - "./scripts/postgres_init.sql:/docker-entrypoint-initdb.d/postgres_init.sql"
    restart: unless-stopped
    healthcheck:
        test: ["CMD-SHELL", "psql -U fence_user -d fence_db -c 'SELECT 1;'"]
        interval: 60s
        timeout: 5s
        retries: 3
    environment:
      - POSTGRES_PASSWORD=postgres
    #
    # uncomment this to make postgres available from the container host - ex:
    #    psql -h localhost -d fence -U fence_user
    ports:
      - 5432:5432
  indexd-service:
    image: "quay.io/cdis/indexd:2020.07"
    command: bash indexd_setup.sh
    container_name: indexd-service
    networks:
      - devnet
    volumes:
      - ./Secrets/indexd_settings.py:/var/www/indexd/local_settings.py
      - ./Secrets/indexd_creds.json:/var/www/indexd/creds.json
      - ./Secrets/config_helper.py:/var/www/indexd/config_helper.py
      - ./scripts/indexd_setup.sh:/var/www/indexd/indexd_setup.sh
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 3
    depends_on:
      - postgres
  fence-service:
    image: "quay.io/cdis/fence:2020.07"
    command: bash /var/www/fence/fence_setup.sh
    container_name: fence-service
    networks:
      - devnet
    volumes:
      - ./Secrets/fence-config.yaml:/var/www/fence/fence-config.yaml
      - ./Secrets/user.yaml:/var/www/fence/user.yaml
      - ./Secrets/TLS/service.crt:/usr/local/share/ca-certificates/cdis-ca.crt
      - ./Secrets/fenceJwtKeys:/fence/keys
      - ./scripts/fence_setup.sh:/var/www/fence/fence_setup.sh
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 3
    environment:
      - PYTHONPATH=/var/www/fence
    depends_on:
      - postgres
  arborist-service:
    image: "quay.io/cdis/arborist:2020.07"
    container_name: arborist-service
    entrypoint: bash /go/src/github.com/uc-cdis/arborist/arborist_setup.sh
    networks:
      - devnet
    volumes:
      - ./scripts/arborist_setup.sh:/go/src/github.com/uc-cdis/arborist/arborist_setup.sh
    environment:
      - JWKS_ENDPOINT=http://fence-service/.well-known/jwks
      - PGDATABASE=arborist_db
      - PGUSER=arborist_user
      - PGPASSWORD=arborist_pass
      - PGHOST=postgres
      - PGPORT=5432
      - PGSSLMODE=disable
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/health"]
      interval: 60s
      timeout: 5s
      retries: 10
    depends_on:
      - postgres
  peregrine-service:
    image: "quay.io/cdis/peregrine:2020.07"
    container_name: peregrine-service
    networks:
      - devnet
    volumes:
      - ./Secrets/peregrine_settings.py:/var/www/peregrine/wsgi.py
      - ./Secrets/peregrine_creds.json:/var/www/peregrine/creds.json
      - ./Secrets/config_helper.py:/var/www/peregrine/config_helper.py
      - ./Secrets/TLS/service.crt:/usr/local/share/ca-certificates/cdis-ca.crt
      - ./scripts/peregrine_setup.sh:/peregrine_setup.sh
      - ./datadictionary/gdcdictionary/schemas:/schemas_dir
    environment: &env
      DICTIONARY_URL: https://s3.amazonaws.com/dictionary-artifacts/datadictionary/develop/schema.json
      # PATH_TO_SCHEMA_DIR: /schemas_dir
      REQUESTS_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      # give peregrine some extra time to startup
      retries: 10
    depends_on:
      - postgres
      - sheepdog-service
  sheepdog-service:
    image: "quay.io/cdis/sheepdog:2020.07"
    command: bash /sheepdog_setup.sh
    container_name: sheepdog-service
    networks:
      - devnet
    volumes:
      - ./Secrets/sheepdog_settings.py:/var/www/sheepdog/wsgi.py
      - ./Secrets/sheepdog_creds.json:/var/www/sheepdog/creds.json
      - ./Secrets/config_helper.py:/var/www/sheepdog/config_helper.py
      - ./scripts/sheepdog_setup.sh:/sheepdog_setup.sh
      - ./datadictionary/gdcdictionary/schemas:/schemas_dir
    environment: *env
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 5
    depends_on:
      - postgres
  guppy-service:
    image: "quay.io/cdis/guppy:2020.07"
    container_name: guppy-service
    networks:
      - devnet
    volumes:
      - ./Secrets/guppy_config.json:/guppy/guppy_config.json
    environment:
      - GUPPY_CONFIG_FILEPATH=/guppy/guppy_config.json
      - GEN3_ARBORIST_ENDPOINT=http://arborist-service
      - GEN3_ES_ENDPOINT=http://esproxy-service:9200
    depends_on:
      - arborist-service
      - esproxy-service
  esproxy-service:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.4
    container_name: esproxy-service
    environment:
      - cluster.name=elasticsearch-cluster
      - bootstrap.memory_lock=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      - devnet
  pidgin-service:
    image: "quay.io/cdis/pidgin:2020.07"
    container_name: pidgin-service
    networks:
      - devnet
    volumes:
      - ./scripts/waitForContainers.sh:/var/www/data-portal/waitForContainers.sh
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 3
    depends_on:
      - peregrine-service
  portal-service:
    image: "quay.io/cdis/data-portal:2020.07"
    container_name: portal-service
    command: ["bash", "/var/www/data-portal/waitForContainers.sh"]
    networks:
      - devnet
    volumes:
      - ./scripts/waitForContainers.sh:/var/www/data-portal/waitForContainers.sh
      - ./Secrets/gitops.json:/data-portal/data/config/gitops.json
      - ./Secrets/gitops-logo.png:/data-portal/custom/logo/gitops-logo.png
      - ./Secrets/gitops.png:/data-portal/custom/createdby/gitops.png
    environment:
      - NODE_ENV=dev
      #- MOCK_STORE=true
      - APP=gitops
      - GDC_SUBPATH=http://revproxy-service/api/v0/submission/

    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost"]
      interval: 60s
      timeout: 5s
      retries: 10
    depends_on:
      - postgres
      - peregrine-service
      - sheepdog-service
  jupyter-service:
    image: "quay.io/occ_data/jupyternotebook:1.7.2"
    #image: jupyter/minimal-notebook
    container_name: jupyter-service
    networks:
      - devnet
    volumes:
      - ./scripts/jupyter_config.py:/home/jovyan/.jupyter/jupyter_notebook_config.py
  revproxy-service:
    image: "quay.io/cdis/nginx:1.15.5-ctds"
    container_name: revproxy-service
    networks:
      - devnet
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./Secrets/TLS/service.crt:/etc/nginx/ssl/nginx.crt
      - ./Secrets/TLS/service.key:/etc/nginx/ssl/nginx.key
    ports:
      - "80:80"
      - "443:443"
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost"]
      interval: 60s
      timeout: 5s
      retries: 3
    depends_on:
      - indexd-service
      - peregrine-service
      - sheepdog-service
      - fence-service
      - portal-service
      - pidgin-service
  tube-service:
    image: "quay.io/cdis/tube:2020.07"
    container_name: tube-service
    command: bash -c "while true; do sleep 5; done"
    networks:
      - devnet
    environment:
      - DICTIONARY_URL=https://s3.amazonaws.com/dictionary-artifacts/datadictionary/develop/schema.json
      - ES_URL=esproxy-service
      - ES_INDEX_NAME=etl
      - HADOOP_URL=hdfs://spark-service:9000
      - HADOOP_HOST=spark-service
    volumes:
      - ./Secrets/etl_creds.json:/usr/share/gen3/tube/creds.json
      - ./Secrets/etlMapping.yaml:/usr/share/gen3/tube/etlMapping.yaml
      - ./Secrets/user.yaml:/usr/share/gen3/tube/user.yaml
    depends_on:
      - postgres
      - esproxy-service
      - spark-service
  spark-service:
    image: "quay.io/cdis/gen3-spark:2020.07"
    container_name: spark-service
    command: bash -c "python run_config.py && hdfs namenode -format && hdfs --daemon start namenode && hdfs --daemon start datanode && yarn --daemon start resourcemanager && yarn --daemon start nodemanager && hdfs dfsadmin -safemode leave &&  hdfs dfs -mkdir /result && while true; do sleep 5; done"
    expose:
      - 22
      - 8030
      - 8031
      - 8032
      - 9000
    networks:
      - devnet
    environment:
      - HADOOP_URL=hdfs://0.0.0.0:9000
      - HADOOP_HOST=0.0.0.0
  kibana-service:
    image: docker.elastic.co/kibana/kibana-oss:6.5.4
    container_name: kibana-service
    environment:
      - SERVER_NAME=kibana-service
      - ELASTICSEARCH_URL=http://esproxy-service:9200
    ports:
      - 5601:5601
    networks:
      - devnet
    depends_on:
      - esproxy-service
networks:
  devnet:
volumes:
  psqldata:

Thanks!

Workspace access requires that the user have the workspace policy attached in user.yaml, under the users section. Try:

policies:
    - MyFirstProject_submitter
    - workspace

And make sure that your nginx.conf has the workspace section from the default compose-services setup.

I tried this and am still getting the '401 Authorization Required' error. The nginx.conf file was already set up that way, so I didn't need to change it. Here is the user.yaml file after adding the workspace policy as you suggested:

authz:
  # policies automatically given to anyone, even if they are not authenticated
  anonymous_policies:
  - open_data_reader

  # policies automatically given to authenticated users (in addition to their other policies)
  all_users_policies: []

  groups:
  # can CRUD programs and projects and upload data files
  - name: data_submitters
    policies:
    - services.sheepdog-admin
    - data_upload
    - MyFirstProject_submitter
    users:
    - jmartin77777@gmail.com

  # can create/update/delete indexd records
  - name: indexd_admins
    policies:
    - indexd_admin
    users:
    - jmartin77777@gmail.com

  resources:
  - name: workspace
  - name: data_file
  - name: services
    subresources:
    - name: sheepdog
      subresources:
      - name: submission
        subresources:
        - name: program
        - name: project
  - name: open
  - name: programs
    subresources:
    - name: MyFirstProgram
      subresources:
      - name: projects
        subresources:
        - name: MyFirstProject

  policies:
  - id: workspace
    description: be able to use workspace
    resource_paths:
    - /workspace
    role_ids:
    - workspace_user
  - id: data_upload
    description: upload raw data files to S3
    role_ids:
    - file_uploader
    resource_paths:
    - /data_file
  - id: services.sheepdog-admin
    description: CRUD access to programs and projects
    role_ids:
      - sheepdog_admin
    resource_paths:
      - /services/sheepdog/submission/program
      - /services/sheepdog/submission/project
  - id: indexd_admin
    description: full access to indexd API
    role_ids:
      - indexd_admin
    resource_paths:
      - /programs
  - id: open_data_reader
    role_ids:
      - reader
      - storage_reader
    resource_paths:
    - /open
  - id: all_programs_reader
    role_ids:
    - reader
    - storage_reader
    resource_paths:
    - /programs
  - id: MyFirstProject_submitter
    role_ids:
    - reader
    - creator
    - updater
    - deleter
    - storage_reader
    - storage_writer
    resource_paths:
    - /programs/MyFirstProgram/projects/MyFirstProject

  roles:
  - id: file_uploader
    permissions:
    - id: file_upload
      action:
        service: fence
        method: file_upload
  - id: workspace_user
    permissions:
    - id: workspace_access
      action:
        service: jupyterhub
        method: access
  - id: sheepdog_admin
    description: CRUD access to programs and projects
    permissions:
    - id: sheepdog_admin_action
      action:
        service: sheepdog
        method: '*'
  - id: indexd_admin
    description: full access to indexd API
    permissions:
    - id: indexd_admin
      action:
        service: indexd
        method: '*'
  - id: admin
    permissions:
      - id: admin
        action:
          service: '*'
          method: '*'
  - id: creator
    permissions:
      - id: creator
        action:
          service: '*'
          method: create
  - id: reader
    permissions:
      - id: reader
        action:
          service: '*'
          method: read
  - id: updater
    permissions:
      - id: updater
        action:
          service: '*'
          method: update
  - id: deleter
    permissions:
      - id: deleter
        action:
          service: '*'
          method: delete
  - id: storage_writer
    permissions:
      - id: storage_creator
        action:
          service: '*'
          method: write-storage
  - id: storage_reader
    permissions:
      - id: storage_reader
        action:
          service: '*'
          method: read-storage

clients:
  wts:
    policies:
    - all_programs_reader
    - open_data_reader

users:
  jmartin77777@gmail.com:
    tags:
      name: John Martin
      email: jmartin77777@gmail.com
    policies:
    - MyFirstProject_submitter
    - workspace
  username2:
    tags:
      name: John Doe
      email: johndoe@gmail.com
    policies:
    - MyFirstProject_submitter

cloud_providers: {}
groups: {}

I am not fluent in YAML, so I am not 100% sure whether you meant to add the workspace policy directly under jmartin77777@gmail.com in the users: block, or to append it at the end, which looked to me like it would assign it to the 'username2:' entry.

Another issue I am working on is that I haven't yet set up hosting for this local docker-compose installation. My AWS account is through my university, and they did not give me permission to create an IAM user that I could use to build a secure S3 bucket. I have a request in for that, but as things stand there is no S3 bucket behind this instance. Could that be the reason the workspace is unable to start? Does it need to host workspaces inside a mounted S3 bucket, or can it use space internal to the container?

Hi @jmartin

no worries. One question:

Did you run the user sync command after updating user.yaml? https://github.com/uc-cdis/compose-services/blob/bbf3325d9a28429ff4bf64d84332132a953277d1/docs/cheat_sheet.md

The workspaces in the docker-compose setup don't require anything from AWS :slight_smile:
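For reference, the cheat sheet's user sync can be run against the live containers. A sketch, assuming the default compose-services container names and mount paths (adjust if yours differ):

```shell
# Re-sync user.yaml into fence/arborist without restarting the stack.
# The container name (fence-service), the arborist URL, and the
# user.yaml location inside the container are the compose-services
# defaults from the docker-compose.yml above.
docker exec fence-service fence-create sync \
    --arborist http://arborist-service \
    --yaml user.yaml
```

After the sync completes, log out of the portal and back in so a fresh token with the new policies is issued.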

Ah, because I was making some other changes in the gitops.json file, and I am still not sure how to sync a JSON file into the running container, I just stopped everything, cleaned up, and re-ran docker-compose up from scratch (after making the changes discussed). I think I am doing a lot of things inefficiently at the moment, but as I try to understand the system I find some comfort in sticking with things I know are working for me :slight_smile:

What I am hoping to do eventually is figure out how to mount local directories into my docker-compose instance so I can just change files in place. But one thing at a time...
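Note that the docker-compose.yml above already bind-mounts ./Secrets and ./scripts from the host into the containers, so edits to those files are visible inside the running containers immediately; a service usually only needs a restart to re-read its config, not a full teardown. A sketch (service name taken from the compose file above):

```shell
# gitops.json is bind-mounted into portal-service via the compose
# file, so after editing ./Secrets/gitops.json on the host, just
# restart that one service to pick up the change.
docker-compose restart portal-service
```

This is much faster than docker-compose down followed by docker-compose up, and it leaves the Postgres volume and the rest of the stack untouched.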

Hi @jmartin!

Would you like to join the #Gen3_community Slack channel, where Gen3 enthusiasts share their experience in setting up and configuring Gen3?

Yes, that would be great!

Invite sent! Cheers!

Thank you! I will move my question over there

Can you please add me to the Gen3 Slack channel?

Hi @Shyam,

Welcome to the forum :slight_smile:

We will send you an invite next week; I'll let you know when it's sent :slight_smile:

@Shyam, invite sent :slight_smile: