Local installation - Workspace says 'Authorization required'

I'm looking into setting up a data commons to host data for a collaborative project my lab's PI is involved in. Following the docker-compose instructions, I have a local installation running on my laptop, but the Workspace tab takes me to a page that reports '401 Authorization Required'.

I can see that the Jupyter service is running with healthy status, and everything else seems to be working as I expected for this fresh instance. Running docker logs jupyter-service returns:

++++
Container must be run with group "root" to update passwd file
Executing the command: jupyter notebook
[W 19:45:48.031 NotebookApp] base_project_url is deprecated, use base_url
[I 19:45:48.070 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
[W 19:45:49.897 NotebookApp] All authentication is disabled. Anyone who can connect to this server will be able to run code.
[I 19:45:50.227 NotebookApp] JupyterLab extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 19:45:50.227 NotebookApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 19:45:50.295 NotebookApp] Serving notebooks from local directory: /home/jovyan
[I 19:45:50.296 NotebookApp] The Jupyter Notebook is running at:
[I 19:45:50.296 NotebookApp] http://(9b10e38926d2 or 127.0.0.1):8888/lw-workspace/
[I 19:45:50.296 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
++++

Can anyone help me figure out what I've done wrong?

Thanks,
John Martin

Hi @jmartin!
Thank you for your question. I asked the Gen3 devs about this and will get back to you soon.

Could you copy-paste your user.yaml and docker-compose.yml files? Many thanks!

user.yaml

authz:
  # policies automatically given to anyone, even if they are not authenticated
  anonymous_policies:
  - open_data_reader

  # policies automatically given to authenticated users (in addition to their other policies)
  all_users_policies: []

  groups:
  # can CRUD programs and projects and upload data files
  - name: data_submitters
    policies:
    - services.sheepdog-admin
    - data_upload
    - MyFirstProject_submitter
    users:
    - jmartin77777@gmail.com

  # can create/update/delete indexd records
  - name: indexd_admins
    policies:
    - indexd_admin
    users:
    - jmartin77777@gmail.com

  resources:
  - name: workspace
  - name: data_file
  - name: services
    subresources:
    - name: sheepdog
      subresources:
      - name: submission
        subresources:
        - name: program
        - name: project
  - name: open
  - name: programs
    subresources:
    - name: TBRU
      subresources:
      - name: projects
        subresources:
        - name: Aim1

  policies:
  - id: workspace
    description: be able to use workspace
    resource_paths:
    - /workspace
    role_ids:
    - workspace_user
  - id: data_upload
    description: upload raw data files to S3
    role_ids:
    - file_uploader
    resource_paths:
    - /data_file
  - id: services.sheepdog-admin
    description: CRUD access to programs and projects
    role_ids:
      - sheepdog_admin
    resource_paths:
      - /services/sheepdog/submission/program
      - /services/sheepdog/submission/project
  - id: indexd_admin
    description: full access to indexd API
    role_ids:
      - indexd_admin
    resource_paths:
      - /programs
  - id: open_data_reader
    role_ids:
      - reader
      - storage_reader
    resource_paths:
    - /open
  - id: all_programs_reader
    role_ids:
    - reader
    - storage_reader
    resource_paths:
    - /programs
  - id: MyFirstProject_submitter
    role_ids:
    - reader
    - creator
    - updater
    - deleter
    - storage_reader
    - storage_writer
    resource_paths:
    - /programs/TBRU/projects/Aim1

  roles:
  - id: file_uploader
    permissions:
    - id: file_upload
      action:
        service: fence
        method: file_upload
  - id: workspace_user
    permissions:
    - id: workspace_access
      action:
        service: jupyterhub
        method: access
  - id: sheepdog_admin
    description: CRUD access to programs and projects
    permissions:
    - id: sheepdog_admin_action
      action:
        service: sheepdog
        method: '*'
  - id: indexd_admin
    description: full access to indexd API
    permissions:
    - id: indexd_admin
      action:
        service: indexd
        method: '*'
  - id: admin
    permissions:
      - id: admin
        action:
          service: '*'
          method: '*'
  - id: creator
    permissions:
      - id: creator
        action:
          service: '*'
          method: create
  - id: reader
    permissions:
      - id: reader
        action:
          service: '*'
          method: read
  - id: updater
    permissions:
      - id: updater
        action:
          service: '*'
          method: update
  - id: deleter
    permissions:
      - id: deleter
        action:
          service: '*'
          method: delete
  - id: storage_writer
    permissions:
      - id: storage_creator
        action:
          service: '*'
          method: write-storage
  - id: storage_reader
    permissions:
      - id: storage_reader
        action:
          service: '*'
          method: read-storage

clients:
  wts:
    policies:
    - all_programs_reader
    - open_data_reader

users:
  jmartin77777@gmail.com: {}
  username2:
    tags:
      name: John Doe
      email: johndoe@gmail.com
    policies:
    - MyFirstProject_submitter

cloud_providers: {}
groups: {}

docker-compose.yml

version: '3'
services:
  postgres:
    image: postgres:9.6
    networks:
      - devnet
    volumes:
      - "psqldata:/var/lib/postgresql/data"
      - "./scripts/postgres_init.sql:/docker-entrypoint-initdb.d/postgres_init.sql"
    restart: unless-stopped
    healthcheck:
        test: ["CMD-SHELL", "psql -U fence_user -d fence_db -c 'SELECT 1;'"]
        interval: 60s
        timeout: 5s
        retries: 3
    environment:
      - POSTGRES_PASSWORD=postgres
    #
    # uncomment this to make postgres available from the container host - ex:
    #    psql -h localhost -d fence -U fence_user
    ports:
      - 5432:5432
  indexd-service:
    image: "quay.io/cdis/indexd:2020.07"
    command: bash indexd_setup.sh
    container_name: indexd-service
    networks:
      - devnet
    volumes:
      - ./Secrets/indexd_settings.py:/var/www/indexd/local_settings.py
      - ./Secrets/indexd_creds.json:/var/www/indexd/creds.json
      - ./Secrets/config_helper.py:/var/www/indexd/config_helper.py
      - ./scripts/indexd_setup.sh:/var/www/indexd/indexd_setup.sh
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 3
    depends_on:
      - postgres
  fence-service:
    image: "quay.io/cdis/fence:2020.07"
    command: bash /var/www/fence/fence_setup.sh
    container_name: fence-service
    networks:
      - devnet
    volumes:
      - ./Secrets/fence-config.yaml:/var/www/fence/fence-config.yaml
      - ./Secrets/user.yaml:/var/www/fence/user.yaml
      - ./Secrets/TLS/service.crt:/usr/local/share/ca-certificates/cdis-ca.crt
      - ./Secrets/fenceJwtKeys:/fence/keys
      - ./scripts/fence_setup.sh:/var/www/fence/fence_setup.sh
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 3
    environment:
      - PYTHONPATH=/var/www/fence
    depends_on:
      - postgres
  arborist-service:
    image: "quay.io/cdis/arborist:2020.07"
    container_name: arborist-service
    entrypoint: bash /go/src/github.com/uc-cdis/arborist/arborist_setup.sh
    networks:
      - devnet
    volumes:
      - ./scripts/arborist_setup.sh:/go/src/github.com/uc-cdis/arborist/arborist_setup.sh
    environment:
      - JWKS_ENDPOINT=http://fence-service/.well-known/jwks
      - PGDATABASE=arborist_db
      - PGUSER=arborist_user
      - PGPASSWORD=arborist_pass
      - PGHOST=postgres
      - PGPORT=5432
      - PGSSLMODE=disable
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/health"]
      interval: 60s
      timeout: 5s
      retries: 10
    depends_on:
      - postgres
  peregrine-service:
    image: "quay.io/cdis/peregrine:2020.07"
    container_name: peregrine-service
    networks:
      - devnet
    volumes:
      - ./Secrets/peregrine_settings.py:/var/www/peregrine/wsgi.py
      - ./Secrets/peregrine_creds.json:/var/www/peregrine/creds.json
      - ./Secrets/config_helper.py:/var/www/peregrine/config_helper.py
      - ./Secrets/TLS/service.crt:/usr/local/share/ca-certificates/cdis-ca.crt
      - ./scripts/peregrine_setup.sh:/peregrine_setup.sh
      - ./datadictionary/gdcdictionary/schemas:/schemas_dir
    environment: &env
      DICTIONARY_URL: https://s3.amazonaws.com/dictionary-artifacts/datadictionary/develop/schema.json
      # PATH_TO_SCHEMA_DIR: /schemas_dir
      REQUESTS_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      # give peregrine some extra time to startup
      retries: 10
    depends_on:
      - postgres
      - sheepdog-service
  sheepdog-service:
    image: "quay.io/cdis/sheepdog:2020.07"
    command: bash /sheepdog_setup.sh
    container_name: sheepdog-service
    networks:
      - devnet
    volumes:
      - ./Secrets/sheepdog_settings.py:/var/www/sheepdog/wsgi.py
      - ./Secrets/sheepdog_creds.json:/var/www/sheepdog/creds.json
      - ./Secrets/config_helper.py:/var/www/sheepdog/config_helper.py
      - ./scripts/sheepdog_setup.sh:/sheepdog_setup.sh
      - ./datadictionary/gdcdictionary/schemas:/schemas_dir
    environment: *env
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 5
    depends_on:
      - postgres
  guppy-service:
    image: "quay.io/cdis/guppy:2020.07"
    container_name: guppy-service
    networks:
      - devnet
    volumes:
      - ./Secrets/guppy_config.json:/guppy/guppy_config.json
    environment:
      - GUPPY_CONFIG_FILEPATH=/guppy/guppy_config.json
      - GEN3_ARBORIST_ENDPOINT=http://arborist-service
      - GEN3_ES_ENDPOINT=http://esproxy-service:9200
    depends_on:
      - arborist-service
      - esproxy-service
  esproxy-service:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.5.4
    container_name: esproxy-service
    environment:
      - cluster.name=elasticsearch-cluster
      - bootstrap.memory_lock=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      - devnet
  pidgin-service:
    image: "quay.io/cdis/pidgin:2020.07"
    container_name: pidgin-service
    networks:
      - devnet
    volumes:
      - ./scripts/waitForContainers.sh:/var/www/data-portal/waitForContainers.sh
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost/_status"]
      interval: 60s
      timeout: 5s
      retries: 3
    depends_on:
      - peregrine-service
  portal-service:
    image: "quay.io/cdis/data-portal:2020.07"
    container_name: portal-service
    command: ["bash", "/var/www/data-portal/waitForContainers.sh"]
    networks:
      - devnet
    volumes:
      - ./scripts/waitForContainers.sh:/var/www/data-portal/waitForContainers.sh
      - ./Secrets/gitops.json:/data-portal/data/config/gitops.json
      - ./Secrets/gitops-logo.png:/data-portal/custom/logo/gitops-logo.png
      - ./Secrets/gitops.png:/data-portal/custom/createdby/gitops.png
    environment:
      - NODE_ENV=dev
      #- MOCK_STORE=true
      - APP=gitops
      - GDC_SUBPATH=http://revproxy-service/api/v0/submission/

    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost"]
      interval: 60s
      timeout: 5s
      retries: 10
    depends_on:
      - postgres
      - peregrine-service
      - sheepdog-service
  jupyter-service:
    image: "quay.io/occ_data/jupyternotebook:1.7.2"
    #image: jupyter/minimal-notebook
    container_name: jupyter-service
    networks:
      - devnet
    volumes:
      - ./scripts/jupyter_config.py:/home/jovyan/.jupyter/jupyter_notebook_config.py
  revproxy-service:
    image: "quay.io/cdis/nginx:1.15.5-ctds"
    container_name: revproxy-service
    networks:
      - devnet
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./Secrets/TLS/service.crt:/etc/nginx/ssl/nginx.crt
      - ./Secrets/TLS/service.key:/etc/nginx/ssl/nginx.key
    ports:
      - "80:80"
      - "443:443"
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost"]
      interval: 60s
      timeout: 5s
      retries: 3
    depends_on:
      - indexd-service
      - peregrine-service
      - sheepdog-service
      - fence-service
      - portal-service
      - pidgin-service
  tube-service:
    image: "quay.io/cdis/tube:2020.07"
    container_name: tube-service
    command: bash -c "while true; do sleep 5; done"
    networks:
      - devnet
    environment:
      - DICTIONARY_URL=https://s3.amazonaws.com/dictionary-artifacts/datadictionary/develop/schema.json
      - ES_URL=esproxy-service
      - ES_INDEX_NAME=etl
      - HADOOP_URL=hdfs://spark-service:9000
      - HADOOP_HOST=spark-service
    volumes:
      - ./Secrets/etl_creds.json:/usr/share/gen3/tube/creds.json
      - ./Secrets/etlMapping.yaml:/usr/share/gen3/tube/etlMapping.yaml
      - ./Secrets/user.yaml:/usr/share/gen3/tube/user.yaml
    depends_on:
      - postgres
      - esproxy-service
      - spark-service
  spark-service:
    image: "quay.io/cdis/gen3-spark:2020.07"
    container_name: spark-service
    command: bash -c "python run_config.py && hdfs namenode -format && hdfs --daemon start namenode && hdfs --daemon start datanode && yarn --daemon start resourcemanager && yarn --daemon start nodemanager && hdfs dfsadmin -safemode leave &&  hdfs dfs -mkdir /result && while true; do sleep 5; done"
    expose:
      - 22
      - 8030
      - 8031
      - 8032
      - 9000
    networks:
      - devnet
    environment:
      - HADOOP_URL=hdfs://0.0.0.0:9000
      - HADOOP_HOST=0.0.0.0
  kibana-service:
    image: docker.elastic.co/kibana/kibana-oss:6.5.4
    container_name: kibana-service
    environment:
      - SERVER_NAME=kibana-service
      - ELASTICSEARCH_URL=http://esproxy-service:9200
    ports:
      - 5601:5601
    networks:
      - devnet
    depends_on:
      - esproxy-service
networks:
  devnet:
volumes:
  psqldata:

Thanks!

Workspace access requires that the user have the workspace policy attached in user.yaml, under the users section. Try:

policies:
    - MyFirstProject_submitter
    - workspace

And make sure that your nginx.conf has the workspace section from the default compose-services setup.

I tried this and am still getting the '401 Authorization Required' error. The nginx.conf file was already set up that way, so I didn't need to change it. Here is the user.yaml file after adding the workspace policy as you suggested:

authz:
  # policies automatically given to anyone, even if they are not authenticated
  anonymous_policies:
  - open_data_reader

  # policies automatically given to authenticated users (in addition to their other policies)
  all_users_policies: []

  groups:
  # can CRUD programs and projects and upload data files
  - name: data_submitters
    policies:
    - services.sheepdog-admin
    - data_upload
    - MyFirstProject_submitter
    users:
    - jmartin77777@gmail.com

  # can create/update/delete indexd records
  - name: indexd_admins
    policies:
    - indexd_admin
    users:
    - jmartin77777@gmail.com

  resources:
  - name: workspace
  - name: data_file
  - name: services
    subresources:
    - name: sheepdog
      subresources:
      - name: submission
        subresources:
        - name: program
        - name: project
  - name: open
  - name: programs
    subresources:
    - name: MyFirstProgram
      subresources:
      - name: projects
        subresources:
        - name: MyFirstProject

  policies:
  - id: workspace
    description: be able to use workspace
    resource_paths:
    - /workspace
    role_ids:
    - workspace_user
  - id: data_upload
    description: upload raw data files to S3
    role_ids:
    - file_uploader
    resource_paths:
    - /data_file
  - id: services.sheepdog-admin
    description: CRUD access to programs and projects
    role_ids:
      - sheepdog_admin
    resource_paths:
      - /services/sheepdog/submission/program
      - /services/sheepdog/submission/project
  - id: indexd_admin
    description: full access to indexd API
    role_ids:
      - indexd_admin
    resource_paths:
      - /programs
  - id: open_data_reader
    role_ids:
      - reader
      - storage_reader
    resource_paths:
    - /open
  - id: all_programs_reader
    role_ids:
    - reader
    - storage_reader
    resource_paths:
    - /programs
  - id: MyFirstProject_submitter
    role_ids:
    - reader
    - creator
    - updater
    - deleter
    - storage_reader
    - storage_writer
    resource_paths:
    - /programs/MyFirstProgram/projects/MyFirstProject

  roles:
  - id: file_uploader
    permissions:
    - id: file_upload
      action:
        service: fence
        method: file_upload
  - id: workspace_user
    permissions:
    - id: workspace_access
      action:
        service: jupyterhub
        method: access
  - id: sheepdog_admin
    description: CRUD access to programs and projects
    permissions:
    - id: sheepdog_admin_action
      action:
        service: sheepdog
        method: '*'
  - id: indexd_admin
    description: full access to indexd API
    permissions:
    - id: indexd_admin
      action:
        service: indexd
        method: '*'
  - id: admin
    permissions:
      - id: admin
        action:
          service: '*'
          method: '*'
  - id: creator
    permissions:
      - id: creator
        action:
          service: '*'
          method: create
  - id: reader
    permissions:
      - id: reader
        action:
          service: '*'
          method: read
  - id: updater
    permissions:
      - id: updater
        action:
          service: '*'
          method: update
  - id: deleter
    permissions:
      - id: deleter
        action:
          service: '*'
          method: delete
  - id: storage_writer
    permissions:
      - id: storage_creator
        action:
          service: '*'
          method: write-storage
  - id: storage_reader
    permissions:
      - id: storage_reader
        action:
          service: '*'
          method: read-storage

clients:
  wts:
    policies:
    - all_programs_reader
    - open_data_reader

users:
  jmartin77777@gmail.com:
    tags:
      name: John Martin
      email: jmartin77777@gmail.com
    policies:
    - MyFirstProject_submitter
    - workspace
  username2:
    tags:
      name: John Doe
      email: johndoe@gmail.com
    policies:
    - MyFirstProject_submitter

cloud_providers: {}
groups: {}

I am not fluent in YAML, so I am not 100% sure whether you meant to add the workspace policy directly under jmartin77777@gmail.com in the users: block, or to append it at the end, which looked to me like it would assign it to the 'username2:' entry.

Another issue I am working on is that I haven't yet set up hosting for this local docker-compose installation. My AWS account is through my university, and they did not give me permission to create an IAM user that I could use to build a secure S3 bucket. I have a request in for that, but as things stand there is no S3 bucket behind this instance. Could that be the reason the workspace is unable to start? Does it need to host workspaces inside a mounted S3 bucket, or can it use space internal to the container?

Hi @jmartin

no worries. One question:

Did you run the user sync command after updating user.yaml? https://github.com/uc-cdis/compose-services/blob/bbf3325d9a28429ff4bf64d84332132a953277d1/docs/cheat_sheet.md

The workspaces in the docker-compose setup don't require anything from AWS :slight_smile:
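For reference, the cheat sheet's user sync can be run against the live containers. A sketch, assuming the default compose-services container names and mount paths (adjust if yours differ):

```shell
# Re-sync user.yaml into fence/arborist without restarting the stack.
# The container name (fence-service), the arborist URL, and the
# user.yaml location inside the container are the compose-services
# defaults from the docker-compose.yml above.
docker exec fence-service fence-create sync \
    --arborist http://arborist-service \
    --yaml user.yaml
```

After the sync completes, log out of the portal and back in so a fresh token with the new policies is issued.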

Ah, because I was making some other changes in the gitops.json file, and I am still not sure how to sync a JSON file into the running container, I just stopped everything, cleaned up, and re-ran docker-compose up from scratch (after making the changes discussed). I think I am doing a lot of things inefficiently at the moment, but as I try to understand the system I find some comfort in sticking with things I know are working for me :slight_smile:

What I am hoping to do eventually is figure out how to mount local directories into my docker-compose instance so I can just change files in place. But one thing at a time...
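Note that the docker-compose.yml above already bind-mounts ./Secrets and ./scripts from the host into the containers, so edits to those files are visible inside the running containers immediately; a service usually only needs a restart to re-read its config, not a full teardown. A sketch (service name taken from the compose file above):

```shell
# gitops.json is bind-mounted into portal-service via the compose
# file, so after editing ./Secrets/gitops.json on the host, just
# restart that one service to pick up the change.
docker-compose restart portal-service
```

This is much faster than docker-compose down followed by docker-compose up, and it leaves the Postgres volume and the rest of the stack untouched.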

Hi @jmartin!

Would you like to join the #Gen3_community Slack channel, where Gen3 enthusiasts share their experience in setting up and configuring Gen3?

Yes, that would be great!

Invite sent! Cheers!

Thank you! I will move my question over there

Can you please add me to the Gen3 Slack channel?

Hi @Shyam,

Welcome to the forum :slight_smile:

We will send you an invite next week; I'll let you know when it's sent :slight_smile:

@Shyam, invite sent :slight_smile: