Attempting to download data, "x509: certificate signed by unknown authority"

I'm following these steps to download TCGA PCAWG data from the PDC: https://docs.icgc.org/pcawg/data/#download-from-pdc

When I run gen3-client download-multiple --profile=icgc --manifest=gen3_manifest_manifest.pdc.1604342264431.sh.json --no-prompt, it looks like I get an error for every single file along the lines of: Error occurred when doing GET req for URL [really long URL here] Details of error: Get "[another long URL]": x509: certificate signed by unknown authority

When I run gen3-client auth --profile=icgc, I get:
You have access to the following project(s) at https://icgc.bionimbus.org:
2020/11/02 18:48:13 phs000178 [read read-storage]
2020/11/02 18:48:13 phs000235 [read read-storage]

I'm running everything from a mac.

How can I fix this?

Thanks!

Hello @sdalin,
welcome to the Forum!

A quick follow-up on the gen3-client. Would you mind letting us know which version you used?
If you used version 2020.11, please download the 2020.10 version and try again. Please let us know if that solved the issue.

Best wishes,
Xenia

Hi @xritter2, thanks for your advice! I wasn't sure how to check the version of gen3-client that I'm currently using, but I tried re-downloading 2020.10, and got the same error.

I just realized that I can try gen3-client auth --profile=icgc. The full output is
2020/11/03 11:38:21 A new version of gen3-client is available! The latest version is 2020.11.0. You are using version 2020.10
2020/11/03 11:38:21 Please download the latest gen3-client release from https://github.com/uc-cdis/cdis-data-client/releases/latest
2020/11/03 11:38:21 You have access to the following project(s) at https://icgc.bionimbus.org:
2020/11/03 11:38:21 phs000178 [read read-storage]
2020/11/03 11:38:21 phs000235 [read read-storage]

So it looks like I am using 2020.10 now.
Any other suggestions?

Thanks for getting back! Do the errors also appear when you download one single file?

Regarding the unknown certificate, take a look here: Compose Services upload issues and https://gen3.org/resources/faq/

Ok, for future people who read this who have no idea what it means to add a self-signed certificate to your trusted certificates, I followed these instructions (for mac): https://support.apple.com/guide/keychain-access/create-self-signed-certificates-kyca8916/mac. (Specifically I checked the option for SSL)

Now I have a new error message. When I run gen3-client download-single --profile=icgc --guid=1083abca-0f6b-4da7-b852-0ccca0932b29 --no-prompt --skip-completed

I get the following:

2020/11/05 10:40:15 A new version of gen3-client is available! The latest version is 2020.11.0. You are using version 2020.10
2020/11/05 10:40:15 Please download the latest gen3-client release from https://github.com/uc-cdis/cdis-data-client/releases/latest
2020/11/05 10:40:18 0 files downloaded. in "original" mode, duplicated files under "./" will be overwritten
2020/11/05 10:40:18 1 files have encountered an error during downloading, detailed error messages are:
2020/11/05 10:40:18 Error occurred when getting download URL for object 1083abca-0f6b-4da7-b852-0ccca0932b29
Details of error: 503 Service Unavailable error has occurred! Please check backend services for more details

Any advice on how to fix this new error?

Hi @sdalin,

Great you found the instructions on how to add the SSL certificate!

We have contacted the developers and will get back to you regarding the new error. In the meantime, sometimes retrying the same command can help.

We would like to point out that you should never publicly post presigned URLs, as they pose a significant security threat to the data on the commons. Just a heads up: we will delete your forum entry to remove the presigned URLs.

Hi @xritter2,

Oops, didn't realize that!

I tried downloading a couple of other files, but got the same x509 error. And now when I try with the same GUID that gave me the 503 error, I'm back to getting the x509 error.

I tried deleting and re-creating my self-signed certificate, but that didn't seem to help.

Any other ideas?

Hi @sdalin,

we published a new gen3-client release yesterday, could you try again with the new gen3-client version? Thank you for getting back to us!

Best wishes

Hi @xritter2,

I tried downloading the most recent release, and got the same x509 error. I tried making a new API key and re-creating my profile, as well as deleting and re-creating my self-signed certificate, but none of those things made a difference.

I'm starting to wonder if I don't have access to the correct files. Is it possible that is the issue? How would I check that? I have access to two projects, but I'm not sure how to check which files are part of which projects.

Thank you for your help!

Hi @sdalin,

thank you for your message.

  1. Can you see which projects you have access to via gen3-client auth --profile=your-profile ?
  2. Can you check on https://www.ncbi.nlm.nih.gov/gap/ under your profile the access to the projects of interest?

Hi @xritter2,

When I run gen3-client auth --profile=my-profile, I see the following:

You have access to the following project(s) at https://icgc.bionimbus.org:
2020/11/05 12:52:06 phs000178 [read read-storage]
2020/11/05 12:52:06 phs000235 [read read-storage] 
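
For anyone who wants to compare this list against the projects shown in dbGaP, here is a minimal Python sketch that pulls the project IDs and permissions out of the text above. It only assumes the line layout pasted in this thread (timestamp, phs ID, permissions in square brackets); other gen3-client versions may print something different, and the function name parse_projects is made up for illustration.

```python
import re

# Matches lines such as "2020/11/05 12:52:06 phs000178 [read read-storage]".
# The layout is copied from the output pasted in this thread (assumption:
# other gen3-client versions may format it differently).
PROJECT_LINE = re.compile(r"(phs\d+)\s+\[([^\]]*)\]")


def parse_projects(auth_output):
    """Return {project_id: [permissions]} found in `gen3-client auth` output."""
    projects = {}
    for line in auth_output.splitlines():
        match = PROJECT_LINE.search(line)
        if match:
            projects[match.group(1)] = match.group(2).split()
    return projects


sample = """\
You have access to the following project(s) at https://icgc.bionimbus.org:
2020/11/05 12:52:06 phs000178 [read read-storage]
2020/11/05 12:52:06 phs000235 [read read-storage]
"""

for project, permissions in parse_projects(sample).items():
    print(project, permissions)
```

This only tells you which projects the commons thinks you can read; it does not say which files belong to which project.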

I'm not totally sure which profile you're referring to, but in dbGaP, under Authorized Access/My Requests, the project I'm interested in is listed (I'm interested in some TCGA data).

When I'm looking for GUIDs, the files all have TCGA in the project ID.

So I'm pretty sure I have the correct access, but not certain because I keep getting these errors.

Hi @sdalin,
thank you! It looks like you have the necessary access. Regarding the x509 and 503 errors, which one is it? Or do both happen?
I am currently waiting for access so that I can reproduce the error myself, and will get back to you once I have it. Apologies, this may take a day or two.

Hi @xritter2,

Thank you for your help, I really appreciate it!

So far I've gotten the 503 error twice, and the x509 error every other time (probably ~20x). I'm very unsure about what was different about the two times that I got the 503 error. In fact, both times I got the 503 error, I ran the exact same command with the exact same GUID a couple of minutes later and got the x509 error instead. I'm totally baffled.
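
The two errors are different kinds of failure: the x509 message is a TLS certificate check failing on the client side, while 503 is an HTTP status sent back by the server. To keep track of which one shows up how often, a rough Python sketch like the following could tally them. The substrings are the ones quoted in this thread; the function names classify_error and tally are hypothetical.

```python
from collections import Counter


def classify_error(message):
    """Label one error message using substrings quoted in this thread."""
    if "x509: certificate signed by unknown authority" in message:
        return "x509"  # client could not verify the server's certificate
    if "503 Service Unavailable" in message:
        return "503"   # server answered, but reported it was unavailable
    return "other"


def tally(messages):
    """Count how often each kind of error appears."""
    return Counter(classify_error(m) for m in messages)


examples = [
    'Details of error: Get "[another long URL]": '
    "x509: certificate signed by unknown authority",
    "Details of error: 503 Service Unavailable error has occurred! "
    "Please check backend services for more details",
]

print(tally(examples))
```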

Hi @sdalin,

this is bizarre, since you added the self-signed certificate to your trusted certificates, you have access to the projects, your credentials are up to date, and both errors still occur with the newest (November) gen3-client release. Our devs mentioned a certificate authority package for Linux, but you are using macOS.
As of now, we cannot reproduce the error on our side. More questions for you:

  1. Do you use the gen3-client in a custom script?
  2. Do you see the files of interest you want to download on the Explorer page?
  3. Are you running the gen3-client on a VPN?
  4. Did you use the gen3-client before and did it work back then?
  5. Could you try a different computer on the same connection?

Hi @xritter2

To answer your questions:

  1. No, I'm just running single commands on the command line as shown by the tutorials.
  2. Yes. I'm interested in all the StSM and StGV data, so I clicked those two boxes and then downloaded the manifest. When I've been trying single files, I've been copying a GUID from the list in the explorer.
  3. I'm running the gen3-client from my own computer, but trying to save into a folder that I was accessing through a VPN. I tried downloading onto my own computer just now, and that didn't work either. The first time I tried I got the 503 error, and the second time I got the x509 error.
  4. I've never used the gen3-client before. I'm trying to find someone around here who has used it before who could just look over what I've done in case I'm missing an obvious step, but haven't found anyone yet.
  5. I'll try with my personal laptop later tonight.

Is it possible that I'm not creating the self-signed certificate correctly? Is there some information I should put into the certificate?

I believe our server is linux-based, but I access it via VPN. Would that help? I had originally tried doing this on the server but ran into issues (I've forgotten the problem by now, but could try again if that sounds more promising.)

Hi @sdalin,

thank you for your reply. At this point we would suggest having the download take place on the same machine and not via VPN. If that doesn't work, try the option you mentioned of running everything on the server (all in Linux). Please let us know if either worked.
Let me get back to you regarding the self-signed certificate.

Addition: Which browser do you use, and what does it say about the connection to icgc and the certificate? (Go to the commons URL, click the padlock icon at the left of the address bar, and open the certificate details.)
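
A similar check can be sketched outside the browser with Python's standard ssl module. This is only a diagnostic sketch under some assumptions: Python may use a different set of trusted certificates than the browser or gen3-client, so its verdict can differ from theirs; the host you pass in should be the one from the failing URL, which may not be the commons host itself; and check_tls and issuer_fields are names made up for this example.

```python
import socket
import ssl


def issuer_fields(cert):
    """Flatten the 'issuer' entry of getpeercert() into a plain dict."""
    return {key: value for rdn in cert.get("issuer", ()) for (key, value) in rdn}


def check_tls(host, port=443, timeout=10):
    """Try a verified TLS handshake.

    Returns (True, issuer dict) if Python trusts the certificate chain,
    or (False, verification message) if it does not.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, issuer_fields(tls.getpeercert())
    except ssl.SSLCertVerificationError as err:
        return False, err.verify_message


# Example call (needs network access), using the commons host from this thread:
#   print(check_tls("icgc.bionimbus.org"))
```

If the issuer turns out to be an organization you do not expect, for example something that belongs to your employer's network, that would point at the machine or network rather than at the commons.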

Hi @xritter2,

Great news, the download mostly worked! I successfully downloaded one file, and then the large majority of the files in my manifest!

The only difference was using my personal laptop instead of my Broad laptop. I didn't even need to use the self-signed certificate. I'm even downloading it directly onto the server through the VPN.

I guess there are some settings on the other laptop that were causing the errors. I'll get in touch with the IT people, because it seems as though a Broad laptop should be able to download this data!

There were 29 files that gave a 503 error: "Details of error: 503 Service Unavailable error has occurred! Please check backend services for more details". I am trying to run the same download-multiple command again, with the --skip-completed flag. Hopefully they'll work the second time.
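
For anyone who would rather not re-run the command by hand, here is a minimal Python sketch of that retry loop. The command and flags are the ones used in this thread. The big assumption is that gen3-client exits with a nonzero code when some files fail, which is not confirmed anywhere in this thread; if it always exits with 0, the loop stops after the first run. The names retry and download_manifest, the number of attempts, and the delay are all arbitrary choices for illustration.

```python
import subprocess
import time


def retry(run_once, attempts=3, delay_seconds=30, sleep=time.sleep):
    """Call run_once() until it returns True or the attempts run out.

    Returns the number of the successful attempt, or None if all failed.
    """
    for attempt in range(1, attempts + 1):
        if run_once():
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)  # give the backend time to recover
    return None


def download_manifest(profile, manifest):
    """Run the download command from this thread once.

    ASSUMPTION: a nonzero exit code means at least one file failed.
    """
    command = [
        "gen3-client", "download-multiple",
        f"--profile={profile}",
        f"--manifest={manifest}",
        "--no-prompt",
        "--skip-completed",  # do not fetch files that already finished
    ]
    return subprocess.run(command).returncode == 0


# Example call (needs gen3-client installed and a configured profile):
#   retry(lambda: download_manifest("icgc", "my_manifest.json"))
```

Because of --skip-completed, each pass should only attempt the files that are still missing.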

Thank you again for all your help!

Hi @sdalin,

this is great news! About the 503 error, it is possible that a temporary lack of resources on the server side under load causes this problem. Thank you for your helpful feedback, we will follow up on this error.
Addition: perhaps on your side the CA for the Cleversafe URL needs to be trusted.

Best wishes

Hi @xritter2,

You're right about the 503 error, when I tried again, the remaining files downloaded.

I have one more question: I would like to download the final_consensus_sv_vcfs_passonly.tcga.controlled.tgz indicated here: https://dcc.icgc.org/releases/PCAWG/consensus_sv. It says that it is in the PDC, but gives only an aws command to download. Is it possible to download with gen3?

Thank you!

Hi @sdalin ,

this is great news.

I checked for the records indicated on the webpage you provided. Since they're not indexed, you will need to use the aws command and you cannot use Gen3.

Best wishes