I have been using Gen3 to download .bam files onto my AWS cloud instances for analysis. Looking at my bill, I noticed that I am incurring some AWS egress fees even though I am only downloading data onto my cloud instances. The egress amounts to ~0.8% of the data I have downloaded/ingressed using Gen3.
Is this data egress due to some acknowledgement/verification protocol that is running in the background of the gen3-client?
Is it possible for me to avoid or reduce this egress fee during data download using the gen3-client into my AWS instances?
Hi, ktan8! Thanks for reaching out. Can you help me better understand what you mean when you say "~0.8% of the data I have downloaded/ingressed using Gen3"? Do you mean that only 0.8% of the data you have downloaded from your Gen3 instance has incurred an egress charge?
Thanks for your reply. I have been downloading a fair amount of data (e.g. ~100 TB) with the Gen3 client onto an AWS instance for analysis.
However, due to the nature of the network protocol (TCP, I presume), my AWS instance has to send requests to the Gen3 data server before the server sends the requested data back. These requests seem to be made so frequently that they add up to ~0.8% of the total data I download. So when I download 100 TB of data from the Gen3 server to my AWS instance, my instance sends out about 800 GB (0.8% * 100 TB) of requests to the Gen3 data server. AWS bills these 800 GB of outbound requests as egress. I was not charged for the 100 TB of ingress into AWS.
The charges for these 800 GB of requests add up quickly and can cost quite a bit over time. I therefore wonder whether there is a way to adjust the network protocol (e.g. the packet size?) so that the gen3-client sends less request traffic to the data server. That way, I would not have to pay as much for egress.
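The arithmetic above can be written out as a small Python sketch. It uses decimal units (1 TB = 1000 GB), which matches the 800 GB figure in the post. The function name and the price per GB are hypothetical placeholders for illustration, not a quoted AWS rate; substitute the rate from your own bill.

```python
def estimate_egress(download_tb, egress_fraction, price_per_gb):
    """Return (egress volume in GB, estimated egress cost).

    download_tb     -- data downloaded (ingress), in decimal TB
    egress_fraction -- outbound traffic as a fraction of the download
    price_per_gb    -- assumed egress price per GB (placeholder value)
    """
    egress_gb = download_tb * 1000 * egress_fraction
    return egress_gb, egress_gb * price_per_gb


# Figures from the post: 100 TB downloaded, ~0.8% sent back out.
# 0.09 is an illustrative price only, not an actual AWS quote.
egress_gb, cost = estimate_egress(100, 0.008, price_per_gb=0.09)
print(f"egress volume: {egress_gb:.0f} GB, estimated cost: {cost:.2f}")
```

Because the cost scales linearly with both the download size and the outbound fraction, halving either one halves the charge.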
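As a rough way to reason about the packet-size question, the sketch below models only TCP acknowledgement traffic. It is a back-of-the-envelope estimate under stated assumptions, not a measurement of what the gen3-client actually sends: it assumes a 1500-byte MTU with TCP timestamps (1448 payload bytes per segment), a 52-byte ACK packet (20-byte IP header, 20-byte TCP header, 12 bytes of options), and one ACK per N received segments. HTTP request headers and TLS traffic would add to the outbound total, and how AWS meters header bytes is also an assumption here.

```python
def ack_overhead_fraction(payload_bytes=1448, ack_bytes=52, segments_per_ack=2):
    """Outbound ACK bytes as a fraction of inbound payload bytes.

    All defaults are assumptions: 1448-byte segment payload,
    52-byte ACK packet, and one (delayed) ACK per two segments.
    """
    return ack_bytes / (payload_bytes * segments_per_ack)


# Compare a few ACK frequencies; fewer ACKs per segment means less egress.
for n in (1, 2, 4, 8):
    frac = ack_overhead_fraction(segments_per_ack=n)
    print(f"1 ACK per {n} segment(s): {frac:.2%} of downloaded bytes")
```

In this model the outbound fraction falls as segments get larger or as ACKs get less frequent, which is why the observed ~0.8% is at least the right order of magnitude for ordinary TCP behavior rather than something specific to Gen3.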
Thank you so much for your help and for checking with your team!
Hi, ktan8, thanks for your patience! I have not been able to determine whether there is a way to modify the network protocol to reduce your charges. However, I was able to collect a couple other ideas for reducing your egress charges:
Leverage discounts from Internet2: If your institution is a member of Internet2, you should reach out to Internet2 or find an AWS reseller who will offer you Internet2 terms, which include a decent egress discount.
If I hear of other strategies, I will add to this conversation.
You may also want to ask the Gen3 community which strategies they have found successful and get a crowd-sourced answer. To do that, I invite you to join our community Slack channel, which is filled with other Gen3 users with plenty of varied experience. You can request access to join through the Google form linked there.