EDIT: The forum software would not let me post this with all the links, so the original (the full text, with the links needed for this to make sense) is here:
I am working on a small demo project built around a gen3 server running from compose-services. The plan is to pull data stored in a VCF from gen3, do some analysis, and upload the output as a data type that I define in a new dictionary. If all goes well, we would go after funding for a more permanent installation later on.
First, here is the current state of the world around the data dictionaries (DD), as far as I can tell:
- compose-services is configured by URL to point to a DD stored in s3. This does not contain the VCF definition. Is this meant to be kept up to date? It has some SHAs in the '_settings', but I don't know whether those identify a git commit, or in which repo.
- compose-services can optionally use a DD stored as files in the same directory, and there is one checked into the repo. This version also does not contain VCF and is at least 14 months old.
-  shows a DD with a VCF, but is there a URL like  that I can configure my instance to pull from? Where are the source yaml files for this version?
- The uc-cdis/datadictionary repo has no VCF.yaml, but does have some related types like 'submitted_somatic_mutation'. I am surprised that what I see at gen3.datacommons.io/dd does not match these files, but I don't know if they are actually supposed to match.
- The nci-gdc/gdcdictionary  has even more mutation-related types, and they have data_format=VCF, but nothing actually named 'VCF' like what I see in the dictionary browser in .
- I can also see that the uc-cdis and nci-gdc GitHub repos have diverged greatly (the fork is "111 commits ahead, 719 commits behind NCI-GDC:develop").
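To make the comparison between these dictionaries concrete, here is a minimal Python sketch of how I would inspect a bundled dictionary JSON once it has been downloaded. It is only a sketch under assumptions I have not confirmed: that the bundle is a single JSON object whose top-level keys are named after the yaml files (for example 'submitted_somatic_mutation.yaml'), that underscore-prefixed keys such as '_settings.yaml' hold metadata rather than node types, and that a node lists its allowed formats under properties.data_format.enum. The function name summarize_dictionary is my own, not part of gen3.

```python
def summarize_dictionary(schema, data_format="VCF"):
    """Summarize a bundled dictionary that was already parsed from JSON.

    Assumed layout (not confirmed against gen3): top-level keys are yaml
    file names, and keys starting with '_' are metadata, not node types.
    """
    # The settings entry is where the SHAs mentioned above would live.
    settings = schema.get("_settings.yaml", {})

    nodes = {}
    for key, value in schema.items():
        if key.startswith("_") or not isinstance(value, dict):
            continue  # skip _definitions, _settings, _terms and similar
        name = key[:-5] if key.endswith(".yaml") else key
        nodes[name] = value

    # Collect nodes whose data_format enum mentions the requested format.
    matching = []
    for name, node in nodes.items():
        props = node.get("properties", {})
        fmt = props.get("data_format", {}) if isinstance(props, dict) else {}
        allowed = fmt.get("enum", []) if isinstance(fmt, dict) else []
        if data_format in allowed:
            matching.append(name)

    return {
        "settings": settings,
        "nodes": sorted(nodes),
        "nodes_with_format": sorted(matching),
    }
```

Running this on each candidate (the s3 bundle, the files in the repo, and a build of each GitHub repo) after loading it with json.load would show which node types each one defines and which of them accept VCF, so the versions can be compared directly.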
At this point I'd be interested to hear whether there is a simple story about what is important and what isn't in the above list.
Otherwise, my real question is: what's the most up-to-date set of schemas that you would recommend to start with if I need to represent the simplest possible mutations that will originally be in a VCF file?
Thank you for any assistance.
(Links  -  available in the gist at the top of post)