li reference genome: In #[a(target="_blank" href="https://docs.google.com/document/d/154GBOixuZxpoPykGKcPOyrYUcgEXVe2NvKx61P4Ybn4/edit") Beacon 0.2 specifications], the value expected for human data follows the GRCh? format, whereas arrayMap uses the hg?? nomenclature.
li allele: In #[a(target="_blank" href="https://docs.google.com/document/d/154GBOixuZxpoPykGKcPOyrYUcgEXVe2NvKx61P4Ybn4/edit") Beacon 0.2 specifications], this parameter is required. However, it is not meaningful for arrayMap, which contains no sequence-level data.
li variantClass: This is a new parameter. The value DUP corresponds to SEGTYPE=1 (gain) in arrayMap, and DEL corresponds to SEGTYPE=-1 (loss). The parameter is mandatory.
li sampleuid: This is DEBUG info.
li matchedSegment: This is DEBUG info.
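p The two translations described above (hg?? to GRCh? names, and SEGTYPE to variantClass) can be sketched as simple lookups. This is only an illustration: the table and function names are hypothetical and not taken from the Beacon-arrayMap code, and only the hg19 and hg38 assemblies are listed because older UCSC builds have no GRCh name.

```python
# Hypothetical lookup tables; the names are illustrative, not part of arrayMap.
SEGTYPE_TO_VARIANT_CLASS = {1: "DUP", -1: "DEL"}

# UCSC assembly names and their GRC equivalents (assumption: only these two
# human assemblies are needed; earlier builds predate the GRCh naming).
UCSC_TO_GRC = {"hg19": "GRCh37", "hg38": "GRCh38"}


def variant_class_for(segtype: int) -> str:
    """Translate an arrayMap SEGTYPE (1 = gain, -1 = loss) to a variantClass."""
    try:
        return SEGTYPE_TO_VARIANT_CLASS[segtype]
    except KeyError:
        raise ValueError(f"unsupported SEGTYPE: {segtype!r}") from None


def grc_name(ucsc_name: str) -> str:
    """Translate an hg?? assembly name to its GRCh? equivalent."""
    try:
        return UCSC_TO_GRC[ucsc_name.lower()]
    except KeyError:
        raise ValueError(f"no GRCh equivalent known for {ucsc_name!r}") from None
```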
br
h5 Example
p Query chromosome 11 at position 34439881 for the dataset 8070/3, showing a deletion: #[a(target="_blank" href="/v0.2/query?chromosome=11&position=34439881&dataset=8070/3&variantClass=DEL") query?chromosome=11&position=34439881&dataset=8070/3&variantClass=DEL]
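p As a minimal sketch, the query path above could be assembled programmatically as follows. The function name is hypothetical; only the path and the four parameter names come from the example, and the slash in the dataset identifier is deliberately left unescaped to match it.

```python
from urllib.parse import urlencode


def build_query_path(chromosome, position, dataset, variant_class):
    """Build the relative /v0.2/query path for a Beacon-arrayMap request.

    Hypothetical helper: it only mirrors the example URL shown above.
    """
    params = {
        "chromosome": chromosome,
        "position": position,
        "dataset": dataset,
        "variantClass": variant_class,
    }
    # safe="/" keeps the slash of dataset identifiers such as 8070/3 readable.
    return "/v0.2/query?" + urlencode(params, safe="/")


print(build_query_path("11", 34439881, "8070/3", "DEL"))
```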
br
h5 Outcome from the meeting with Jordi (Feb 10, 2016):
p It should be OK to have variantClass in Beacon v0.3. Ranges are more complicated and require further discussion. We will probably proceed in two steps:
ul
li add P (duPlication) to the list of allowed values for the allele parameter.
li modify I (insertion) slightly by making the "sequence" string optional, i.e. be fuzzy about what was inserted.
p and then
ul
li proper variantClass
li ranges (defined as start/stop position, e.g. [200000,30000] #[span(style="font-weight: bold;") or] start and length, e.g. 20000:100000)
th(style='width: 25%') Required in Beacon-arrayMap
th(style='width: 25%') Comment
tr
td exists
td yes
td yes
td value=overlap
tr
td alleles
td no
td no
td
tr
td observed
td no
td no
td
tr
td info
td no
td yes
td value=ok
tr
td error
td no
td yes
td value=null
h4 Forthcoming implementation: Open questions
br
h5 Imprecise structural variants:
div
p Exact positions are not suitable for arrayMap: queries should be able to request ranges rather than single positions. However, range queries raise their own issue, because the segment start and end positions in arrayMap are themselves imprecise.
div(class="alert alert-info") VCF defines symbolic alternate alleles for imprecise structural variants, which are declared in the meta-information lines.<br>See #[a(target="_blank" href="https://samtools.github.io/hts-specs/VCFv4.2.pdf") the VCF 4.2 specifications] for more details.
li In #[a(target="_blank" href="https://docs.google.com/document/d/154GBOixuZxpoPykGKcPOyrYUcgEXVe2NvKx61P4Ybn4/edit") Beacon 0.2 specifications], 'variants' is a mandatory property of #[span(style="font-weight: bold;") DataSizeResource] and its value was arbitrarily set to -1. Now, we have to decide whether we should:
ol
li stick to -1.
li use the property 'variants' to count the number of segments (for instance).
li ask the Beacon team whether this property can be made optional.
li ask the Beacon team if we can change the property name.
li In #[a(target="_blank" href="https://docs.google.com/document/d/154GBOixuZxpoPykGKcPOyrYUcgEXVe2NvKx61P4Ybn4/edit") Beacon 0.2 specifications], 'reference' is a mandatory property of #[span(style="font-weight: bold;") DataSetResource]. The problem is that, in arrayMap, the genome version is linked to a given sample, and not to a cancer tissue type. Moreover, there can be more than one genome version in a sample. Now, we have to decide whether we should:
ol
li assign it a fake default value.
li assign it the value of the first reference genome found in the data structure.
li ask the Beacon team whether this property can be made optional.
li ask the Beacon team whether the property type can be changed to a list of strings (instead of a single string).
li clarify the use of CIPOS and CIEND in VCF v4.2 for DUP/DEL (e.g. CIPOS=-500,500;CIEND=-500,500)
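p To illustrate how CIPOS and CIEND could describe an imprecise segment, the sketch below derives the widest and narrowest extents implied by the confidence intervals and tests a query range against each. All names are hypothetical, coordinates are treated as closed intervals, and this is one possible interpretation rather than behaviour defined by Beacon or arrayMap.

```python
from typing import NamedTuple, Tuple


class ImpreciseSegment(NamedTuple):
    """A DUP/DEL segment with VCF-style confidence intervals (hypothetical type)."""
    pos: int                             # reported start position
    end: int                             # reported end position
    cipos: Tuple[int, int] = (-500, 500)  # offsets around pos, as in CIPOS=-500,500
    ciend: Tuple[int, int] = (-500, 500)  # offsets around end, as in CIEND=-500,500


def outer_bounds(seg: ImpreciseSegment) -> Tuple[int, int]:
    """Widest extent: earliest possible start to latest possible end."""
    return seg.pos + seg.cipos[0], seg.end + seg.ciend[1]


def inner_bounds(seg: ImpreciseSegment) -> Tuple[int, int]:
    """Narrowest extent: latest possible start to earliest possible end."""
    return seg.pos + seg.cipos[1], seg.end + seg.ciend[0]


def may_overlap(seg: ImpreciseSegment, start: int, stop: int) -> bool:
    """True if the query range overlaps the widest possible extent."""
    lo, hi = outer_bounds(seg)
    return start <= hi and stop >= lo


def must_overlap(seg: ImpreciseSegment, start: int, stop: int) -> bool:
    """True if the query range overlaps the extent covered in every case."""
    lo, hi = inner_bounds(seg)
    if lo > hi:  # intervals are wider than the segment: nothing is certain
        return False
    return start <= hi and stop >= lo
```

p With a segment reported at 10000 to 20000 and the default intervals, a query for 9600 to 9700 may overlap the segment but is not guaranteed to, whereas a query for 12000 to 13000 overlaps it in every case.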
div(class="alert alert-info") The Beacon 0.3 specifications don't include any of the above proposals.