Frequently Asked Questions
General
What is AraGeno?
The AraGeno service allows researchers to identify their Arabidopsis thaliana plants.
Two distinct services are provided:
- Identifying a plant: Users can upload the genotype data of an A. thaliana plant and we will identify the corresponding line
- Genotyping a sample: Users can send samples of a plant and we will genotype them and additionally identify the line
How much does it cost?
Under construction
Identifying a sample
How does it work?
SNPmatch can genotype samples efficiently and economically using a simple likelihood approach.
This approach allows SNPmatch to genotype samples with as few as 5,000 informative SNPs.
How long does it take?
SNPmatch is a Python script. The entire analysis, comparing 500,000 SNPs to the 10.7 million markers in the reference database (1001 SNP matrix), takes about 18 seconds.
Which input files are supported?
Presently, AraGenotyper accepts input files in two formats, BED and VCF.
The BED file is a three-column, tab-separated file containing chromosome, position and genotype.
An example is given below:
1	125	0/0
1	284	0/0
1	336	0/0
1	346	1/1
1	353	0/0
1	363	0/0
1	465	0/0
1	471	0/1
1	540	0/0
1	564	0/0
1	597	0/0
1	612	1/1
1	617	0/1
The SNPs can even be phased (0|0). Chromosome names such as Chr1, Chr2 instead of 1, 2 are also parsed.
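As an illustration of the three-column layout, here is a minimal Python sketch of a reader for such a file. It is not SNPmatch's own parser; the names `SnpCall`, `normalize_chrom` and `parse_bed_lines` are hypothetical, and the sketch assumes exactly three tab-separated fields per line.

```python
from typing import Iterator, NamedTuple


class SnpCall(NamedTuple):
    chrom: str      # chromosome name with any "Chr" prefix removed
    pos: int        # position as given in the file
    genotype: str   # e.g. "0/0", "0/1", "1/1" or phased "0|1"


def normalize_chrom(name: str) -> str:
    """Map names such as 'Chr1' or 'chr1' to '1'; leave '1' unchanged."""
    return name[3:] if name.lower().startswith("chr") else name


def parse_bed_lines(lines) -> Iterator[SnpCall]:
    """Yield one SnpCall per non-empty, tab-separated line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        chrom, pos, genotype = line.split("\t")
        yield SnpCall(normalize_chrom(chrom), int(pos), genotype)


# Three illustrative rows: plain, "Chr"-prefixed, and phased.
example = ["1\t125\t0/0", "Chr1\t346\t1/1", "1\t471\t0|1"]
calls = list(parse_bed_lines(example))
```

A reader like this can be used to sanity-check a file before uploading it, for instance to count how many SNPs it contains.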
The VCF file follows the standard VCF format. SNPmatch mainly requires the CHROM and POS columns in the header and a GT (genotype) entry in the FORMAT column. Sample files for both BED and VCF formats can be downloaded from the official SNPmatch GitHub repository.
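To make the required fields concrete, the following Python sketch pulls CHROM, POS and GT out of VCF records. It is an illustration under stated assumptions, not the way SNPmatch reads VCF files: it relies on the standard VCF layout, in which column 9 (FORMAT) names the per-sample keys and the sample columns carry their values. The function name `parse_vcf_genotypes` and the example records are hypothetical.

```python
def parse_vcf_genotypes(lines, sample_index=0):
    """Return (chrom, pos, genotype) tuples for one sample of a VCF."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta-information and the #CHROM header line
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        format_keys = fields[8].split(":")          # e.g. ["GT", "DP"]
        sample_values = fields[9 + sample_index].split(":")
        genotype = sample_values[format_keys.index("GT")]
        records.append((chrom, pos, genotype))
    return records


# Made-up records for illustration only.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "1\t125\t.\tA\tT\t50\tPASS\t.\tGT:DP\t0/0:12",
    "1\t346\t.\tG\tC\t50\tPASS\t.\tGT:DP\t1/1:9",
]
genotypes = parse_vcf_genotypes(vcf_lines)
```

If a record has no GT key, `format_keys.index("GT")` raises a `ValueError`, which is a simple way to detect files that lack genotype calls.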
Can I programmatically identify my samples?
Yes, AraGeno provides a REST API for identifying your samples.
This allows you to use AraGeno from your own programs and scripts and also makes it easy to run multiple identifications at the same time.
Start identification
To start an identification job, send a POST request to the /api/identify endpoint. Provide firstname, lastname and email as form parameters and the file as the genotype parameter. Here is a sample cURL call:
curl --request POST \
  --url https://arageno.gmi.oeaw.ac.at/api/identify/ \
  --form firstname=Jon \
  --form lastname=Doe \
  --form email=jon.doe@gmail.com \
  --form 'genotype=@genotype_file.vcf'
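The same request can be assembled from a script. The sketch below uses only the Python standard library and encodes the multipart form by hand; the helper name `build_identify_request` is hypothetical, and the content type given for the file part is an assumption rather than something the API is known to require. The request is built but not sent.

```python
import uuid
import urllib.request

API_URL = "https://arageno.gmi.oeaw.ac.at/api/identify/"


def build_identify_request(firstname, lastname, email, filename, file_bytes,
                           url=API_URL):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields, mirroring the --form arguments of the cURL call.
    for name, value in (("firstname", firstname),
                        ("lastname", lastname),
                        ("email", email)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The genotype file itself (content type is an assumption).
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="genotype"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


request = build_identify_request("Jon", "Doe", "jon.doe@gmail.com",
                                 "genotype_file.vcf", b"1\t125\t0/0\n")
# Passing `request` to urllib.request.urlopen would submit the job.
```

Keeping request construction separate from sending makes it straightforward to prepare several submissions in a loop and submit them one after another.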
Check analysis & retrieve results
The above call returns a JSON object containing information about the submitted genotype. The JSON object also contains a url key that can be used to check the status and result of the running analysis.
This can be done by sending a GET request to that URL (e.g. https://arageno.gmi.oeaw.ac.at/api/identify/c5d35fd3-a315-4a52-935c-52e9cda38c12):
curl --request GET \
  --url https://arageno.gmi.oeaw.ac.at/api/identify/926ad585-cea1-44b8-8d48-0c547699f78f/
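From a script, the status check could look like the following Python sketch. The function name `fetch_submission` and the injectable `opener` argument are hypothetical conveniences; the sketch only assumes that the endpoint returns a JSON body and makes no assumption about which keys that body contains beyond what is described above.

```python
import json
import urllib.request


def fetch_submission(info_url, opener=urllib.request.urlopen):
    """Send a GET request to the submission's info URL and decode the JSON body.

    `opener` defaults to urllib's urlopen; a stand-in can be passed for
    offline testing.
    """
    request = urllib.request.Request(info_url, method="GET")
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_submission` with the value of the url key returned at submission time yields the current state of the analysis as a Python dictionary.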
Delete submission
To delete a submission, send a DELETE request to the info endpoint (the value of the url key in the JSON object above):
curl --request DELETE \
  --url https://arageno.gmi.oeaw.ac.at/api/identify/59580110-3c09-46b0-ac38-ca617737a78f/
Genotyping a plant
How does it work?
Under construction
How much does it cost?
Under construction
How long does it take?
Under construction