Reference: SARS-CoV-2 instance
The API for SARS-CoV-2 uses all SARS-CoV-2 data on NCBI GenBank. The sequences were pre-processed by Nextstrain.
The API has the following endpoints related to samples. These endpoints provide different types of data:
/sample/aggregated - to get summary data aggregated across samples
/sample/details - to get per-sample metadata
/sample/contributors - to get author names of the samples
/sample/aa-mutations - to get the common amino acid mutations
/sample/nuc-mutations - to get the common nucleotide mutations
/sample/fasta - to get original (unaligned) sequences
/sample/fasta-aligned - to get aligned sequences
The API returns a response (data) based on a query to one of the endpoints. You can view a response in your browser, or use the data programmatically.
To query an endpoint, use the web link with the prefix https://lapis.cov-spectrum.org/open/v1 followed by the suffix for the relevant endpoint. In the examples, we only show the suffixes to keep things simple, but a click takes you to the full link in your browser.
Query example: Get the total number of available sequences: /sample/aggregated
See Response format
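For programmatic use, the prefix-plus-suffix rule can be sketched in Python as follows. The helper names `endpoint_url` and `fetch` are illustrative and not part of the API, and the sketch assumes that the endpoint returns JSON by default.

```python
import json
from urllib.request import urlopen

# Shared prefix for all endpoints, as given in the text above.
BASE_URL = "https://lapis.cov-spectrum.org/open/v1"


def endpoint_url(suffix: str) -> str:
    """Join the shared prefix and an endpoint suffix such as '/sample/aggregated'."""
    return BASE_URL.rstrip("/") + "/" + suffix.lstrip("/")


def fetch(suffix: str, timeout: float = 30.0):
    """Request an endpoint and parse the body as JSON (assumed response format)."""
    with urlopen(endpoint_url(suffix), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch("/sample/aggregated")` would then request the same link as the query example above. The structure of the returned object is described in the Response format section and is not assumed here.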
We can adapt the query to filter to only samples of interest. The syntax for adding filters is
All sample endpoints can be filtered by the following attributes:
variantQuery (see Variant query)
fasta-aligned can additionally be filtered by these attributes:
To determine which values are available for each attribute, see the example in section “Aggregation”.
See Mutation filters
Pango lineage filter
Above, we used the /sample/aggregated endpoint to get the total counts of sequences with or without filters. Using the query parameter fields, we can group the samples and get the counts per group. For example, we can use it to get the number of samples per country. We can also use it to list the available values for each attribute.
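As a minimal sketch, a grouped query can be built by appending fields as a URL query parameter. The helper `aggregated_url` is hypothetical, and the attribute names `country` and `date` are assumptions used only for illustration; check the list of available values before relying on them.

```python
from urllib.parse import urlencode

# Shared prefix for all endpoints, as given earlier in this reference.
BASE_URL = "https://lapis.cov-spectrum.org/open/v1"


def aggregated_url(fields):
    """Build a /sample/aggregated URL that groups the counts by the given fields."""
    # Keep the comma unescaped so the value stays a readable comma-separated list.
    query = urlencode({"fields": ",".join(fields)}, safe=",")
    return f"{BASE_URL}/sample/aggregated?{query}"


# 'country' and 'date' are assumed attribute names, used for illustration only.
print(aggregated_url(["country"]))
print(aggregated_url(["country", "date"]))
```

The first URL asks for counts per country, and the second for counts per combination of country and date.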
fields accepts a comma-separated list. The following values are available: