
Steps

The analysis involves two types of steps: Annotation steps and Filter steps.

  • Annotation steps are designed to provide additional information about the variants, which can come from multiple sources such as VCF files or factories.
  • Filter steps are used to narrow down the variants based on various criteria, including the annotations.

Each step contains a list of step records, which correspond to the list of variants available within the step.

Combining annotation steps and filter steps

If an annotation step is added after a filter step, only the variants that passed through the filter will be annotated. This optimization improves the performance of the annotation step, which can take a very long time.
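The effect of this ordering can be sketched with plain Python stand-ins rather than the project's models. The variant dictionaries, the `expensive_annotation` helper and the `qd` threshold are hypothetical. The only point is that the annotation runs once per variant that survived the filter.

```python
# Hypothetical stand-ins: real steps operate on StepRecord objects in the database.
variants = [
    {"id": 1, "chrom": "chr1", "qd": 12.0},
    {"id": 2, "chrom": "chr5", "qd": 25.0},
    {"id": 3, "chrom": "chr5", "qd": 9.5},
    {"id": 4, "chrom": "chr7", "qd": 30.1},
]

annotation_calls = 0


def expensive_annotation(variant):
    """Placeholder for a slow factory lookup; counts how often it runs."""
    global annotation_calls
    annotation_calls += 1
    return {**variant, "annotated": True}


def filter_step(records):
    """Keep only the variants with qd below 17 (arbitrary example condition)."""
    return [r for r in records if r["qd"] < 17]


def annotation_step(records):
    """Annotate every record the step receives from its parent step."""
    return [expensive_annotation(r) for r in records]


# Filter first, then annotate: only the surviving variants are annotated.
annotated = annotation_step(filter_step(variants))
print(len(annotated), annotation_calls)
```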


Annotation Steps

When creating an analysis, it is possible to add Annotation Steps to include annotations generated by factories for each variant. Those annotations are saved as metadata and linked to each variant as a StepRecord object. For each unique annotation, a new column is added to the analysis, and the user can then filter on those annotations.

Annotation Step Zero

Step Zero of the annotation pipeline is responsible for generating metadata from VCF file annotations.

MetadataBuilder

The MetadataBuilder class is used to:

  1. build the metadata on the AnalysisRecord, by extracting values from the corresponding AnalysisRecord and its related models such as SampleRecord and Variant.
  2. build Step Zero, containing VCF file annotations.

It takes two parameters:

  • an Analysis instance, whose columns will be used to build the metadata dictionaries
  • a list of AnalysisRecord instances for which we want to build the metadata.

build()

The build() method is responsible for calling the two methods that build the metadata dictionaries and the StepRecords.

_build_metadata()

The _build_metadata() method loops through each AnalysisRecord and calls the _get_analysisrec_metadata() method to format data from the record and its related models into a dictionary. The AnalysisRecords are then updated in bulk.

_build_step_zero()

The _build_step_zero() method loops through each AnalysisRecord and retrieves all VCF annotations. It creates a StepRecord object with metadata containing all the VCF annotations and finally saves all the StepRecord objects in bulk.
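The two passes can be sketched with in-memory stand-ins. This is not the project's implementation: `FakeRecord`, `SketchMetadataBuilder`, the column names and the `vcf_annotations` attribute are hypothetical, and the bulk database writes are replaced by plain list and attribute assignments.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRecord:
    """Hypothetical stand-in for an AnalysisRecord and its related models."""
    chrom: str
    pos: int
    vcf_annotations: dict
    metadata: dict = field(default_factory=dict)


class SketchMetadataBuilder:
    """Mirrors the build() flow described above, without a database."""

    def __init__(self, columns, records):
        self.columns = columns   # stands in for the Analysis columns
        self.records = records
        self.step_zero = []      # stands in for the saved StepRecords

    def build(self):
        self._build_metadata()
        self._build_step_zero()

    def _get_analysisrec_metadata(self, record):
        # Format the requested columns of one record as a dictionary.
        return {column: getattr(record, column) for column in self.columns}

    def _build_metadata(self):
        for record in self.records:
            record.metadata = self._get_analysisrec_metadata(record)
        # The real method would update all records in bulk at this point.

    def _build_step_zero(self):
        # One step record per analysis record, holding its VCF annotations.
        self.step_zero = [
            {"record": record, "metadata": dict(record.vcf_annotations)}
            for record in self.records
        ]
        # The real method would save the StepRecords in bulk at this point.


records = [
    FakeRecord("chr5", 1042, {"qd": 12.5}),
    FakeRecord("chr7", 88, {"qd": 30.1}),
]
builder = SketchMetadataBuilder(columns=["chrom", "pos"], records=records)
builder.build()
print(records[0].metadata)
print(builder.step_zero[1]["metadata"])
```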

Example usage

    analysis = Analysis(...)
    analysis_records = [AnalysisRecord(...), AnalysisRecord(...), ...]
    builder = MetadataBuilder(analysis, analysis_records)
    builder.build()

Factory Annotation steps

Additional annotation steps may be created to add more variant information coming from factories.

The following parameters need to be specified:

  • Parent Step: the step this one inherits from. The first factory annotation step must inherit from the Step 0 built by MetadataBuilder.
  • NodeMaker: determines which factory annotations are added, based on the Node specified in the NodeMaker.

Note that if the Variant/Node combination is unknown in the database, the association will be created, which may take some time.


Filter Steps

Filter steps are used to narrow down the variants in the analysis based on one or multiple conditions. Each filter step is associated with a "Filter" object that contains a JSON representation of the conditions. This JSON object is used to construct a complex query in the database. When the database query is executed, the variants that meet the query conditions are stored as new step record objects. This allows for identifying which variants successfully passed through the filter step.

Condition and Operator

The Condition and Operator models are designed to facilitate the construction and representation of the JSON Filter object in a class-based manner. These models provide a way to express conditions and operators (such as AND, OR and XOR) that can be combined to form a database query. In the analysis context, the JSON object of the Filter step is parsed as an Operator object, which is then used to construct the query.
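As an illustration of the class-based representation, here is a minimal sketch that parses a JSON filter into a tree. `ConditionSketch`, `OperatorSketch` and `parse_node` are hypothetical names, not the project's models. The field layout (`type` 0 for an operator with `children`, `type` 1 for a condition with `column`, `lookup` and `val`) is inferred from the example payloads in this page.

```python
from dataclasses import dataclass, field
from typing import Any, List, Union


@dataclass
class ConditionSketch:
    column: str
    lookup: str
    value: Any
    negation: bool = False


@dataclass
class OperatorSketch:
    kind: str  # "and", "or" or "xor"
    negation: bool = False
    children: List[Union["OperatorSketch", ConditionSketch]] = field(default_factory=list)


def parse_node(node: dict):
    """Turn one JSON node into an OperatorSketch or a ConditionSketch."""
    if node["type"] == 0:  # operator node (assumed encoding)
        return OperatorSketch(
            kind=node["val"],
            negation=node.get("negation", False),
            children=[parse_node(child) for child in node["children"]],
        )
    if node["type"] == 1:  # condition node (assumed encoding)
        inner = node["val"]
        return ConditionSketch(
            column=inner["column"],
            lookup=inner["lookup"],
            value=inner["val"],
            negation=node.get("negation", False),
        )
    raise ValueError(f"unknown node type: {node['type']!r}")


payload = {
    "type": 0, "val": "and", "negation": False,
    "children": [
        {"type": 1, "negation": True,
         "val": {"column": "chrom", "lookup": "exact", "val": "chr5"}},
        {"type": 0, "val": "or", "negation": False,
         "children": [
             {"type": 1, "negation": False,
              "val": {"column": "infos.qd", "lookup": "lt", "val": 17}},
         ]},
    ],
}
root = parse_node(payload)
print(root.kind, len(root.children))
```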

QueryBuilder

The QueryBuilder class creates ORM queries using Condition and Operator objects. The get_query method iterates over the children of the root Operator object to construct a query object (Q). The _get_condition method fetches the column of the analysis that corresponds to a given condition, in order to get the complete field path.

Example usage

    analysis = Analysis(...)
    operator = Operator(...)
    query = QueryBuilder(analysis_id=analysis.id).get_query(filter=operator)
    results = analysis.analysis_records.filter(query)
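The real QueryBuilder produces an ORM query object; the sketch below does not. To illustrate the same recursive walk over operators and conditions, it evaluates a filter tree directly against plain dictionaries. The `matches` function, the flat record layout (column paths used as dictionary keys), the three supported lookups and the "odd number of true children" reading of XOR are all assumptions made for this example.

```python
import operator
from functools import reduce

# Assumed subset of lookups, taken from the example payloads.
LOOKUPS = {"exact": operator.eq, "lt": operator.lt, "gt": operator.gt}

COMBINERS = {
    "and": all,
    "or": any,
    # Assumption: XOR over several children means an odd number are true.
    "xor": lambda results: reduce(operator.xor, results, False),
}


def matches(node, record):
    """Recursively evaluate a filter node (type 0 or 1) against one record."""
    if node["type"] == 1:  # condition
        cond = node["val"]
        result = LOOKUPS[cond["lookup"]](record[cond["column"]], cond["val"])
    else:  # operator
        child_results = [matches(child, record) for child in node["children"]]
        result = COMBINERS[node["val"]](child_results)
    return (not result) if node.get("negation", False) else result


# NOT chrom == "chr5" AND (infos.qd < 17 OR alt > "A")
filter_tree = {
    "type": 0, "val": "and", "negation": False,
    "children": [
        {"type": 1, "negation": True,
         "val": {"column": "chrom", "lookup": "exact", "val": "chr5"}},
        {"type": 0, "val": "or", "negation": False,
         "children": [
             {"type": 1, "negation": False,
              "val": {"column": "infos.qd", "lookup": "lt", "val": 17}},
             {"type": 1, "negation": False,
              "val": {"column": "alt", "lookup": "gt", "val": "A"}},
         ]},
    ],
}

records = [
    {"id": 1, "chrom": "chr1", "infos.qd": 12, "alt": "A"},
    {"id": 2, "chrom": "chr5", "infos.qd": 12, "alt": "T"},
    {"id": 3, "chrom": "chr2", "infos.qd": 30, "alt": "A"},
    {"id": 4, "chrom": "chr3", "infos.qd": 30, "alt": "T"},
]
selected = [r for r in records if matches(filter_tree, r)]
print([r["id"] for r in selected])
```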

Applying a Filter to an analysis

Endpoint: PATCH /variant_analyses/analysis/{id}/

The PATCH method of the AnalysisViewset class is responsible for handling the API variant-filtering behavior. This endpoint saves a filter and associates it with an analysis. The filter is then used by the records API to return the filtered records. The filter data payload is expected to be in JSON format and should follow the structure defined by the FilterOperatorSerializer.

Endpoint: GET /variant_analyses/analysis/{id}/records

This endpoint returns the records (i.e. variants) associated with a specific analysis that match the filter applied to the analysis. If there is no filter associated with the analysis, this endpoint returns all records.

FilterOperatorSerializer

The FilterOperatorSerializer is used by the API to validate and transform a JSON filter into a structured tree of Operator and Condition objects. This serializer is used only to construct a query and associate the filter with the analysis; it does not save the filter as a named filter.

Example usage

PATCH /variant_analyses/analysis/123/

{
  "filter": {
    "type": 0,
    "val": "and",
    "negation": false,
    "children": [
      {
        "type": 1,
        "negation": true,
        "val": {
          "column": "chrom",
          "lookup": "exact",
          "val": "chr5"
        }
      },
      {
        "type": 0,
        "val": "or",
        "negation": false,
        "children": [
          {
            "type": 1,
            "negation": false,
            "val": {
              "column": "infos.qd",
              "lookup": "lt",
              "val": 17
            }
          },
          {
            "type": 1,
            "negation": false,
            "val": {
              "column": "alt",
              "lookup": "gt",
              "val": "A"
            }
          }
        ]
      }
    ]
  }
}

Saved Filter API

Endpoint: /variant_analyses/filter/

The FilterViewset class is responsible for handling the saving of a named filter through the API. Sending a POST request to this endpoint saves the JSON filter query under a name. Sending a GET request lists all available filters.

FilterSerializer

This serializer is used by this API to validate a JSON filter. It expects two parameters:

  • "name": the name of the filter. It is used to create the key with which the filter can be invoked. The name is transformed according to these rules:
    • Uppercase letters are converted to lowercase
    • Accented letters are replaced with unaccented letters
    • Underscores and spaces are deleted
    • Example: the name "Variations d'intérêt" gives the key "variationsdinteret"
  • "filter": the filter query in the form of a JSON Operator object. It should follow the same structure as the one defined by the FilterOperatorSerializer.
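The name-to-key rules can be sketched as follows. `filter_key` is a hypothetical helper, not the serializer's actual code. Because the documented example also drops the apostrophe, this sketch assumes that every non-alphanumeric character is removed, not only underscores and spaces.

```python
import unicodedata


def filter_key(name: str) -> str:
    """Derive a lookup key from a filter name (hypothetical helper)."""
    lowered = name.lower()
    # Decompose accented letters, then drop the combining accent marks.
    decomposed = unicodedata.normalize("NFKD", lowered)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Assumption: keep only letters and digits (drops spaces, "_" and "'").
    return "".join(ch for ch in without_accents if ch.isalnum())


print(filter_key("Variations d'intérêt"))
print(filter_key("my super filter"))
```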

Example usage

POST /variant_analyses/filter/

{
  "name": "my super filter",
  "filter": {
    "type": 0,
    "val": "and",
    "negation": true,
    "children": [
      {
        "type": 1,
        "negation": false,
        "val": {
          "column": "chrom",
          "lookup": "exact",
          "val": "chr5"
        }
      },
      {
        "type": 0,
        "val": "xor",
        "negation": false,
        "children": [
          {
            "type": 1,
            "negation": true,
            "val": {
              "column": "infos.qd",
              "lookup": "lt",
              "val": 17
            }
          },
          {
            "type": 1,
            "negation": false,
            "val": {
              "column": "alt",
              "lookup": "gt",
              "val": "A"
            }
          }
        ]
      }
    ]
  }
}

Last update: October 4, 2023