Compliance

Safety and performance information

The confusion matrix of statistical classification is shown below with the abbreviations used in this document. 

| | Comparator method: Positive | Comparator method: Negative | Total |
|---|---|---|---|
| Test: Positive | TP | FP | TP+FP |
| Test: Negative | FN | TN | FN+TN |
| No calls or invalid calls | E | F | E+F |
| Total | TP+FN+E | FP+TN+F | N |

Note that not all entries in the confusion matrix can be determined for all genomic variant types [1]. For example, the number of possible Indel variants is theoretically unlimited, so TN cannot be determined for SNV/Indel discovery. The following performance characteristics were determined for this device:

| Performance characteristic | Abbr. | Formula | Variant type | Alias |
|---|---|---|---|---|
| Positive Percent Agreement | PPA | TP/(TP+FN) | SNV/Indel, CNV | Sensitivity |
| Negative Percent Agreement | NPA | TN/(TN+FP) | CNV | Specificity |
| Technical Positive Predictive Value | TPPV | TP/(TP+FP) | SNV/Indel, CNV | Precision |
| Limit of Detection | LoD | - | SNV/Indel, somatic | - |
| Limit of Blank | LoB | - | SNV/Indel, somatic | - |
| Reproducibility and Repeatability | - | - | SNV/Indel | - |
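The three ratio-based characteristics follow directly from the confusion matrix counts. The sketch below is only an illustration of the formulas in the table; the function names and the example counts are hypothetical and not taken from the device or its validation data. NPA is returned as `None` when TN is unavailable, as is the case for SNV/Indel discovery.

```python
from typing import Dict, Optional


def percent(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator in percent, or None if the ratio is undefined."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator


def performance(tp: int, fp: int, fn: int, tn: Optional[int] = None) -> Dict[str, Optional[float]]:
    """Compute PPA, NPA and TPPV from confusion matrix counts.

    tn may be None when true negatives cannot be counted
    (e.g. SNV/Indel discovery); NPA is then reported as None.
    """
    return {
        "PPA": percent(tp, tp + fn),    # TP/(TP+FN), sensitivity
        "NPA": None if tn is None else percent(tn, tn + fp),  # TN/(TN+FP), specificity
        "TPPV": percent(tp, tp + fp),   # TP/(TP+FP), precision
    }


# Hypothetical counts, for illustration only.
snv_like = performance(tp=90, fp=10, fn=10)
cnv_like = performance(tp=80, fp=20, fn=20, tn=880)
```

No calls and invalid calls (E, F) do not enter any of these three formulas; how they are reported is a separate decision.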

 

Detection of SNV/Indels in NGS data (germline) 

The analytical performance parameters for SNV and Indel discovery in germline samples are listed below (in %) in conjunction with the confidence intervals. 

| Parameter | Variant type | WGS | WES | NGS panel |
|---|---|---|---|---|
| PPA | SNV | 99.8 | 99.4 | 99.1 |
| PPA 95% CI | SNV | 99.8..99.8 | 99.4..99.4 | 98.9..99.3 |
| TPPV | SNV | 99.8 | 99.5 | 99.8 |
| TPPV 95% CI | SNV | 99.8..99.8 | 99.5..99.6 | 99.7..99.9 |
| PPA | Indel | 97.1 | 86.9 | 71.9 |
| PPA 95% CI | Indel | 97.1..97.2 | 86.7..87.1 | 69.3..74.4 |
| TPPV | Indel | 95.9 | 72.4 | 56.9 |
| TPPV 95% CI | Indel | 95.9..96.0 | 72.1..72.7 | 54.4..59.4 |
| PPA | all | 99.5 | 98.3 | 96.7 |
| PPA 95% CI | all | 99.4..99.5 | 98.3..98.3 | 96.4..97.0 |
| TPPV | all | 99.2 | 96.9 | 95.0 |
| TPPV 95% CI | all | 99.2..99.2 | 96.8..96.9 | 94.6..95.4 |

 

Confidence intervals were calculated using the Wilson score method [2]. 
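For readers who want to reproduce intervals of this kind on their own data, the following is a minimal sketch of the Wilson score interval for a binomial proportion. It is not the manufacturer's implementation; the function name is hypothetical, and the underlying variant counts are not given in this document, so the example uses made-up numbers.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.959964 corresponds to a two-sided 95% confidence level.
    Returns (lower, upper) as fractions between 0 and 1.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p + z2 / (2.0 * trials)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical example: 50 concordant calls out of 100 comparator variants.
lower, upper = wilson_interval(50, 100)
```

With large variant counts, as is typical for WGS, the interval becomes very narrow, which is consistent with lower and upper bounds that agree to one decimal place.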

The performance was achieved with data sets of the following quality: 

| Analyte | Depth of coverage | % bases covered at 20x | Insert size | On-target rate |
|---|---|---|---|---|
| WGS | > 90 | n/a | > 300 | n/a |
| WES | > 80 | > 96% | > 180 | > 75% |
| NGS panel | > 320 | > 99% | > 200 | > 75% |

 

Note that the analytical performance of NGS-based tests strongly depends on quality parameters like the depth of coverage and the selected target region [3]. 

It is therefore recommended to determine the performance characteristics for the particular NGS assay and laboratory process according to applicable guidelines [4]–[7]. 


Reproducibility and repeatability 

Repeatedly processing the same data set with the device yields identical results (100% repeatability). Reproducibility was assessed using replicates of the same DNA sample while keeping conditions such as the type of sequencing instrument, the day of measurement, and the operating conditions the same. Only the sequencing of the DNA was repeated, so each replicate represents a different sampling of NGS reads.

This assessment of reproducibility therefore encompasses 

  • Variation in the preparation of DNA replicates (e. g. amount of DNA), 

  • Variation due to errors in the sequencing process, 

  • Variation due to different local optimization of alignment in each sampling. 

Under these conditions the PPA varied by 0.2% and the TPPV varied by 0.3%. 


Limitations 

The following limitations should be noted: 

  • Genome-in-a-Bottle (GiaB) samples have been characterized using NGS technologies, so the reference data set is biased towards variants that can be easily discovered by NGS.

  • In addition, the performance evaluation is restricted to the trusted regions of the GiaB data sets. Typically, these trusted regions represent those genomic regions that can easily be assessed using NGS.  

  • The analytical performance outside trusted regions should be expected to be lower than reported here [3]. 

  • The performance evaluation was performed within the manufacturer’s target regions plus 50 bp around these regions. This is the region that is typically evaluated in practice. The performance reported here may therefore be lower than the performance obtained when the evaluation is limited to the manufacturer’s target regions.
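The last limitation refers to target regions extended by 50 bp on each side. A minimal sketch of such an extension is shown below, assuming BED-style coordinates (0-based, half-open). The function name is hypothetical, and merging overlapping padded regions is an assumption of this sketch, not a documented step of the evaluation. Clipping at chromosome ends is omitted for brevity.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), 0-based half-open


def pad_regions(regions: List[Region], padding: int = 50) -> List[Region]:
    """Extend each region by `padding` bases on both sides and merge overlaps."""
    padded = sorted(
        (chrom, max(0, start - padding), end + padding)
        for chrom, start, end in regions
    )
    merged: List[Region] = []
    for chrom, start, end in padded:
        # Merge with the previous region if both lie on the same chromosome and overlap or touch.
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical target regions, for illustration only.
evaluated = pad_regions([("chr1", 100, 200), ("chr1", 260, 300), ("chr2", 10, 20)])
```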

Detection of SNV/Indels in NGS data (somatic) 

The performance characteristics of the somatic analysis are provided in % in the table below. 

| Parameter | Variant type | WGS |
|---|---|---|
| Genome build | | hg38 |
| Limit of blank | | 10.0 |
| Limit of detection | | 15.0 |
| PPA | SNV | 97.7 |
| PPA 95% CI | SNV | 97.7..97.8 |
| TPPV | SNV | 99.3 |
| TPPV 95% CI | SNV | 99.2..99.3 |
| PPA | Indel | 79.0 |
| PPA 95% CI | Indel | 78.6..79.5 |
| TPPV | Indel | 93.8 |
| TPPV 95% CI | Indel | 93.5..94.0 |
| PPA | all | 95.6 |
| PPA 95% CI | all | 95.5..95.7 |
| TPPV | all | 98.7 |
| TPPV 95% CI | all | 98.7..98.8 |

 

These performance characteristics include both germline and somatic variants. The WGS data set, a mixture of two GiaB samples, had an average depth of coverage of 110. 

Detection of CNV in NGS data 


General 

The analytical performance does not depend on the genome build: hg38 and GRCh37 provide equivalent performance.

The CNV discovery of this device presumes that CNVs, and pathogenic CNVs in particular, are rare. CNVs that are present in 1 out of a pool of 5 samples or in 4 out of a pool of 10 can be detected with high sensitivity. This must be considered when analyzing data from families. CNVs with high prevalence or polymorphisms cannot reliably be detected with the device.

Since the performance characteristics of a specific NGS assay depend on the configuration of the assay and the laboratory process (e. g. sequencing capacity, depth of coverage, uniformity across samples), it is recommended to perform an assay-specific validation according to applicable guidelines [4]–[7]. 


Performance evaluation using public patient data 

A performance assessment using the public data set ICR96 [8] comprising 96 patient samples is provided here. The data set is publicly available and may serve as a benchmark between different devices.  

Note that the ICR96 data set has been characterized as “less homogeneous” than other data sets by the reviewers of the publication. A more modern laboratory setup can be expected to generate data of greater uniformity and therefore better analytical performance. The results reported here therefore represent a lower limit.

The quality thresholds are: 

| Parameter | Description | Threshold |
|---|---|---|
| Reference spread (RefSpread) | Quality value assigned to a single target region | < 0.2 |
| Bivariance (bivar) | Quality value assigned to a data set as a whole | < 0.2 |
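How the two thresholds could translate into no-calls can be sketched as follows. This is an assumption-laden illustration: the function name, the input layout (one bivar value per data set and one RefSpread value per target region), and the rule that a failing data set turns all of its targets into no-calls are hypothetical and not taken from the device documentation.

```python
from typing import Dict, List, Tuple

REF_SPREAD_MAX = 0.2  # threshold for a single target region
BIVAR_MAX = 0.2       # threshold for a data set as a whole


def classify_data_set(
    bivar: float, ref_spread_by_target: Dict[str, float]
) -> Tuple[bool, List[str], List[str]]:
    """Return (data_set_ok, evaluable_targets, no_call_targets).

    A value passes only if it is strictly below its threshold.
    """
    if not bivar < BIVAR_MAX:
        # Assumption: a failing data set is excluded entirely.
        return False, [], sorted(ref_spread_by_target)
    evaluable = sorted(t for t, v in ref_spread_by_target.items() if v < REF_SPREAD_MAX)
    no_call = sorted(t for t, v in ref_spread_by_target.items() if not v < REF_SPREAD_MAX)
    return True, evaluable, no_call


# Hypothetical quality values, for illustration only.
passing = classify_data_set(0.10, {"target_1": 0.05, "target_2": 0.30})
failing = classify_data_set(0.25, {"target_1": 0.05})
```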

 

The following table summarizes the performance characteristics from the ICR96 data set: 

| Name | Loss | Gain | All |
|---|---|---|---|
| PPA | 98.8% | 97.5% | 98.4% |
| PPA 95% CI | 98.82..98.82% | 97.53..97.53% | 98.40..98.40% |
| NPA | 99.9% | 100.0% | 99.8% |
| NPA 95% CI | 99.86..99.86% | 99.96..99.96% | 99.82..99.82% |

The following table indicates how many samples and how many individual target regions were excluded from the analysis (no-calls) after application of the quality thresholds: 

| Parameter | Result | Fraction |
|---|---|---|
| Total number of samples | 96 | 100.0% |
| Samples failing thresholds | 10 | 10.4% |
| Total number of targets | 31203 | 100.0% |
| Targets failing thresholds | 417 | 1.3% |
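The fractions in this table are simple ratios of the failing counts to the totals. The short check below recomputes them from the reported counts; the helper name is hypothetical.

```python
def fraction_percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


samples_failing = fraction_percent(10, 96)      # 10.4
targets_failing = fraction_percent(417, 31203)  # 1.3
```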

 

The choice of reference genome (hg38 vs. GRCh37) has no significant impact on the overall analytical performance on this data set. It should be noted, though, that the analysis using hg38 enhances the mappability of NGS reads and leads to a larger number of samples being acceptable for analysis. Partial deletions and mosaic variations were not assessed.

References 

[1] P. Krusche et al., “Best practices for benchmarking germline small-variant calls in human genomes,” Nat Biotechnol, vol. 37, no. 5, May 2019, doi: 10.1038/s41587-019-0054-x.

[2] R. G. Newcombe, “Improved confidence intervals for the difference between binomial proportions based on paired data,” Statistics in Medicine, vol. 17, no. 22, pp. 2635–2650, Nov. 1998, doi: 10.1002/(SICI)1097-0258(19981130)17:22<2635::AID-SIM954>3.0.CO;2-C. 

[3] M. H. Cleveland, J. M. Zook, M. Salit, and P. M. Vallone, “Determining Performance Metrics for Targeted Next-Generation Sequencing Panels Using Reference Materials,” The Journal of Molecular Diagnostics, vol. 20, no. 5, pp. 583–590, Sep. 2018, doi: 10.1016/j.jmoldx.2018.04.005.

[4] C. Rehder et al., “Next-generation sequencing for constitutional variants in the clinical laboratory, 2021 revision: a technical standard of the American College of Medical Genetics and Genomics (ACMG),” Genetics in Medicine, pp. 1–17, Apr. 2021, doi: 10.1038/s41436-021-01139-4. 

[5] E. Souche et al., “Recommendations for whole genome sequencing in diagnostics for rare diseases,” Eur J Hum Genet, pp. 1–5, May 2022, doi: 10.1038/s41431-022-01113-x. 

[6] G. Matthijs et al., “Guidelines for diagnostic next-generation sequencing,” Eur J Hum Genet, Oct. 2015, doi: 10.1038/ejhg.2015.226.

[7] P. Bauer, “S1 Leitlinie: Molekulargenetische Diagnostik mit Hochdurchsatz-Verfahren der Keimbahn, beispielsweise mit Next-Generation Sequencing,” medgen, vol. 30, no. 2, pp. 278–292, Jun. 2018, doi: 10.1007/s11825-018-0189-z. 

[8] S. Mahamdallie et al., “The ICR96 exon CNV validation series: a resource for orthogonal assessment of exon CNV calling in NGS data,” Wellcome Open Res, vol. 2, p. 35, May 2017, doi: 10.12688/wellcomeopenres.11689.1.