CTS2-LE Supported Terminologies: Difference between revisions

</div>
 
        </div>
 

Revision as of 11 December 2025, 13:17

Preface

Because of the license policies of the standard terminology providers, we do not distribute provider input files. Customers have to download these files from the providers' sites.

To load a standard terminology, the customer has to copy the input files to a dedicated directory (called LD in the following) together with a JSON specification file (SF). In a Docker or Kubernetes context, a dedicated volume should be used. To avoid inconsistencies, services should not be accessed during the load.

Terminology Directory (LD)

Directory LD contains all files required for a specific standard terminology. There should be a separate directory for each terminology (currently Snomed, Loinc, Mesh, and BfArM).

The deployment for Kubernetes provides an extra volume with the path

  • /etc/webcts2le/inst/terminologies

Subdirectories within this path should be used for LD, which is specified when starting the load (see REST interface for loading).

Standard Terminologies

Specification File (SF)

{
    "terminologyDesignator": "<regex: 'mesh|loinc|snomed'>",
    // usual version string
    "version": "<string>",
    // group name (used in the navigator)
    "groupName": "<string>",
    // unique resource id in context of a cts2le instance
    "resourceId": "<string>",
    // input file paths relative to the given directory <LD>
    "files": [
        "<string>"
        // , ...
    ]
}
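
The specification files on this page contain `//` comments, which a plain JSON parser rejects. The following Python sketch shows one way to check an SF for a standard terminology before starting a load: it strips `//` comments outside of string literals, parses the result, and verifies the properties listed above. The function names and the set of checks are assumptions made for illustration; CTS2-LE itself may validate differently, and the BfArM specification described further below uses different properties.

```python
import json
import re

# Properties and designators taken from the specification template above.
REQUIRED_KEYS = ("terminologyDesignator", "version", "groupName", "resourceId", "files")
DESIGNATOR = re.compile(r"mesh|loinc|snomed")


def strip_line_comments(text: str) -> str:
    """Remove // comments that are not inside a JSON string literal."""
    cleaned = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        cut = len(line)
        for i, ch in enumerate(line):
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\" and in_string:
                escaped = True
            elif ch == '"':
                in_string = not in_string
            elif not in_string and line.startswith("//", i):
                cut = i                  # comment starts here
                break
        cleaned.append(line[:cut])
    return "\n".join(cleaned)


def check_spec(text: str) -> list[str]:
    """Return a list of problems found in a specification file (empty if none)."""
    spec = json.loads(strip_line_comments(text))
    problems = [f"missing property: {key}" for key in REQUIRED_KEYS if key not in spec]
    designator = spec.get("terminologyDesignator", "")
    if not DESIGNATOR.fullmatch(designator):
        problems.append(f"unsupported terminologyDesignator: {designator!r}")
    files = spec.get("files", [])
    if not isinstance(files, list) or not files:
        problems.append("'files' must be a non-empty list of paths relative to LD")
    return problems
```

An empty result from `check_spec` only means that the file has the documented shape; it does not guarantee that the load itself succeeds.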

Supported Standard Terminologies

{
  "terminologyDesignator": "snomed",
  "version": "20250515",
  "groupName": "snomed-ct",
  "resourceId": "Snomed-20250515",
  "defaultLanguage": "de",
  // the following files are usually located at 'Snapshot/Terminology' within the snomed zip file;
  // the files must be listed in the order shown below
  "files": [
    "sct2_Description_Snapshot_GermanyEdition_20250515.txt",
    "sct2_Concept_Snapshot_GermanyEdition_20250515.txt",
    "sct2_Relationship_Snapshot_GermanyEdition_20250515.txt",
    "sct2_StatedRelationship_Snapshot_GermanyEdition_20250515.txt",
    "sct2_TextDefinition_Snapshot_GermanyEdition_20250515.txt"
  ]
}

{
    "terminologyDesignator": "loinc",
    "version": "2.80",
    "groupName": "loinc-tree",
    "resourceId": "Loinc-2.80",
    // defines the corresponding display name
    "defaultLanguage": "de",
    // the following linguistic variants must exist in directory 'AccessoryFiles/LinguisticVariants'
    "linguisticVariants": [
        {"lang": "de", "file": "Loinc_2.80/AccessoryFiles/LinguisticVariants/deDE15LinguisticVariant.csv"},
        {"lang": "at", "file": "Loinc_2.80/AccessoryFiles/LinguisticVariants/deAT24LinguisticVariant.csv"}
    ],
    "files": [
        // file 1, 2 must be the hierarchy csv and the table csv, respectively
        // usually file 1 is at 'AccessoryFiles/ComponentHierarchyBySystem/' and file 2 at
        // 'LoincTable/' in the provider's zip file
        "Loinc_2.80/AccessoryFiles/ComponentHierarchyBySystem/ComponentHierarchyBySystem.csv",
        "Loinc_2.80/LoincTable/Loinc.csv"
    ]
}

{
    "terminologyDesignator": "mesh",
    "version": "2025",
    "groupName": "mesh",
    "resourceId": "Mesh2025",
    "files": [
        "desc2025.xml"
    ]
}
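
Since the `files` entries are resolved relative to LD, a misplaced input file is an easy mistake to make. The following Python sketch lists every path a parsed specification refers to and reports those that do not exist below LD. It assumes that the `file` entries of `linguisticVariants` are also relative to LD, which the LOINC example suggests but which is not stated explicitly; the function names are hypothetical.

```python
from pathlib import Path


def referenced_paths(spec: dict) -> list[str]:
    """Collect every file path a specification refers to, in declaration order."""
    paths = [variant["file"] for variant in spec.get("linguisticVariants", [])]
    paths.extend(spec.get("files", []))
    return paths


def missing_files(ld: Path, spec: dict) -> list[str]:
    """Return the referenced paths that do not exist below the directory LD."""
    return [path for path in referenced_paths(spec) if not (ld / path).is_file()]
```

Calling `missing_files(Path("/etc/webcts2le/inst/terminologies/loinc"), spec)` before the load gives a quick indication of whether the provider archive was unpacked to the expected place.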

BfArM Terminologies

BfArM (Bundesinstitut für Arzneimittel und Medizinprodukte, the German Federal Institute for Drugs and Medical Devices) provides the standard terminologies for Germany. To load these standard terminologies, the customer has to create a dedicated directory (called LD in the following) together with a JSON specification file (SF). In a Docker or Kubernetes context, a dedicated volume should be used.

Packages

The packages downloaded from BfArM must be located in a dedicated directory (e.g. bfarm). The following structure is an example with two packages (ICD-10-GM, OPS).

bfarm
|_ packages
|  |_ bfarm.terminologien.icd10gm-2025.0.0.tar.gz
|  |  |_ package/CodeSystem-icd10gm-agelow-2025.json
|  |  |_ package/CodeSystem-icd10gm-agereject-2025.json
|  |  |_ package/package.json
|  |  |_ ...
|  |_ bfarm.terminologien.ops-2025.0.0.tar.gz
|  |  |_ ...
|_ fhir-packs.jsonc

Specification File (SF)

For BfArM terminologies, the following specification file is used.

{
    "terminologyDesignator": "fhir-package",
    "canonicalUrlRegex": "<regex>", // optional
    "packageRegex": "<regex>"
}

  • terminologyDesignator
    • has to be set to fhir-package
  • canonicalUrlRegex
    • this filter loads only terminologies whose canonical URL (https://hl7.org/fhir/R4/datatypes.html#canonical) matches <regex>. E.g., the regex .*(agerejec|agelow).* loads only the terminologies CodeSystem-icd10gm-agereject-2025.json and CodeSystem-icd10gm-agelow-2025.json, because their canonical URLs are
      • https://terminologien.bfarm.de/fhir/CodeSystem/icd10gm-agereject|2025 and
      • https://terminologien.bfarm.de/fhir/CodeSystem/icd10gm-agelow|2025, respectively.
  • packageRegex
    • this filter loads only packages whose canonical package name matches <regex>. The canonical package name has the form <name>|<version>, where name and version are the properties in the package definition file
      • bfarm/packages/bfarm.terminologien.icd10gm-2025.0.0.tar.gz/package/package.json (see section Packages above).
      • E.g., the regex .*(icd10gm\\|2025|ops\\|2025).* (as written in the JSON file, where \\ stands for a single backslash) loads only the ICD and OPS packages.
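
The effect of the two filters can be tried out locally. The Python sketch below parses a specification, builds canonical package names of the form `<name>|<version>`, and applies both regular expressions. Several things here are assumptions: that the filters are applied as whole-string matches, that the `name` and `version` values in `package.json` look like the archive names suggest, and the `example.other` package and `example.org` URL, which are purely hypothetical non-matching entries.

```python
import json
import re


def canonical_package_name(name: str, version: str) -> str:
    """Build the '<name>|<version>' form described above."""
    return f"{name}|{version}"


def matches(pattern: str, value: str) -> bool:
    """Whole-string match; an assumption about how the filters are applied."""
    return re.fullmatch(pattern, value) is not None


# In the JSON file a literal '|' is written as '\\|'; after parsing, the regex
# engine receives '\|'.
spec = json.loads(r'''{
    "terminologyDesignator": "fhir-package",
    "canonicalUrlRegex": ".*(agerejec|agelow).*",
    "packageRegex": ".*(icd10gm\\|2025|ops\\|2025).*"
}''')

# (name, version) pairs as they might appear in package/package.json (assumed values).
packages = [
    ("bfarm.terminologien.icd10gm", "2025.0.0"),
    ("bfarm.terminologien.ops", "2025.0.0"),
    ("example.other", "1.0.0"),  # hypothetical package that should be skipped
]
selected = [
    name for name, version in packages
    if matches(spec["packageRegex"], canonical_package_name(name, version))
]

urls = [
    "https://terminologien.bfarm.de/fhir/CodeSystem/icd10gm-agereject|2025",
    "https://terminologien.bfarm.de/fhir/CodeSystem/icd10gm-agelow|2025",
    "https://example.org/fhir/CodeSystem/other|2025",  # hypothetical, filtered out
]
loaded = [url for url in urls if matches(spec["canonicalUrlRegex"], url)]
```

With these inputs, `selected` contains the ICD and OPS package names and `loaded` contains the two BfArM URLs. Because both patterns start and end with `.*`, a substring search would select the same entries.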

Example

{
    "terminologyDesignator": "fhir-package",
    "canonicalUrlRegex": ".*(agerejec|exotic|einmalk).*",
    "packageRegex": ".*(icd10gm\\|2025|ops\\|2025).*"
}

In this example, only the age-rejection and exotic terminologies of the ICD package and the one-time codes of the OPS package are loaded.

REST interface

GET <HOST>/WebCts2LE/service/crud/bulk/load-std-terminology

Query Parameters

  • directory: directory path (LD)
  • loadSpec: path to specification file SF (relative to LD)

Example

GET <HOST>/WebCts2LE/service/crud/bulk/load-std-terminology?directory=/etc/webcts2le/inst/terminologies/loinc&loadSpec=load-spec-2.80.json

Note

Afterwards, the suggester (used by the navigator) has to be updated:

GET <HOST>/WebCts2LE/service/manage/index/suggester/update
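
The load request and the subsequent suggester update can be scripted. The following Python sketch builds both URLs from the paths given above and issues the two GET requests in order. The host value, the function names, and the assumption that the load request returns only after the load has finished are not taken from this page and should be checked against the actual deployment.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SERVICE = "/WebCts2LE/service"


def load_url(host: str, directory: str, load_spec: str) -> str:
    """URL that starts the load of the terminology in <directory> (LD)."""
    # safe="/" keeps the slashes of the directory path readable, as in the example.
    query = urlencode({"directory": directory, "loadSpec": load_spec}, safe="/")
    return f"{host}{SERVICE}/crud/bulk/load-std-terminology?{query}"


def suggester_update_url(host: str) -> str:
    """URL that refreshes the suggester used by the navigator."""
    return f"{host}{SERVICE}/manage/index/suggester/update"


def load_and_refresh(host: str, directory: str, load_spec: str) -> None:
    """Run both GET requests in order; urlopen raises on HTTP errors."""
    for url in (load_url(host, directory, load_spec), suggester_update_url(host)):
        with urlopen(url) as response:  # blocks until the server answers
            print(url, response.status)
```

For example, `load_and_refresh("http://localhost:8080", "/etc/webcts2le/inst/terminologies/loinc", "load-spec-2.80.json")` would issue the LOINC request shown in the example, where `http://localhost:8080` is a placeholder host.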

Notes

Removal

Because of the RDF store, removing a terminology takes nearly as long as loading it. It is therefore recommended that the terminology is not already present in the store before a load is started.

Loading Time

Because of the RDF quad store technology used, the loading time (i.e. 'weaving' the RDF-triple knowledge graph from the flat files) on a notebook (~2 GHz throttled, 16 GB RAM) is approximately:

  • Snomed: ~90 min
  • Loinc: ~45 min
  • MeSH: ~15 min

In a Docker or Kubernetes context, it is recommended to assign generous CPU/RAM resources to the fuseki container to decrease loading time.

It is a known restriction of quad stores that loading time is high compared to other NoSQL data stores. On the other hand, a quad store offers elaborate functionality based on semantic web technologies. Future versions of CTS2-LE could utilize NoSQL data stores.

Database Space

Quad stores usually occupy a lot of disk space because every quad is indexed. E.g. Snomed has ~18,000,000 quads, and the quad store fuseki can occupy ~50 GB after loading. However, fuseki offers a compaction of the database to ~8 GB with the call

curl --request POST --url 'http://localhost:3030/$/compact/cts2le?deleteOld=true'
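
For automation, the same compaction call can be issued from Python. The sketch below builds the POST request with the standard library; the URL and dataset name are taken from the curl call above, while the function names are hypothetical and the response is only reduced to its HTTP status code.

```python
from urllib.request import Request, urlopen


def compaction_request(fuseki: str, dataset: str, delete_old: bool = True) -> Request:
    """Build the POST request for the /$/compact/<dataset> endpoint."""
    flag = "true" if delete_old else "false"
    return Request(f"{fuseki}/$/compact/{dataset}?deleteOld={flag}", data=b"", method="POST")


def compact(fuseki: str = "http://localhost:3030", dataset: str = "cts2le") -> int:
    """Send the compaction request and return the HTTP status code."""
    with urlopen(compaction_request(fuseki, dataset)) as response:
        return response.status
```

Calling `compact()` with the defaults corresponds to the curl command; a different host or dataset name can be passed when fuseki is not reachable on localhost.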