CTS2-LE Supported Terminologies: Difference between revisions

Revision as of 18 January 2022, 20:30

Loading Standard Terminologies

Due to the license policies of standard terminology providers, we do not distribute provider input files. Customers have to download these files from the provider sites.

To load a standard terminology, the customer has to copy its input files to a dedicated directory (called LD in the following), together with a JSON specification file (SF). In the context of Docker, Kubernetes, etc., a dedicated volume should be used.

Currently, the following constraints apply:

  • the cts2le instance
    • should not already contain the terminology to be loaded
    • cannot be used during the load (its services are not accessible)

Specification File (SF)

{
    "terminologyDesignator": "<regex: 'icd-alpha|ops|hl7Fhir|mesh|ucum|loinc|snomed'>",
    // usual version string
    "version": "<string>",
    // group name (used in the navigator)
    "groupName": "<string>",
    // unique resource id in context of a cts2le instance
    "resourceId": "<string>",
    // input file paths relative to the given directory <LD>
    "files": [
        "<string>"
        // , ...
    ]
}
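
As an illustration of the template above, the following is a minimal Python sketch that assembles a specification file and checks the designator against the listed values. The helper names build_load_spec and write_load_spec are hypothetical and not part of CTS2-LE. The sketch writes plain JSON without comments, since this page does not state whether the loader accepts the // comments shown in the template.

```python
import json
import re
from pathlib import Path

# Designators listed in the specification template on this page.
DESIGNATOR_PATTERN = re.compile(r"icd-alpha|ops|hl7Fhir|mesh|ucum|loinc|snomed")


def build_load_spec(designator, files, version=None, group_name=None, resource_id=None):
    """Return a load specification as a dict (hypothetical helper)."""
    if not DESIGNATOR_PATTERN.fullmatch(designator):
        raise ValueError(f"unsupported terminologyDesignator: {designator!r}")
    if not files:
        raise ValueError("at least one input file is required")

    spec = {"terminologyDesignator": designator}
    if designator != "hl7Fhir":
        # For hl7Fhir these properties are defined by the standard itself,
        # so they are only written for the other designators.
        optional = {"version": version, "groupName": group_name, "resourceId": resource_id}
        spec.update({key: value for key, value in optional.items() if value is not None})
    # File paths are relative to the load directory LD; their order matters
    # for some terminologies (e.g. icd-alpha, loinc).
    spec["files"] = list(files)
    return spec


def write_load_spec(spec, load_dir, file_name):
    """Write the spec as plain JSON (no comments) into the load directory LD."""
    path = Path(load_dir) / file_name
    path.write_text(json.dumps(spec, indent=4), encoding="utf-8")
    return path
```

For example, build_load_spec("ucum", ["ucum.tsv"], version="2017", group_name="UCUM", resource_id="Ucum-2017") yields the UCUM specification listed under Supported Terminologies.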

!!! For the designator hl7Fhir, the version, groupName and resourceId properties are defined by the standard itself; setting them in the specification file has no effect.

REST interface

GET <HOST>/WebCts2LE/service/crud/bulk/load-std-terminology

Query Parameters

  • directory: directory path (LD)
  • loadSpec: path to specification file SF (relative to LD)

Note

After loading, the suggester (used by the navigator) has to be updated:

GET <HOST>/WebCts2LE/service/manage/index/suggester/update

Example

GET <HOST>/WebCts2LE/service/crud/bulk/load-std-terminology?directory=terminologies/loinc&loadSpec=load-spec-2.71.json
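
The two calls (load, then suggester update) could be scripted roughly as follows. This is a sketch using only the Python standard library. The function names are hypothetical, and it assumes that the load request returns only after loading has finished, which this page does not state. If the load runs asynchronously, the suggester update would have to be delayed accordingly.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint paths as given on this page.
LOAD_PATH = "/WebCts2LE/service/crud/bulk/load-std-terminology"
SUGGESTER_UPDATE_PATH = "/WebCts2LE/service/manage/index/suggester/update"


def load_url(host, directory, load_spec):
    """Build the load URL; 'directory' is LD, 'load_spec' is relative to LD."""
    # Keep '/' unescaped so the URL looks like the example on this page.
    query = urlencode({"directory": directory, "loadSpec": load_spec}, safe="/")
    return f"{host.rstrip('/')}{LOAD_PATH}?{query}"


def suggester_update_url(host):
    """Build the URL that updates the suggester used by the navigator."""
    return f"{host.rstrip('/')}{SUGGESTER_UPDATE_PATH}"


def load_and_update(host, directory, load_spec, timeout=None):
    """Send the load request, then the suggester update (both GET).

    Assumption: each call blocks until the server has finished. Loading can
    take a long time, so no timeout is set by default. urlopen raises
    urllib.error.HTTPError for 4xx/5xx responses.
    """
    for url in (load_url(host, directory, load_spec), suggester_update_url(host)):
        with urlopen(url, timeout=timeout) as response:
            response.read()
```

With a hypothetical host http://localhost:8080, load_url(host, "terminologies/loinc", "load-spec-2.71.json") produces the same request as the example above.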

Supported Terminologies

{
    "terminologyDesignator": "icd-alpha",
    "version": "2021",
    "groupName": "ICD",
    "resourceId": "Icd-2021",
    "files": [
        // file 1, 2 must be the icd xml and the alphaid txt, respectively
        "icd10gm2021syst_claml_20200918_20201111.xml",
        "icd10gm2021_alphaid_edvtxt_20201002.txt"
    ]
}
{
    "terminologyDesignator": "ops",
    "version": "2021",
    "groupName": "OPS",
    "resourceId": "Ops-2021",
    "files": [
        "ops2021syst_claml_20201016.xml"
    ]
}
{
    "terminologyDesignator": "ucum",
    "version": "2017",
    "groupName": "UCUM",
    "resourceId": "Ucum-2017",
    "files": [
        "ucum.tsv"
    ]
}
{
    "terminologyDesignator": "loinc",
    "version": "2.71",
    "groupName": "LOINC",
    "resourceId": "Loinc-2.71",
    "files": [
        // file 1, 2 must be the hierarchy csv and the table csv, respectively
        // usually file 1 is at 'AccessoryFiles/MultiAxialHierarchy/' and file 2 at
        // 'LoincTable/'
        // in the providers zip file, e.g. 'loinc271.zip'
        "MultiAxialHierarchy.csv",
        "Loinc.csv"
    ]
}
{
    "terminologyDesignator": "hl7Fhir",
    // version, groupName, and resourceId are generated automatically!
    "files": [
        "valuesets.xml",
        "v3-codesystems.xml"
    ]
}
{
    "terminologyDesignator": "snomed",
    "version": "2021-07",
    "groupName": "SNOMED",
    "resourceId": "Snomed-2021-07",
    // following files usually are located at 'Snapshot/Terminology' within the snomed zip file
    "files": [
        "sct2_Description_Snapshot-en_INT_20210731.txt",
        "sct2_Concept_Snapshot_INT_20210731.txt",
        "sct2_Relationship_Snapshot_INT_20210731.txt",
        "sct2_StatedRelationship_Snapshot_INT_20210731.txt",
        "sct2_TextDefinition_Snapshot-en_INT_20210731.txt"
    ]
}

Notes

Removal

Removing a terminology takes nearly as long as loading it. It is therefore recommended to load a terminology only into a store that does not already contain it.

Loading Time

Because of the RDF quad store technology in use, the loading time (i.e. 'weaving' the RDF-triple knowledge graph from the flat files) on a notebook (~2 GHz throttled, 16 GB RAM) is:

  • Snomed: ~90 min
  • Loinc: ~45 min
  • icd/alphaid: ~20 min
  • MeSH: ~15 min
  • FHIR: ~25 min (~1500 single CS/VS files)
  • ops: ~8 min
  • ucum: <1 min

In the context of Docker, Kubernetes, etc., it is recommended to assign generous CPU/RAM resources to the fuseki container to decrease the loading time.

It is a known restriction of quad stores that loading time is high compared to other NoSQL data stores. On the other hand, a quad store offers elaborate functionality based on semantic web technologies. Future versions of CTS2-LE could utilize NoSQL data stores.

Database Space

Quad stores usually occupy a large amount of disk space because every quad is indexed. For example, Snomed has ~18,000,000 quads, and the quad store fuseki occupies ~50 GB directly after loading. However, fuseki can compact the database to ~8 GB with the call

java -cp fuseki-server.jar tdb2.tdbcompact [--help] --loc=<DB>

where DB is the disk location of the database; in a Docker or Kubernetes context this is the path /etc/fuseki/apache-jena-fuseki-3.8.0/run/databases/cts2le (for details see https://jena.apache.org/documentation/tdb2/tdb2_cmds.html, keyword tdb2.tdbcompact).

!!! Note that the fuseki container must not be running during compaction.
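
If compaction is to be run from a script, the command above can be wrapped as in the following sketch. The function names and the default jar location are assumptions for illustration. The script does not check whether fuseki is stopped, so that has to be ensured beforehand.

```python
import subprocess


def tdbcompact_command(db_location, fuseki_jar="fuseki-server.jar"):
    """Build the tdb2.tdbcompact call shown above as an argument list."""
    return ["java", "-cp", fuseki_jar, "tdb2.tdbcompact", f"--loc={db_location}"]


def compact(db_location, fuseki_jar="fuseki-server.jar"):
    """Run the compaction; raises CalledProcessError if java exits non-zero.

    The fuseki server must not be running while this executes.
    """
    subprocess.run(tdbcompact_command(db_location, fuseki_jar), check=True)
```

In a container setup, db_location would be the database path given above.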

A future version of CTS2-LE will provide compaction via a REST call.