SMART SENSING insights
How much data does a digital pathologist actually generate?

Size Matters: How Much Data Does a Digital Pathologist Actually Generate?

It’s 2025, and a growing number of pathologists are already convinced that going digital is worth it. (If you’re still on the fence, read our Ultimate Guide: 10 Things You Should Know about the Transformational Impact of Digital Pathology.) One important factor to consider is the cost of going digital, and of staying digital. In digital and computational pathology, cost is directly tied to data: namely, the expenses associated with digitizing and storing that data. That is why the question “How much data does a digital pathologist actually generate?” comes up naturally.

Because examining the structure and composition of tissue requires great detail, Whole Slide Images (WSIs) reach enormous file sizes, even when compressed. The good news is that, thanks to many pathology labs and clinics transitioning to digital in recent years, a wide variety of implementation reports is available. Most of them also address the challenges of IT infrastructure, storage solutions for large amounts of data, and associated costs. So, here’s an overview of the most important numbers, learnings, and outcomes.

First things first: Let’s talk numbers.

Figure: Chart showing how much data is generated in digital pathology (© Fraunhofer IIS)

Four essential takeaways

1. What determines storage requirements?

Data storage requirements in digital pathology depend on:

  • Size of tissue section (small vs. large biopsy specimens)
  • Scanning resolution
  • Image modality
  • Image quality settings
  • Number of slides per day (average)
  • How long digital data should be retained/archived
  • Scanning magnification
  • Multiple plane scanning
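
How these factors combine can be sketched with a back-of-the-envelope estimate. The functions below are a minimal sketch, not a vendor formula: they assume 24-bit RGB pixels and decimal units, and the tissue size (15 mm x 15 mm), the resolution (0.25 µm per pixel), the 20:1 compression ratio, and the lab throughput in the example are illustrative assumptions, not figures from the reports cited in this article.

```python
def estimate_wsi_size_gb(width_mm, height_mm, microns_per_pixel,
                         focal_planes=1, compression_ratio=1.0,
                         bytes_per_pixel=3):
    """Rough size of one whole slide image in GB (1 GB = 10**9 bytes).

    All parameters are hypothetical inputs for illustration.
    """
    if microns_per_pixel <= 0 or compression_ratio <= 0:
        raise ValueError("resolution and compression ratio must be positive")
    # Scanned area converted from millimetres to pixels.
    width_px = width_mm * 1000 / microns_per_pixel
    height_px = height_mm * 1000 / microns_per_pixel
    # Each focal plane stores a full image of the scanned area.
    raw_bytes = width_px * height_px * bytes_per_pixel * focal_planes
    return raw_bytes / compression_ratio / 1e9


def estimate_archive_tb(slides_per_day, scan_days_per_year,
                        retention_years, avg_slide_gb):
    """Rough archive size in TB for a given throughput and retention period."""
    slides = slides_per_day * scan_days_per_year * retention_years
    return slides * avg_slide_gb / 1000


if __name__ == "__main__":
    raw = estimate_wsi_size_gb(15, 15, 0.25)
    compressed = estimate_wsi_size_gb(15, 15, 0.25, compression_ratio=20)
    print(f"uncompressed: {raw:.1f} GB, compressed: {compressed:.2f} GB")
    # Hypothetical lab: 1,000 slides per day, 250 scan days, 1 year retained.
    print(f"archive: {estimate_archive_tb(1000, 250, 1, compressed):.0f} TB")
```

Under these assumptions, a single slide comes to 10.8 GB uncompressed and 0.54 GB compressed. Halving the pixel size quadruples the pixel count, which is why scanning resolution and magnification weigh so heavily in the list above.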

2. What are the available storage options?

Tier-System

A frequently used approach is a multi-tier system. In many cases, this means on-site hardware in a server room, with tiers of different storage capacity. A mix of online storage and local servers is also possible.
Granada University Hospitals, for example, use a Tier 1 with 20 TB of online storage capacity for immediate, fast access to image data sent from the scanners to the servers. Within 12 hours, WSIs are moved to Tier 2 (360 TB), which holds images from approximately the previous 12 months. Older images go to Tier 3, a nearline tape storage with 1 PB capacity (Retamero et al., 2019) [2].
UMC Utrecht employs a two-tier system for storage: Tier 1 uses an EMC VNX5200 for local VM cluster storage and short-term storage of digital slides, while Tier 2 is an EMC Isilon for hospital-wide image archiving (Stathonikos et al., 2019) [1].
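
The age-based tiering described for Granada can be expressed as a simple rule. The sketch below is only an illustration of that idea, not the hospital's actual software: the function name is hypothetical, and "approximately 12 months" is simplified to 365 days.

```python
from datetime import datetime, timedelta

# Thresholds taken from the Granada description; the exact cut-offs
# used in practice are an assumption.
TIER1_MAX_AGE = timedelta(hours=12)   # fast online storage
TIER2_MAX_AGE = timedelta(days=365)   # roughly the previous 12 months


def storage_tier(scanned_at, now):
    """Return the tier (1, 2 or 3) in which a slide of this age would live."""
    age = now - scanned_at
    if age < timedelta(0):
        raise ValueError("scan time lies in the future")
    if age <= TIER1_MAX_AGE:
        return 1
    if age <= TIER2_MAX_AGE:
        return 2
    return 3  # nearline tape archive


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 12, 0)
    for age in (timedelta(hours=1), timedelta(days=30), timedelta(days=400)):
        print(age, "-> tier", storage_tier(now - age, now))
```

A scheduled job could apply such a rule to each slide and move files whose tier has changed. Real deployments usually delegate this to the storage system's own lifecycle policies.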

The cloud

Another long-term storage option, mentioned mainly in the context of future directions, is cloud services. Web-based solutions are portable and accessible from any device with an internet connection, and they scale easily to individual storage needs. In addition, there are no hardware maintenance costs. At the moment, however, cloud storage can be costly, especially for large data volumes, and many providers charge subscription fees. Most institutions in the experience reports therefore preferred on-site storage, also for reasons of security compliance and cost efficiency. Another factor is that it may later prove very difficult to migrate data to another cloud (“vendor lock-in”). Cloud storage will hopefully become cheaper in the future if vendors lower egress costs and offer discounts for high-volume storage.

Important consideration for the U.S.: The Health Insurance Portability and Accountability Act (HIPAA) requires that backup and disaster recovery plans be in place for all medical images. Ensuring redundancy is therefore crucial for effective disaster recovery, regardless of the storage method selected. That is worth considering no matter where you are.

3. So, what about the costs?

Costs depend on various factors, including storage capacity. But let’s look at it from a different angle: Can we save costs by going digital? Hanna et al. (2019) [3] conducted a cost savings analysis for a high-volume academic cancer center during the implementation of digital pathology from 2014 to 2018.
They reviewed the required personnel, hardware, software, service agreements, IT infrastructure, digital storage, glass slide physical asset storage, and off-site storage vendor services.
Total projected savings were $267,000 per year, equating to $1.3 million over five years (2019-2023). Savings stemmed from staff restructuring, a reduced need for suppliers because fewer glass slides had to be transported, and decreased physical storage requirements. The operational break-even point was estimated to be in Q1 2021, about seven years after implementation began.

4. Ways to keep WSI storage affordable

Naturally, a frequent question is how to keep storage costs manageable for large data volumes. Here are the most common solutions:

  • Use multi-tier storage: current cases on hard disks (fast access) and archived data on slower, cheaper tapes.
  • Store only a portion of image data (one or a few slides per sample).
  • Use a rolling archive, where scans are deleted after some time. Interesting or valuable scans can be tagged (“archive”) to prevent deletion. Typical cycles are 6 weeks to 3 months.
  • Savings in full-time employees: At Memorial Sloan Kettering Cancer Center, digital pathology led to a 93% reduction in physical slide requests, allowing three employees to be reassigned to digital operations. (Ardon et al., 2023) [4]
  • Store and view WSIs in compressed formats. However, even after compressing scans to 500 MB, Lujan et al. (2021) [5] estimate that a non-redundant storage system for a lab with a throughput of 250,000 slides annually would still cost around $90,000 per year.
  • Research indicates that pathologists typically do not zoom in to the highest level. Variable resolution images (VRIs), smaller than original WSIs, are often sufficient for diagnosis. Ongoing studies aim to clarify resolution needs among pathologists, which could significantly reduce data storage requirements for high-resolution images.
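
Two of the points above can be made concrete with a short sketch: the rolling archive with an “archive” tag, and the data volume behind the Lujan et al. example. The code is a minimal illustration under stated assumptions. The `Scan` class, its field names, and the six-week default are hypothetical, and the volume calculation uses decimal units (1 TB = 1,000,000 MB).

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Scan:
    slide_id: str
    scanned_on: date
    keep: bool = False  # True if the scan was tagged "archive"


def select_for_deletion(scans, today, retention=timedelta(weeks=6)):
    """Return scans older than the retention period that are not tagged."""
    return [s for s in scans
            if not s.keep and today - s.scanned_on > retention]


def annual_volume_tb(slides_per_year, avg_size_mb):
    """Yearly data volume in TB for a given throughput and file size."""
    return slides_per_year * avg_size_mb / 1_000_000


if __name__ == "__main__":
    today = date(2025, 6, 1)
    scans = [
        Scan("A", date(2025, 1, 1)),             # old, untagged
        Scan("B", date(2025, 1, 1), keep=True),  # old, but tagged "archive"
        Scan("C", date(2025, 5, 25)),            # recent
    ]
    print([s.slide_id for s in select_for_deletion(scans, today)])
    print(annual_volume_tb(250_000, 500), "TB per year")
```

With these inputs, only scan A is selected for deletion, and 250,000 slides at 500 MB each amount to 125 TB of new data per year. That helps explain why compression alone does not make storage cheap.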

So, does size matter?

With a better picture of data storage in digital pathology, one might conclude that size does indeed matter. In a digital pathology workflow, the question of storage should not be neglected. Fortunately, we’ve moved far beyond the days when you needed an extra bag for your bulky 250-megabyte hard disk. Today, we have access to cloud services and pocket-sized disks offering multiple terabytes of storage. While digitization and storage do come at a cost, the studies above demonstrate that going digital can lead to savings in areas connected to traditional workflows and, ultimately, to a reduction in overall expenses over time.


References

[1] Stathonikos, N., Nguyen, T. Q., Spoto, C. P., Verdaasdonk, M. A. M. & Van Diest, P. J. (2019). Being fully digital: perspective of a Dutch academic pathology laboratory. Histopathology, 75(5), 621–635. https://doi.org/10.1111/his.13953

[2] Retamero, J. A., Aneiros-Fernandez, J. & Del Moral, R. G. (2019). Complete Digital Pathology for Routine Histopathology Diagnosis in a Multicenter Hospital Network. Archives of Pathology & Laboratory Medicine, 144(2), 221–228. https://doi.org/10.5858/arpa.2018-0541-oa

[3] Hanna, M. G., Reuter, V. E., Samboy, J., England, C., Corsale, L., Fine, S. W., Agaram, N. P., Stamelos, E., Yagi, Y., Hameed, M., Klimstra, D. S. & Sirintrapun, S. J. (2019). Implementation of Digital Pathology Offers Clinical and Operational Increase in Efficiency and Cost Savings. Archives of Pathology & Laboratory Medicine, 143(12), 1545–1555. https://doi.org/10.5858/arpa.2018-0514-OA

[4] Ardon, O., Klein, E., Manzo, A., Corsale, L., England, C., Mazzella, A., Geneslaw, L., Philip, J., Ntiamoah, P., Wright, J., Sirintrapun, S. J., Lin, O., Elenitoba-Johnson, K., Reuter, V. E., Hameed, M. R. & Hanna, M. G. (2023). Digital pathology operations at a tertiary cancer center: Infrastructure requirements and operational cost. Journal of Pathology Informatics, 14, 100318. https://doi.org/10.1016/j.jpi.2023.100318

[5] Lujan, G., Quigley, J. C., Hartman, D., Parwani, A., Roehmholdt, B., Van Meter, B., Ardon, O., Hanna, M. G., Kelly, D., Sowards, C., Montalto, M., Bui, M., Zarella, M. D., LaRosa, V., Slootweg, G., Retamero, J. A., Lloyd, M. C., Madory, J. & Bowman, D. (2021). Dissecting the Business Case for Adoption and Implementation of Digital Pathology: A White Paper from the Digital Pathology Association. Journal of Pathology Informatics, 12(1), 17. https://doi.org/10.4103/jpi.jpi_67_20


Image copyright (featured image): Adobe Stock – stock.adobe.com

Anna Chiwona

Anna is a working student at Fraunhofer IIS. She holds a bachelor's degree in media management and has experience in copy writing and social media. She contributes texts with a focus on affective computing and digital health.
