Client Background
A Big Data Platform that centralizes data from different sources, including payers, providers, and clinicians.
Client Need
Acquired more than 20 companies over the last decade, leaving multiple products with the same use case running in siloed environments and data replicated in multiple locations
Required centralized data storage for data consumption and analysis
Needed to accelerate business growth by building new products and insights and enabling AI and Machine Learning capabilities
Solution
Designed and developed a Data Lake Platform using AWS S3 and Apache Spark to ingest and process millions of transactions across data types such as Claims, Payments, Eligibility, Clinical, and Imaging
Developed a Pipeline Development Kit for creating data pipelines easily, so tenants can be onboarded onto the platform quickly
Created an Orchestration Development Kit using Apache Airflow for scheduling AWS EMR data pipelines
Developed generic data pipelines that extract and store data so end users can search and retrieve their respective healthcare transactions using Elasticsearch
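The case study does not describe the Pipeline Development Kit's interface, so the following is only a minimal sketch of how a declarative onboarding helper of that kind might look. The names `PipelineSpec` and `onboard_tenant`, the bucket name, the step names, and the `raw/` and `curated/` path layout are all hypothetical; only the data types come from the list above.

```python
from dataclasses import dataclass

# Data types named in the case study. Everything else in this sketch is hypothetical.
DATA_TYPES = ("claims", "payments", "eligibility", "clinical", "imaging")


@dataclass(frozen=True)
class PipelineSpec:
    """Declarative description of one tenant pipeline (hypothetical shape)."""
    tenant: str
    data_type: str
    source_path: str
    target_path: str
    steps: tuple = ("ingest", "validate", "transform", "index")


def onboard_tenant(tenant, data_types, bucket="example-data-lake"):
    """Build one pipeline spec per data type for a new tenant.

    The bucket name and the raw/curated prefixes are assumptions made
    for illustration, not details taken from the platform.
    """
    specs = []
    for data_type in data_types:
        if data_type not in DATA_TYPES:
            raise ValueError(f"unsupported data type: {data_type}")
        specs.append(
            PipelineSpec(
                tenant=tenant,
                data_type=data_type,
                source_path=f"s3://{bucket}/raw/{tenant}/{data_type}/",
                target_path=f"s3://{bucket}/curated/{tenant}/{data_type}/",
            )
        )
    return specs


if __name__ == "__main__":
    for spec in onboard_tenant("acme-health", ["claims", "payments"]):
        print(spec.data_type, spec.source_path, "->", spec.target_path)
```

Keeping the spec as plain data is one possible design choice: the same description could then be handed to a Spark job or to an orchestration layer without either one needing tenant-specific code.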
Realized Benefits
Diverse and Ubiquitous Data amounting to 4 petabytes of cross-enterprise Financial, Operational, and Clinical data
Cost Savings of $400K annually by using S3 Intelligent-Tiering storage options and creating object lifecycle rules
Build Once, Use Many Times Framework – Operational Efficiency, Rapid Development
Faster On-Boarding – Reduced time to market, New Growth Opportunities
Foundation for Integrated Products, Linked Cross Functional Data and enablement of Machine Learning and AI
Authoritative source of Large Data Sets
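The storage savings above rely on S3 storage tiering and object lifecycle rules. As an illustration only, the sketch below builds a lifecycle configuration in the dictionary shape accepted by the boto3 call `put_bucket_lifecycle_configuration`. The prefix, the day thresholds, the rule ID format, and the helper names are assumptions, not the rules used on the platform.

```python
def build_lifecycle_configuration(prefix, tiering_after_days=30, expire_after_days=None):
    """Return a lifecycle configuration dict for one key prefix.

    The thresholds are placeholder assumptions. The source does not state
    which rules or retention periods were actually applied.
    """
    if expire_after_days is not None and expire_after_days <= tiering_after_days:
        # Sanity check in this sketch: expiry should come after the transition.
        raise ValueError("expire_after_days must be greater than tiering_after_days")

    rule = {
        "ID": "tiering-" + prefix.strip("/").replace("/", "-"),
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Move objects under the prefix to the Intelligent-Tiering storage class.
        "Transitions": [
            {"Days": tiering_after_days, "StorageClass": "INTELLIGENT_TIERING"}
        ],
    }
    if expire_after_days is not None:
        rule["Expiration"] = {"Days": expire_after_days}
    return {"Rules": [rule]}


def apply_lifecycle(bucket, configuration):
    """Apply the configuration to a bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without AWS access

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )


if __name__ == "__main__":
    # Hypothetical prefix and thresholds, printed rather than applied.
    print(build_lifecycle_configuration("raw/claims/", 30, 2555))
```

Separating the builder from the call that applies it lets the rules be reviewed or tested before they touch a real bucket.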
Tools & Technologies
AWS
Amazon S3 Bucket
Spark
Apache Airflow
Kafka
Elasticsearch
PostgreSQL
DevOps
GitLab
HashiCorp Terraform
Docker