Raw data to meaningful metrics

Integrate data from multiple sources using modern ETL methodologies while maximizing throughput and preserving data integrity. Automate time-consuming manual reporting with built-in analytical tools that monitor key data points throughout the process.

Experience

Millions of records per day, provided by multiple partners, are normalized and run through the client’s business logic. The cleansed data then feeds various public-facing services and revenue streams.
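As a rough illustration of the normalize-then-apply-rules step, here is a minimal Python sketch. The partner names, field mappings, and the pricing rule are all hypothetical, chosen only to show the shape of the step; they are not the production schema or the client's actual logic.

```python
from typing import Optional

# Hypothetical per-partner mappings: partner field name -> common field name.
FIELD_MAPS = {
    "partner_a": {"ID": "id", "Price": "price", "Name": "name"},
    "partner_b": {"sku": "id", "cost": "price", "title": "name"},
}


def normalize(partner: str, raw: dict) -> dict:
    """Rename partner-specific fields and coerce values to common types."""
    mapping = FIELD_MAPS[partner]
    record = {common: raw[source] for source, common in mapping.items()}
    record["id"] = str(record["id"]).strip()
    record["name"] = str(record["name"]).strip()
    record["price"] = round(float(record["price"]), 2)
    return record


def apply_business_logic(record: dict) -> Optional[dict]:
    """Hypothetical client rule: discard records without a positive price."""
    if record["price"] <= 0:
        return None
    return record


def cleanse(partner: str, raw_records: list) -> list:
    """Normalize each raw record, then keep only those that pass the rule."""
    cleansed = []
    for raw in raw_records:
        record = apply_business_logic(normalize(partner, raw))
        if record is not None:
            cleansed.append(record)
    return cleansed
```

Keeping the per-partner differences in a mapping table means a new partner can be added without touching the normalization or rule code.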

CI/CD pipelines automatically build Docker images and deploy them onto physical hardware running Mesosphere DC/OS.

The applications running inside the Docker containers use parallel processing to increase record throughput and handle millions of messages passed via RabbitMQ, which controls the data flow.
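The worker-pool idea can be sketched in a few lines of Python. This is an assumption-laden illustration, not the deployed code: message bodies are plain in-memory byte strings standing in for RabbitMQ deliveries (no broker client is used), `handle_message` is a hypothetical handler, and a thread pool is used for simplicity. For CPU-bound work, `ProcessPoolExecutor` from the same module would be the closer match to true parallel processing.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def handle_message(body: bytes) -> dict:
    """Hypothetical handler: decode one JSON message body and mark it processed."""
    record = json.loads(body)
    record["processed"] = True
    return record


def process_batch(bodies: list, max_workers: int = 4) -> list:
    """Fan a batch of message bodies out to a pool of workers.

    pool.map returns results in the same order as the input bodies.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(handle_message, bodies))
```

A batch could then be processed with, for example, `process_batch([b'{"id": 1}', b'{"id": 2}'])`.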

Cleansed data is compiled and stored in MongoDB staging databases, where analytics are computed and collected before being displayed in internal dashboards for key stakeholders and teams. Finally, the data is uploaded to AWS to meet various business requirements.

Technologies