  • Cloud Strategy to Minimize Processing Time

    Implemented an automated solution for resource configuration, deployment, and scheduling


    • Needed consultation on evaluating tools and approaches for cloud adoption. The objective was to offload computing from the existing, outdated on-premises MapR cluster to the cloud.
    • Needed a solution custom-built for their live data (the largest module) to support evaluation and decision-making.
    • Needed an automated solution for resource configuration, deployment, scheduling, scalability, etc.
    • Needed the ability to process incoming incremental data (10 TB or more) more efficiently.


    • Provided a cloud-optimized solution that spins up on demand for computation offloading, along with a Snowflake-based reporting solution.
    • Each week, 5 TB or more of data is extracted from the on-premises MapR cluster and placed in S3 using shell scripts and the AWS CLI, executed by Airflow jobs.
    • Based on the size of the copied data, an AWS EMR cluster is spun up using CloudFormation templates and the AWS CLI to execute Spark and Pig scripts.
    • The resulting data from EMR processing is pushed to S3 buckets for persistence.
    • The AWS EMR cluster has auto-scaling enabled and is terminated once processing completes.
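
The upload, size-based spin-up, and teardown steps could be sketched roughly as follows. This is a minimal illustration, not the project's actual scripts: the sizing rule (one core node per 500 GB, bounded between 2 and 40), the `CoreInstanceCount` template parameter, and all function names are hypothetical. Only the AWS CLI subcommands themselves (`aws s3 cp`, `aws cloudformation deploy`, `aws cloudformation delete-stack`) are standard.

```python
import math
from typing import List

# Hypothetical sizing rule: one EMR core node per 500 GB of copied data,
# kept within assumed lower and upper bounds.
GB_PER_CORE_NODE = 500
MIN_CORE_NODES = 2
MAX_CORE_NODES = 40


def core_nodes_for(data_size_gb: float) -> int:
    """Pick an initial core-node count from the size of the copied data."""
    wanted = math.ceil(data_size_gb / GB_PER_CORE_NODE)
    return max(MIN_CORE_NODES, min(MAX_CORE_NODES, wanted))


def upload_command(local_dir: str, bucket: str, prefix: str) -> List[str]:
    """AWS CLI arguments that copy an extracted directory into S3."""
    return ["aws", "s3", "cp", local_dir, f"s3://{bucket}/{prefix}", "--recursive"]


def deploy_command(stack_name: str, template_file: str, data_size_gb: float) -> List[str]:
    """AWS CLI arguments that create the EMR stack from a CloudFormation template.

    'CoreInstanceCount' is an assumed parameter name; a real template
    defines its own parameters.
    """
    return [
        "aws", "cloudformation", "deploy",
        "--stack-name", stack_name,
        "--template-file", template_file,
        "--parameter-overrides",
        f"CoreInstanceCount={core_nodes_for(data_size_gb)}",
    ]


def teardown_command(stack_name: str) -> List[str]:
    """AWS CLI arguments that delete the stack once processing is finished."""
    return ["aws", "cloudformation", "delete-stack", "--stack-name", stack_name]
```

Under these assumed thresholds, a 5 TB (5000 GB) weekly extract maps to 10 core nodes. Each argument list could be passed to `subprocess.run` or joined into a shell command run by a scheduled Airflow task. Auto-scaling would then adjust the cluster from this starting size.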

    Tools & Technologies

    Amazon S3, Apache Pig, Apache Spark, AWS CloudFormation, Amazon EMR, MapR, Apache Airflow, Python, R, PowerShell, Snowflake, Bash

    Key benefits

    • Provided a cost-efficient, on-demand solution for computation on the AWS platform.
    • Added value by recommending the best-suited resource types and configurations for a cost-efficient, optimal solution.
    • Offloaded jobs that would need 48 hours on the on-premises server to the cloud and processed them within 24 hours.