
Venkata Bora

Dallas

Summary

Extensive experience in data analytics and ETL migration projects to Google Cloud Platform (GCP) using tools such as BigQuery, Cloud Dataproc, Cloud Storage, and Composer. Proficient in data modeling concepts (Star and Snowflake schemas), SQL (Presto, Hive), and programming with Python and PySpark. Skilled in building robust Airflow data pipelines using Bash scripting on Unix/Linux systems and in developing Python packages for ETL processes. Hands-on experience with Sqoop for transferring data between RDBMS, HDFS, and Hive, and with file formats such as Avro, ORC, and Parquet. Expertise in Spark SQL and PySpark for data transformations and in Spark Streaming for real-time processing. Strong skills in data preparation, modeling, and visualization using Power BI and Tableau to create impactful dashboards and reports. Experienced in all phases of the SDLC, including analysis, design, development, testing, and deployment.

Overview

12 years of professional experience
1 Certification

Work History

Data Engineer

Walmart
11.2023 - Current
  • Optimized data processing by implementing efficient ETL pipelines and streamlining database design
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability
  • Fine-tuned query performance and optimized database structures for faster, more accurate data retrieval and reporting
  • Designed scalable and maintainable data models to support business intelligence initiatives and reporting needs
  • Automated routine tasks using Python, PySpark, Airflow, and GCP scripts, increasing team productivity and reducing manual errors

Big Data Engineer

Revenue Analytics
07.2022 - 10.2023
  • Designed and implemented ETL pipelines using AWS services such as AWS Glue and Amazon EMR to process and transform large-scale data sets
  • Collaborated with cross-functional teams to define data transformation logic, data integration patterns, and schema evolution strategies
  • Orchestrated Spark jobs using Apache Airflow, creating a scalable and automated ETL workflow for timely data processing

Sr Data Engineer - Supply Chain

Walmart
02.2021 - 06.2022
  • Built data pipelines in Airflow on GCP for ETL jobs using a variety of Airflow operators
  • Worked extensively with GCP Dataproc, GCS, Cloud Functions, and BigQuery
  • Designed and coordinated with the Data Science team to implement advanced analytical models over large datasets in a Hadoop cluster
  • Designed and developed an ETL pipeline in Python to extract Shipping Expeditor's API data into the consumption layer
  • Created and scheduled data pipelines in Airflow to improve data reliability and quality while adhering to data governance policies
  • Led internal training sessions on GCP/BigQuery within the team

Data Engineer 3 - HR Data Lake

Walmart
11.2018 - 01.2021
  • As SME, designed and built pipelines to extract and process source files from Workday, storing the data in the HR data lake using Hive
  • Supported HR business units by providing ready-to-use data to data analysts and data scientists for business insights, machine learning, and related use cases
  • Developed an ETL pipeline in Python to extract data from Ranger and load it into the SQL Server consumption layer
  • Built ETL pipelines to extract data using Spark

Sr Big Data Consultant - Finance

Walmart
09.2017 - 10.2018
  • Designed ETL processes from different sources into HDFS/Hive/Teradata using the internal Aorta (Sqoop) framework
  • Built ETL pipelines to extract data from multiple sources/POS files into the Finance data lake and built a unified consumption layer for business analytics

Application Developer

Willis Towers Watson
03.2016 - 08.2017
  • Played an integral role in developing the Client Proposal Integrator application to validate data from multiple systems, and authored Spark SQL scripts based on functional specifications
  • Imported and exported data between HDFS and Hive using Sqoop
  • Created Hive tables and wrote Hive queries that invoked MapReduce jobs in the backend

Associate Software Engineer

Beta Monks Technologies
11.2012 - 12.2014
  • Contributed to the development of a Prepaid Card system for national banks by methodically creating database objects such as tables and stored procedures
  • Identified, examined, and resolved issues by modifying backend code as needed, and performed unit testing of core system and business functionality to ensure adherence to quality criteria

Education

Master of Science - Information Systems

University of Maryland Baltimore County
Baltimore, MD
12.2016

Bachelor of Science - Computer Science Engineering

Andhra University
03.2012

Skills

  • Big Data Technologies: Hadoop, HDFS, Kafka, Hive, Sqoop, Automic, YARN
  • Cloud Platforms & Services: GCP (Cloud Storage, BigQuery, Cloud Dataproc); AWS (Athena, Redshift, Batch, Step Functions, Glue Crawler, EC2, EMR)
  • Programming & Scripting: Python, SQL, Shell Scripting, Scala, PySpark, PL/SQL, Spark SQL
  • Databases: Teradata, MySQL, DB2, MS SQL Server

Certification

  • GCP Certified Professional Data Engineer, Google, 2022
  • Databricks Spark Certified, Databricks, 2021
