Pipeline 1

Welcome to My Portfolio

I'm Ehsan Soltanmohammadi, a Software Data Engineer with a passion for designing and implementing scalable big data pipelines. I specialize in optimizing ETL processes with Kubernetes, PySpark, and Apache Airflow on cloud platforms such as AWS and Azure. I also integrate machine learning into this work, using scikit-learn, PyTorch, and TensorFlow to build predictive models and improve data-driven insights in complex domains.

Sample ETL Project Explanation

My portfolio showcases an end-to-end ETL project in which I developed a pipeline for real estate rental data. In the ingestion layer, AWS Lambda functions scheduled by Amazon EventBridge periodically extracted data from the Zillow API. This data, which covers rent prices for San Francisco (SF), San Diego (SD), and New York (NY), was first stored in Amazon Redshift. A second Lambda function then ran whenever new data arrived, cleaning and structuring it before loading it into a PostgreSQL database. Finally, I built an interactive dashboard with Django that provides dynamic visualizations of rental trends across the three cities.
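
As a rough illustration of the cleaning step described above, here is a minimal Python sketch. It is not the project's actual code: the record fields (`city`, `month`, `rent`), the `RentRow` type, the `transform` function, and the event shape passed to `lambda_handler` are all hypothetical, chosen only to show how a transformation Lambda might validate, normalize, and de-duplicate rent records before they are loaded into PostgreSQL. The Redshift read and the PostgreSQL write are deliberately left out.

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import Any, Iterable

# City codes named in the project description; the record layout is assumed.
CITIES = {"SF": "San Francisco", "SD": "San Diego", "NY": "New York"}


@dataclass(frozen=True)
class RentRow:
    """One cleaned row, ready to load (hypothetical schema)."""
    city: str
    month: date
    rent_usd: float


def transform(raw_records: Iterable[dict[str, Any]]) -> list[RentRow]:
    """Drop invalid records, normalize fields, keep one row per (city, month)."""
    cleaned: dict[tuple[str, date], RentRow] = {}
    for rec in raw_records:
        city = str(rec.get("city", "")).strip().upper()
        if city not in CITIES:
            continue  # skip cities outside the three tracked markets
        try:
            # Normalize any day within a month to the first of that month.
            month = date.fromisoformat(str(rec.get("month"))).replace(day=1)
            rent = float(rec.get("rent"))
        except (TypeError, ValueError):
            continue  # skip unparseable dates or rent values
        if not math.isfinite(rent) or rent <= 0:
            continue  # skip NaN, infinite, zero, or negative rents
        # Later records for the same city and month overwrite earlier ones.
        cleaned[(city, month)] = RentRow(city, month, round(rent, 2))
    return sorted(cleaned.values(), key=lambda r: (r.city, r.month))


def lambda_handler(event, context):
    """Lambda-style entry point; the 'records' key in the event is an assumption."""
    rows = transform(event.get("records", []))
    return {
        "row_count": len(rows),
        "rows": [
            {"city": r.city, "month": r.month.isoformat(), "rent_usd": r.rent_usd}
            for r in rows
        ],
    }
```

Keeping `transform` as a pure function, separate from the handler, makes the cleaning rules easy to unit test without any AWS resources.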

Link to the Interactive Dashboard

Steps

View My Resume

Sample Data Pipelines

LinkedIn

Participate in My Research Survey

I am conducting a short survey as part of the data engineering research for my graduate studies. Your participation will provide valuable insight into current trends, challenges, and advancements in data engineering. The survey should take only a few minutes of your time.

Start the Survey

Total survey submissions so far: 5

Contact Me