● Available for opportunities

Luthfi Raditya Meza

Data Engineer

Architecting scalable data platforms & lakehouse solutions in the cloud

3+ Years Experience

AWS Certified

15+ Projects

About

Data Engineer with 3+ years of experience specializing in cloud data engineering, lakehouse architectures, and enterprise-scale ETL/ELT pipelines.

Passionate about transforming complex data challenges into scalable, governed solutions that drive analytics and AI innovation.

Technologies

Python AWS Snowflake Databricks Airflow PostgreSQL Iceberg Docker

Current

Data Engineer

Insignia

Apr 2025 - Present • Jakarta, Indonesia

Leading the end-to-end modernization of a legacy big data ecosystem to an AWS cloud platform. Architected a data lakehouse using Apache Iceberg, engineered event-driven ETL/ELT pipelines, and established a comprehensive data governance framework with Lake Formation.

Key Achievements

Spearheaded the end-to-end modernization of a legacy big data ecosystem to a scalable AWS cloud platform, enhancing data accessibility for enterprise-wide analytics and significantly reducing operational overhead
Led multiple projects for the client's Unified Data Platform, including data standardization, metadata centralization, cross-domain ingestion pipelines, and a unified schema framework for analytics and AI teams
Architected and deployed a data lakehouse solution using Apache Iceberg on S3, establishing a transactional, governed data foundation that supports both business intelligence and advanced analytics
Created a modern, scalable data lake specifically designed to support network anomaly detection, including structured log storage, standardized schemas, and optimized access patterns for machine learning workloads
Engineered robust, event-driven ETL/ELT pipelines using AWS Glue, Lambda, and Airflow to ingest and process complex data from diverse sources, including SQL, NoSQL, HDFS, and real-time network feeds
Delivered a governed, medallion-architecture data repository that serves as the single source of truth for business intelligence (QuickSight), ad-hoc querying (Athena), and critical data science initiatives, including network anomaly detection and AI-powered chatbots
Established a robust data governance and operations framework using AWS Lake Formation and CloudWatch, delivering a centralized data catalog, data lineage, and fine-grained access controls for all enterprise data assets
Engineered a comprehensive ingestion framework using AWS Glue, DMS, and Kinesis to unify batch, streaming, and change-data-capture (CDC) from diverse enterprise sources, including Oracle, SQL Server, and SFTP
Established a mature CI/CD pipeline with GitHub Actions for automated Glue script deployments and implemented rigorous database performance tuning, which improved development velocity and reduced data processing times

AWS Glue S3 Athena Lake Formation Lambda Airflow Iceberg Python GitHub Actions Snowflake Databricks

ETL/ELT Lakehouse Architecture CDC Data Governance Orchestration Cloud Engineering

2023-2025

Data Engineer

NTX Solusi Teknologi

Dec 2023 - Mar 2025 • Jakarta, Indonesia

Architected ETL/CDC pipelines with MinIO and Prefect orchestration. Improved search relevance by 35% with vector search, launched a production LLM solution that reduced report creation time by 90%, and implemented data catalog solutions that reduced retrieval times by 50%.

Key Achievements

Architected and optimized ETL/CDC pipelines and data lake implementation using MinIO, while orchestrating workflows with Prefect to ensure smooth and scalable data processes
Enhanced data reliability by integrating validation tools like Great Expectations, detecting and resolving 95% of data anomalies before ingestion
Designed and maintained normalized and denormalized data models in PostgreSQL and MongoDB to support analytical dashboards and machine learning workflows
Implemented a scalable data lake solution, ensuring efficient storage and retrieval of structured and unstructured data
Improved search relevance by 35% by developing and deploying a high-performance vector search system using HNSW
Achieved 95% accuracy in Named Entity Recognition (NER) models for automated report generation through MLOps practices, including experiment tracking, model registry, and monitoring
Launched a production-grade Large Language Model (LLM) solution that reduced client report creation time by 90%, enhancing operational efficiency
Boosted system reliability by implementing CI/CD pipelines and introducing monitoring solutions via Prometheus and Grafana, achieving 90% uptime
Established standardized data engineering workflows and documentation, enhancing team productivity by 40% and cutting project delays by 25%
Implemented data catalog solutions such as Amundsen and DataHub to enhance data discovery, reducing data retrieval times by 50% and increasing data compliance by 35%
Fostered a culture of continuous innovation by experimenting with emerging technologies, leading to a 20% annual improvement in operational efficiency and a 60% rise in tech adoption

Prefect PostgreSQL MongoDB MinIO Docker FastAPI MLflow Airbyte Amundsen DataHub DSPy LangChain Great Expectations lakeFS Redis Elasticsearch

ETL Change Data Capture Data Orchestration Data Warehousing Data Governance AI MLOps LLM

AI Engineer

PERGA

Mar 2023 - Jun 2023 • Hamburg, Germany (Remote)

Developed NER and document classification solutions using LayoutLM, achieving 92% accuracy. Configured Label Studio for data labeling, labeled 200+ documents, and spearheaded the creation of 3 main AI features for the application.

Key Achievements

Developed solutions for Named Entity Recognition (NER) and document classification based on the concepts from the LayoutLM paper, achieving 92% accuracy at the first checkpoint
Devised and implemented a rule-based approach for Named Entity Recognition, enhancing the precision and efficiency of entity recognition tasks
Researched document-based AI models for 2 of the application's AI models, generating insights that contributed to the company's advancements in this domain
Configured Label Studio to streamline data labeling processes and labeled 200+ company documents
Spearheaded the creation of 3 main AI features for the application, significantly elevating its overall functionality
Collaborated with universities on the preparation of review papers and engaged in 2 model feature discussions with them, fostering productive partnerships and knowledge exchange in model development

Python PyTorch Label Studio LayoutLM

Named Entity Recognition NLP Document Classification Data Labeling

Internship

Data Science & Analytics Intern

SG-EDTS

Jan 2022 - Apr 2022 • Jakarta, Indonesia

Completed 4 data science & analytics courses with hands-on experience. Designed an e-commerce recommendation system using SVD++, achieving an MSE of 0.844 and an MAE of 0.384. Constructed 3 business performance dashboards using Tableau.

Key Achievements

Completed 4 data science & analytics courses and gained hands-on experience in production environments
Researched state-of-the-art machine learning algorithms from research papers and implemented them in practical applications
Designed, developed, and implemented a recommendation system for an e-commerce client using the SVD++ model, achieving an MSE of 0.844 and an MAE of 0.384
Constructed and fine-tuned 3 business performance dashboards with scope analysis using Tableau, providing actionable insights for stakeholders

Python Tableau BigQuery Excel MLflow

Recommendation Systems Machine Learning Data Visualization EDA

Education

Information Systems

Universitas Airlangga

GPA 3.53/4.00

2019 - 2023

Achievement

AWS Certified Data Engineer

Amazon Web Services

Valid until Aug 2028

Connect

Let's Work Together

Interested in collaborating or discussing data engineering opportunities?

Get in Touch