Carlos Alejandro Perez Ceron

Senior Site Reliability Engineer (SRE) | Reliability, Cost & Cloud Governance
Bogota

Summary

Senior Site Reliability Engineer with 11 years of experience spanning IT operations, cloud engineering, and reliability leadership in enterprise and cloud-native environments. Specialized in designing reliable, scalable systems while controlling cloud costs through FinOps-aligned practices. Strong background in incident management, observability, Kubernetes platforms, automation, and cloud financial optimization across AWS, Azure, and GCP. Proven ability to reduce outages, stabilize platforms, and prevent cost overruns by integrating reliability engineering with cost governance.

Overview

11 years of professional experience

Work History

Head of IT & AI

WPP
04.2023 - Current
  • Defined reliability and operational governance strategies across multi-cloud environments, balancing availability, performance, and cost efficiency.
  • Led incident management processes and operational reviews, aligning system reliability with business impact and financial risk.
  • Integrated FinOps principles into reliability planning, ensuring uptime improvements did not introduce uncontrolled cloud spend.
  • Established observability and reporting frameworks to provide leadership visibility into system health, reliability trends, and cost drivers.
  • Coordinated cross-functional teams to align reliability goals, security controls, and budget constraints.

Information Technology Operations Lead

Teleperformance
03.2022 - 03.2023
  • Owned platform reliability and availability for critical IT services while managing cloud cost optimization initiatives.
  • Implemented operational governance practices aligned with SRE principles, including incident response, root cause analysis, and service stability reviews.
  • Balanced reliability requirements with budget constraints, preventing over-provisioning and unnecessary cloud spend.
  • Led infrastructure automation initiatives using Terraform and Ansible to improve consistency, reduce incidents, and optimize resource usage.
  • Collaborated with vendors and internal teams to improve service reliability and cost predictability.

Senior Operations Systems Engineer

Linio Group
07.2021 - 03.2022
  • Supported highly available e-commerce platforms running on Kubernetes and cloud-native architectures.
  • Contributed to reliability improvements through monitoring, incident response, and performance optimization initiatives.
  • Worked with platform teams to align Kubernetes scaling, autoscaling, and resource allocation with cost and reliability objectives.
  • Supported cloud cost optimization initiatives by identifying inefficiencies impacting both uptime and spend.

Operations Support System Engineer

Telefónica
11.2020 - 07.2021
  • Automated monitoring and operational workflows to reduce incident frequency and improve system stability.
  • Supported large-scale infrastructure environments with a focus on reliability, observability, and operational efficiency.
  • Applied anomaly detection concepts to identify abnormal system behavior affecting performance and cost.
  • Participated in governance and compliance initiatives related to operational risk and service continuity.

Senior Systems Analyst

Cencosud S.A.
06.2019 - 10.2020
  • Designed and supported scalable cloud solutions with emphasis on availability, disaster recovery, and business continuity.
  • Contributed to reliability and capacity planning activities aligned with operational risk management.
  • Automated infrastructure workflows to reduce manual errors and improve platform stability.
  • Supported performance optimization initiatives across AWS and Azure environments.

Systems Analyst

Banco De Occidente
10.2018 - 06.2019
  • Supported critical banking systems with a focus on uptime, security, and operational reliability.
  • Participated in risk and incident analysis with emphasis on minimizing service disruption.
  • Supported cloud adoption initiatives with reliability and continuity considerations.

Earlier Career FinOps & Systems Analyst

Colombian National Army · Rappi · Mercadería Justo
10.2014 - 10.2018
  • Built strong foundations in incident management, monitoring, infrastructure operations, and troubleshooting.
  • Supported mission-critical systems requiring high availability and rapid incident resolution.

Education

Linux Essentials

Cisco
United States

NDG Linux Unhatched

Cisco
United States

Systems Engineer

Corp Unificada Nacional De Educación Superior
Bogota
05.2001 -

FinOps Certified Engineer

FinOps Foundation
United States
05.2001 -

Cloud Computing

Google
United States

Microsoft Certified: Azure

Microsoft
United States

Site Reliability Engineering: Measuring And Manage

Google
United States

IBM Systems And Solutions Architect

IBM
United States

Skills

  • Cloud Cost Optimization (AWS, Azure, GCP)
  • FinOps Practices & Governance
  • Budget Forecasting & Variance Analysis
  • Cost Allocation & Tagging Strategies
  • Reserved Instances & Savings Plans
  • Cloud Cost Anomaly Detection
  • Multi-Cloud Cost Management
  • Unit Economics & Cost Modeling
  • Executive Cost Reporting
  • Stakeholder Management (Finance & Engineering)
  • Cloud Governance & Guardrails
  • Automation for Cost Visibility (Python)
  • Cloud Financial Risk Management
  • IT Investment Strategy
