Site Reliability Engineer Intern
Talos Trading
New York, New York, United States
June 2024 - August 2024
- Enhanced system performance and cost efficiency by optimizing GCP hardware provisioning, saving $100K monthly
- Created an interactive web tool to visualize GCP infrastructure and market data, improving operational transparency
- Improved monitoring and issue resolution by integrating Datadog with Flask, saving 2 hours of troubleshooting daily
- Streamlined order reconciliation workflows using Cloud Composer and Terraform, uncovering a potential $1.5M annual cost
- Automated failover processes and deployment checks, reducing both downtime and deployment time
- Reduced configuration errors and improved code quality by adding YAML linting to GitHub Actions
- Conducted root cause analysis for cross-functional issues by tracing logs across platform components