MLOps Manager of Software Engineering at JP Morgan

Applications for this job have closed.

MLOps Manager of Software Engineering

Greater London

Full time

Posted 6 months ago

JP Morgan

Banking, investment & finance

10,001+ employees

485 jobs

The Aumni team at JPMorgan Chase is looking for a Software Engineering Manager to oversee a traditional SRE team and an MLOps team that manage our core application, model hosting, deployment, and monitoring infrastructure in AWS.

As a Software Engineering Manager within the Digital Private Markets department, you will help us manage multiple SRE teams with a joint focus on traditional web applications and AI/ML models. You will solve complex and broad business problems with clear communication, practical solutions, and stakeholder engagement. Through effective mentorship, management, and system design, you will serve as a key enablement pillar for our software engineering and data science teams.

You will apply your extensive leadership experience by sharing your knowledge of end-to-end operations, availability, reliability, and scalability in the AI/ML space. You will also mentor your engineers as they enable the downstream Data Science and ML Engineering teams to execute on our product roadmap. Empathy, organization, and communication are key to success in this role.

Job responsibilities

Manages multiple teams responsible for core infrastructure to support AI/ML and web application initiatives
Oversees automated continuous integration and continuous delivery pipelines for the Software Development and Data Science teams to host web applications and develop AI/ML models
Mentors traditional SREs and MLOps engineers
Sets standards for infrastructure, CI/CD, and observability architecture
Fosters technical discussions with developers, key stakeholders, and team members to resolve complex technical problems
Builds technical roadmaps in collaboration with senior leadership and identifies risks or design optimizations
Proactively resolves issues before they impact internal and external stakeholders of deployed models
Champions the adoption of traditional SRE and MLOps best practices within your teams

Required qualifications, capabilities, and skills

Formal training or certification on site reliability engineering concepts and/or 5+ years applied experience
2+ years of Engineering Manager or Tech Lead experience in the SRE or MLOps domain
Experience leading agile sprint ceremonies
Proven ability to lead, inspire, and manage a diverse team of software engineers
Strong mentoring and coaching skills
Excellent verbal and written communication skills, with the ability to effectively convey complex technical concepts to various audiences
Ability to work with a geographically distributed team across multiple timezones
Ability to manage multiple projects and priorities effectively
Ability to articulate the importance of monitoring and observability in the AI/ML space and to enforce their implementation and use across an organization
Domain knowledge of machine learning applications and technical processes within the AWS ecosystem
Expertise with Terraform, Kubernetes (or other container orchestration platforms), and CI/CD platforms such as Jenkins or GitHub Actions
Experience with event-driven, microservice-oriented architectures, specifically with AWS Lambda
Understanding of the different roles served by data engineers, data scientists, machine learning engineers, and system architects, and how MLOps contributes to each of these workstreams

Preferred qualifications, capabilities, and skills

Experience managing multiple teams with ambiguity and external dependencies
Comfortable with team management, fostering collaboration, promoting design patterns, and presenting technical concepts to non-technical audiences
Ability to break down large concepts and goals into smaller requirements and manage multiple competing priorities
Understanding of ML model training and deployment procedures and techniques
Experience with data engineering and CI/CD best practices
Familiarity with observability concepts and telemetry collection using tools such as Datadog, Grafana, Prometheus, Splunk, and others
Experience working with ML engineering platforms such as Databricks and SageMaker
Experience working with Data Engineering technologies such as Snowflake and Airflow
