Site Reliability Engineer at Ford Motor Company

Job Description

<div>
 We are the movers of the world and the makers of the future. We get up every day, roll up our sleeves and build a better world -- together. At Ford, we&rsquo;re all a part of something bigger than ourselves. Are you ready to change the way the world moves?
 <br/>
 <br/>
 <strong>
  Enterprise Technology
 </strong>
 plays a critical part in shaping the future of mobility. If you&rsquo;re looking for the chance to leverage advanced technology to redefine the transportation landscape, enhance the customer experience and improve people&rsquo;s lives, this is the opportunity for you. Join us and challenge your IT expertise and analytical skills to help create vehicles that are as smart as you are.
 <br/>
 <br/>
 The Monitoring as a Service (MaaS) Team is building and evolving their services with customers in mind. MaaS will enable teams to modernize and disrupt by providing robust monitoring tools powered by AI and easy-to-use dashboards. Monitoring increases transparency of applications' performance end-to-end, regardless of hosting location (on-prem or in the cloud), which means a better view into how we can proactively manage our apps and improve performance.
 <br/>
 <br/>
 <strong>
  SOUTHEAST MI RESIDENTS:
 </strong>
 Please note: this job is posted as remote. If the selected candidate lives within 50 miles of Dearborn, MI, it may require a hybrid onsite schedule of up to 60% of the time.
 <br/>
 <br/>
 <strong>
  In this position...
  <br/>
  <br/>
 </strong>
 We are seeking an experienced Site Reliability Engineer (SRE) to join our team and lead the development, enhancement, and extension of our global monitoring and observability platform. As an SRE, you will combine software engineering and systems engineering disciplines to ensure that software systems are available, scalable, and maintainable. You will play a pivotal role in meeting the evolving needs of our customers, including the development of Service Level Indicators and Objectives (SLI/SLO), best practices with associated templates, and automation to remove toil and facilitate adoption.
 <br/>
 <br/>
 <strong>
  Responsibilities
  <br/>
  <br/>
 </strong>
 <strong>
  What you'll do...
  <br/>
  <br/>
 </strong>
 <ul>
  <li>
   Apply a strong background in software development and systems administration, along with excellent problem-solving, troubleshooting, and communication skills.
  </li>
  <li>
   Leverage experience to safely perform destructive testing to seek and discover vulnerabilities.
  </li>
  <li>
   Architect, design and develop automation to improve resilience, recoverability, availability, and scalability of supported applications.
  </li>
  <li>
   Recognize, validate, and evangelize emerging technologies and architectures that align with business objectives.
  </li>
  <li>
   Develop tooling to improve reliability, quality, and time-to-market for software solutions.
  </li>
  <li>
   Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
  </li>
  <li>
   Identify and reduce or eliminate toil via automation to maximize the time spent on engineering and innovation.
  </li>
  <li>
   Collaborate with development teams to design, build, and operate scalable and resilient software systems using Cloud native principles.
  </li>
  <li>
   Proactively identify stability risks and work with engineering leadership to establish appropriate mitigation plans.
  </li>
  <li>
   Regularly review key technical metrics such as transaction errors, logging, response times, caching strategies, conversion/bounce rates, capacity, and resource utilization.
  </li>
  <li>
   Establish error budgets by identifying the right SLOs and SLIs, and drive their use effectively to maximize availability and uptime.
  </li>
  <li>
   Conduct performance analysis and optimization of new and in-production systems
  </li>
  <li>
   Provide technical guidance and mentorship to other team members.
  </li>
  <li>
   Participate in incident response, support, recovery, and postmortem analysis.
   <br/>
   <br/>
   <br/>
  </li>
 </ul>
 <strong>
  Qualifications
  <br/>
  <br/>
 </strong>
 <strong>
  You'll have...
  <br/>
  <br/>
 </strong>
 <ul>
  <li>
   Bachelor&rsquo;s degree in Computer Science, Computer Engineering, Systems Engineering or related field or a combination of education and equivalent work experience
  </li>
  <li>
   5+ years of programming experience with one or more of these languages: Python, Go, Java/Scala, C or C++
  </li>
  <li>
   5+ years of experience solving complex architecture/design &amp; business problems, working to simplify, optimize, remove bottlenecks, etc.
  </li>
  <li>
   3+ years of experience as a Site Reliability Engineer with APM or other monitoring tools such as Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, DataDog
  </li>
  <li>
   3+ years of experience with J2EE, NoSQL/SQL Datastore, Spring Boot, GCP/AWS/Azure &amp; Docker/Kubernetes in developing multi-tier applications.
  </li>
  <li>
   3+ years of experience with Cloud (Google preferred)
  </li>
  <li>
   3+ years of experience with automated test-driven development in CI/CD Pipelines
  </li>
  <li>
   3+ years of experience with RESTful APIs and microservices platforms
   <br/>
   <br/>
   <br/>
  </li>
 </ul>
 <strong>
  Even better, you may have...
  <br/>
  <br/>
 </strong>
 <ul>
  <li>
   Master&rsquo;s Degree in Computer Science, Computer Engineering, Systems Engineering or related field
  </li>
  <li>
   Thorough understanding of software development and agile programming
  </li>
  <li>
   Understanding and ability to implement effective observability strategies to improve MTTD/R
  </li>
  <li>
   Working knowledge of the TCP/IP stack, internet routing and load balancing
   <br/>
   <br/>
   <br/>
  </li>
 </ul>
 You may not check every box, or your experience may look a little different from what we've outlined, but if you think you can bring value to Ford Motor Company, we encourage you to apply!
 <br/>
 <br/>
 As an established global company, we offer the benefit of choice. You can choose what your Ford future will look like: will your story span the globe, or keep you close to home? Will your career be a deep dive into what you love, or a series of new teams and new skills? Will you be a leader, a changemaker, a technical expert, a culture builder&hellip;or all of the above? No matter what you choose, we offer a work life that works for you, including:
 <br/>
 <br/>
 <ul>
  <li>
   Immediate medical, dental, and prescription drug coverage
  </li>
  <li>
   Flexible family care, parental leave, new parent ramp-up programs, subsidized back-up childcare and more
  </li>
  <li>
   Vehicle discount program for employees and family members, and management leases
  </li>
  <li>
   Tuition assistance
  </li>
  <li>
   Established and active employee resource groups
  </li>
  <li>
   Paid time off for individual and team community service
  </li>
  <li>
   A generous schedule of paid holidays, including the week between Christmas and New Year&rsquo;s Day
  </li>
  <li>
   Paid time off and the option to purchase additional vacation time.
   <br/>
   <br/>
   <br/>
  </li>
 </ul>
 For a detailed look at our benefits, see:
 <br/>
 <br/>
 https://corporate.ford.com/content/dam/corporate/us/en-us/documents/careers/2024-benefits-and-comp-GSR-sal-plan-2.pdf
 <br/>
 <br/>
 This position spans salary grades
 <strong>
  6-8
 </strong>
 .
 <br/>
 <br/>
 Visa sponsorship is not available for this position.
 <br/>
 <br/>
 Candidates for positions with Ford Motor Company must be legally authorized to work in the United States. Verification of employment eligibility will be required at the time of hire.
 <br/>
 <br/>
 We are an Equal Opportunity Employer committed to a culturally diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, disability status or protected veteran status. In the United States, if you need a reasonable accommodation for the online application process due to a disability, please call 1-888-336-0660.
 <br/>
 <br/>
</div>

AI Powered Job Insights

Are you ready to dive into the future of mobility? Ford Motor Company is seeking a Site Reliability Engineer to join their Monitoring as a Service (MaaS) team! This role combines software and systems engineering to enhance their global monitoring and observability platform, driving innovation and reliability.

📍 Location: Remote; candidates living within 50 miles of Dearborn, MI may be required to work a hybrid onsite schedule (up to 60% of the time)
💼 Position: Site Reliability Engineer
⏰ Type: Full-time
📅 Date Posted: July 17, 2024

Role Summary:
- Develop and enhance monitoring tools powered by AI for better application transparency.
- Improve the resilience, recoverability, and scalability of applications through automation and innovative solutions.
- Collaborate with development teams to operate scalable software systems.

What You'll Do:
- Architect and design automation to boost application performance.
- Identify stability risks and work on mitigation plans.
- Conduct performance analysis, focusing on optimizing current systems.
- Measure and optimize system performance while defining Service Level Indicators (SLI) and Objectives (SLO).
- Mentor team members and participate in incident responses and postmortems.

What’s Needed:
- Bachelor's degree in a relevant field or equivalent experience.
- Minimum 5 years of programming experience in languages such as Python, Go, or Java.
- Extensive experience with monitoring tools and practices, particularly Dynatrace, New Relic, or similar.
- Proven expertise with cloud platforms (preferably Google Cloud), CI/CD pipelines, and microservices architecture.

If you have a passion for innovative technology and a drive to improve operational reliability, don't miss this opportunity at Ford! They value diversity and encourage all qualified applicants to apply.

Top Interview Questions

Q: Can you explain your approach to implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) in a monitoring system?
A: My approach to implementing SLIs and SLOs starts with understanding the business needs and user expectations. I collaborate with stakeholders to identify key metrics that define success, such as uptime and response times. Then, I establish SLIs that align with these metrics, ensuring they are measurable and meaningful. Afterward, I set realistic SLOs based on historical performance data and team capabilities. Finally, I ensure these are continuously monitored using tools like Prometheus or DataDog, and regularly review them to adapt to changing requirements.

Q: Describe a situation where you had to automate a repetitive process. What tools did you use, and what was the outcome?
A: In my previous role, I identified a repetitive deployment process that required manual intervention, which created bottlenecks. I automated this process using Jenkins for CI/CD and Docker for containerization. By scripting the deployment pipeline and integrating automated testing, we reduced deployment times from several hours to just minutes, significantly increasing team efficiency and allowing more frequent releases.

Q: How do you ensure system performance and availability in a cloud-native environment?
A: To ensure performance and availability, I employ a combination of proactive monitoring, automated scaling, and regular load testing. I utilize tools like Grafana and ELK Stack for real-time monitoring of application performance and infrastructure health. Implementing auto-scaling groups in AWS or GCP allows resources to adjust according to demand. Additionally, I conduct regular load tests using tools like JMeter to identify performance bottlenecks before they impact users.

Q: What strategies do you use to reduce 'toil' in your SRE practices?
A: To reduce 'toil', I prioritize automating manual tasks, creating self-service tools for development teams, and implementing infrastructure as code (IaC) practices with Terraform or CloudFormation. By continuously reviewing processes to identify repetitive tasks, I can create scripts and workflows that automate these efforts. I also encourage a culture where team members document their knowledge and automation techniques, aiding future efficiencies and innovations.

Q: Can you detail an incident where you identified a stability risk and the steps you took to mitigate it?
A: In a past project, I noticed that an application was experiencing increased latency during peak hours, which posed a stability risk. I conducted a performance analysis to identify the root cause, which was related to inefficient database queries. I worked with the development team to optimize these queries and implement caching strategies using Redis. Additionally, I established new SLIs to closely monitor performance metrics. After these changes, we saw a 50% reduction in response times, significantly enhancing system reliability.
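
When preparing for the SLI/SLO question, it can help to have the error-budget arithmetic at hand. The following is a minimal sketch, assuming a hypothetical 99.9% availability SLO over a 30-day window; the function names and the numbers are illustrative only and are not taken from the posting or from any particular monitoring tool.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent, from request counts (the SLI inputs)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_failures = total_events * (1.0 - slo_target)
    actual_failures = total_events - good_events
    return 1.0 - actual_failures / allowed_failures


if __name__ == "__main__":
    # Hypothetical figures: 99.9% SLO, 1,000,000 requests, 500 of them failed.
    print(round(error_budget_minutes(0.999), 1))                 # about 43.2 minutes per 30 days
    print(round(budget_remaining(0.999, 999_500, 1_000_000), 2))  # about half the budget left
```

A negative result from `budget_remaining` would mean the budget is overspent, which is the usual signal to prioritize reliability work over new releases.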

Salary Benefits

Salary details not provided

Want to apply directly?

Apply for the Site Reliability Engineer position at Ford Motor Company at https://www.linkedin.com/jobs/view/3978681319

Similar Jobs found by InJob.AI

Recent Graduate - Site Reliability Software Engineer

PayPal, San Jose, CA

Software Engineer II

Kforce Inc, Orlando, FL

Site Reliability Engineer L5 - Live Streaming Pipeline

Netflix, Los Gatos, CA

Senior Site Reliability Engineer

LoopNet, Irvine, CA

100% Remote Site Reliability Engineer

Summit Human Capital

Site Reliability Engineer - Infra and DevOps

Western Digital, San Jose, CA

Site Reliability Engineer (Remote)

Together AI, San Francisco, CA

Site Reliability Engineer

Material Bank®

Frequently Asked Questions

Still have a question? Check out our FAQ section below.

Why should I use InJob?

InJob searches for the best jobs based on your profile and automatically generates customized cover letters for you, saving hours of job-hunting time.

What do you mean by “InJob creates my profile”?

InJob creates your profile by having a conversation with you to learn about your skills and requirements. It also scans your resume to gather information about your experience, skills, and achievements. This information is used to build your profile, which is then used to match you with jobs and to generate a personalized cover letter for each job opportunity.

Where does InJob search for job opportunities?

InJob searches for job opportunities across a wide range of sources, including LinkedIn, Indeed, and hundreds of other job boards, to find hidden gems. Its search is not limited to these, so it covers as many potential job listings as possible. It also searches the career pages of individual companies that suit your target industry and location, so that you can apply there.

How often does InJob search for new job listings?

InJob is constantly active, scanning for fresh job opportunities every minute. This helps you be among the first to apply to new job listings that align with your profile.

How does InJob match me with job listings?

InJob plays matchmaker by comparing your profile and resume with job listings. Each job receives a score from 1 to 10, indicating how well you match with it.

Does InJob apply for jobs on my behalf?

Yes, in an upcoming update, and this will be the main differentiator. InJob will apply for jobs on your behalf. It will target top matches and craft custom cover letters for each job, ensuring your application stands out. InJob will also handle the application process, including visiting company websites and filling out forms.

How can I track my job applications with InJob?

In an upcoming update, InJob will provide an interactive dashboard that serves as mission control for your job search. It will display all the jobs InJob has applied to for you and their current status. You will also be able to track which companies have shown interest in your profile and view the feedback they provided.

Can I see feedback on the job applications?

Yes, in an upcoming feature. InJob will collect all feedback, both positive and constructive, and present it to you. This will let you know exactly where you stand in the job market and provide insights on how to improve your skills.