Apply for the Site Reliability Engineer position at Material Bank®.

Job Description
<div> <em> Material Bank is a fast-paced, high-growth technology company that created <strong> the world's largest material marketplace for the Architecture and Design industry </strong> , providing the fastest and most powerful way to start and manage a design project. Learn more about us at www.materialbank.com or see below. <br/> <br/> </em> -- <br/> <br/> Site Reliability Engineers (SREs) are responsible for keeping all user-facing services and other Material Bank production systems running reliably and efficiently. <br/> <br/> SREs at Material Bank specialize in systems (operating systems, storage subsystems, networking) and implement best practices for availability, reliability, and scalability, with interests in algorithms and distributed systems. <br/> <br/> <strong> What you’ll do: <br/> <br/> </strong> <ul> <li> Join an on-call (OpsGenie) rotation to respond to incidents that impact Material Bank’s availability, and support service engineers with customer incidents. </li> <li> Prevent incidents from recurring. </li> <li> Run our infrastructure with Terraform, GitHub CI/CD, Kubernetes, and ECS. </li> <li> Build monitoring that alerts on symptoms rather than on outages using NewRelic and Prometheus. </li> <li> Document EVERYTHING so your findings turn into run-books/SOPs and then into automation. </li> <li> Improve operational processes (such as deployments and upgrades) to make them as uneventful as possible. </li> <li> Design, build, and maintain core infrastructure that enables Material Bank to scale to thousands of concurrent users. </li> <li> Debug production issues across services and levels of the stack. </li> <li> Plan the growth of Material Bank infrastructure. <br/> <br/> <br/> </li> </ul> <strong> What you’ll bring: <br/> <br/> </strong> <ul> <li> Think about systems: edge cases, failure modes, behaviors, specific implementations. </li> <li> Know your way around Linux. 
</li> <li> Have strong programming skills: Shell and Python. </li> <li> Collaborate and communicate asynchronously. </li> <li> Document all the things to inform and mentor others. </li> <li> Have a bias for action. </li> <li> Deliver quickly and effectively, and iterate fast. </li> <li> Have experience with Nginx, container technologies, Kubernetes, Terraform, Kafka, or similar technologies. <br/> <br/> <br/> </li> </ul> <strong> Projects you can work on </strong> : <br/> <br/> <ul> <li> Coding infrastructure automation with Terraform and common CI/CD tools </li> <li> Improving our monitoring and building new metrics </li> <li> Helping release managers deploy and fix new versions of Material Bank in all geographies </li> <li> Planning, preparing for, and executing the provisioning of new infrastructure for our future expansions </li> <li> Developing relationships with our product and business teams to define their SLAs, iterate on those SLAs, and improve their reliability </li> <li> Defining SLOs and error budgets <br/> <br/> <br/> </li> </ul> <strong> <em> What you’ll get from us: <br/> <br/> </em> </strong> <ul> <li> Our people: If you thrive in an inclusive, innovative, and fast-paced organization, look no further! You will get to work alongside some of the brightest minds. Join a genuinely fun and supportive workplace where we keep our employees consistently engaged through internal communication and corporate events. </li> <li> Relaxation and Celebrations: Generous PTO, Sick Days, Paid National Holidays, and even more (ask us about this when we connect). </li> <li> Health Benefits: We contribute to your medical, dental, vision, and short-term/long-term disability plans and have a strong employee assistance program. </li> <li> Plan for your Retirement: You are eligible for the 401(k) after your first 90 days of employment! </li> <li> Giving Back: We sponsor multiple events throughout the year to help out our communities. You will receive time off to give back as well. 
</li> <li> Growth: We’ll help you take your career to the next level. We want you to be creative and take initiative, which will allow you to grow and create within the company. Most importantly, be the best at what matters! </li> <li> Flexible Work Schedules: With business units and employees across the globe, Material Technologies has embraced a hybrid working model that lets department leaders decide on the best approach for their respective teams, whether that be remote, in person, or a little of both. <br/> <br/> <br/> </li> </ul> <strong> About Material Bank <br/> <br/> </strong> Material Bank is the world’s largest material marketplace for the architecture and design industry, providing the fastest and most powerful way to search and sample materials. Material Bank connects design professionals to hundreds of manufacturers by facilitating brand discovery, rep engagement, and material sampling. <br/> <br/> Material Bank has transformed the way an entire industry discovers and samples materials. By removing the friction that exists in the process, we drive business between architects and designers (members) and our Brand Partners (clients). <br/> <br/> Our powerful material database and proprietary robotic distribution facility allow members to order samples until midnight (ET) to be delivered free of charge anywhere in the US, in one box, by 10:30 AM the next morning. <br/> <br/> Connect with us and discover your career at Material Bank. <br/> <br/> -- <br/> <br/> Material Bank is proud to be an equal opportunity employer. We value diversity, and all applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, age, national origin, veteran or disability status or other status protected under any applicable federal, state or local law. </div>
AI Powered Job Insights
Exciting opportunity for a Site Reliability Engineer at Material Bank, a leader in the architecture and design industry! They are focused on ensuring high availability of their user-facing services and production systems while leveraging cutting-edge technology.

📍 Location: Remote/Hybrid
💼 Position: Site Reliability Engineer
⏰ Type: Full-time
📅 Date Posted: 2024-07-18

Role Summary:
- Responsible for maintaining reliability and efficiency of production systems.
- Specializes in systems including operating systems, storage, and networking.

What You'll Do:
- Participate in an on-call rotation to manage incidents impacting availability.
- Prevent recurring incidents and enhance system reliability.
- Utilize tools like Terraform, GitHub CI/CD, Kubernetes, and ECS for infrastructure management.
- Develop monitoring systems that focus on symptoms instead of outages.
- Document processes to create run-books, SOPs, and automation.
- Debug production issues and plan infrastructure growth.

What's Needed:
- Strong knowledge of Linux and programming (Shell, Python).
- Experience with technologies such as Nginx, Container technologies, Kubernetes, Terraform, or Kafka.
- Ability to collaborate asynchronously and mentor others through documentation.
- Focus on delivering quickly and iterating effectively.

Material Bank offers a supportive work culture with flexible schedules, generous benefits including PTO, health coverage, 401(k) plans, and opportunities for community involvement and personal growth. This role is perfect for someone eager to take on challenges in a fast-paced environment and contribute to a transformative service in the design industry.
Top Interview Questions
A: To ensure system reliability, I implement a robust monitoring solution that alerts me to issues before they escalate into outages. I use metrics to understand traffic patterns and peak loads, allowing for proactive scaling of resources. Additionally, I regularly conduct incident response drills and post-mortems to learn from past incidents and refine our operational processes.
A: I have extensive experience using Terraform to automate the deployment and management of cloud infrastructure. For example, I designed a full stack application deployment, which included setting up VPCs, subnets, EC2 instances, and RDS databases. By using Terraform's modular approach, I ensured that our infrastructure could be versioned and easily replicated across different environments, improving consistency.
A: In incident management, I follow a structured approach: first, I assess the severity and impact, then take immediate action to restore service. Post-incident, I conduct a thorough investigation to determine root causes, documenting all findings. I then create run-books and Standard Operating Procedures (SOPs) to ensure similar issues can be prevented in the future, and I prioritize communication with affected stakeholders throughout the process.
A: To handle scaling challenges, I first analyze the system's bottlenecks using performance metrics from tools like Prometheus and NewRelic. I then implement horizontal scaling techniques, ensuring that load balancers efficiently distribute traffic. Additionally, I routinely review our applications for inefficiencies and optimize them to handle increased loads, while ensuring that distributed storage solutions are reliable and performant.
A: I believe in documenting processes thoroughly and clearly to support knowledge sharing within the team. I start by outlining each step in the process, then expand upon each step with detailed instructions, examples, and links to relevant resources. I also use collaborative tools to gather feedback from colleagues, ensuring that the documentation is user-friendly and up-to-date. This practice not only helps in onboarding new team members but also improves overall operational efficiency.

Salary Benefits
Salary details not provided

Want to apply directly?
Apply for the Site Reliability Engineer position at Material Bank® using https://www.linkedin.com/jobs/view/3979378301


