AI Container Platform Engineer
Job Description
Intel is hiring a senior AI Container Platform Engineer to develop and optimize its AI infrastructure. The role involves collaborating with other team members to design and build a highly resilient Kubernetes container platform optimized for AI workloads on Intel accelerator hardware. Successful candidates will bring attention to detail and a strong desire to tackle the challenges of building a highly reliable and scalable container platform. A foundational understanding of software development principles and testing methodologies is a must, along with a high degree of independence, adaptability, drive, and willingness to accept new challenges. Candidates should also have demonstrated software development skills in a range of languages and strong Linux systems expertise.
Responsibilities:
- Work with others to design, build, and maintain a Kubernetes container platform on bare-metal, on-premises hardware.
- Participate in building advanced tooling for deployment, testing, monitoring, logging, administration, auditing, and operations of multiple Kubernetes clusters in distributed data centers.
- Research and implement solutions related to Kubernetes container RBAC, networking, storage, scheduling, registries, certificate management, and more to build a highly reliable, scalable, secure, and resource-optimized AI container platform.
- Evaluate and select third-party commercial and open-source components for the AI container platform.
Qualifications
You must possess the below requirements to be initially considered for this position. Preferred qualifications are in addition to the requirements and are considered a plus factor in identifying top candidates. Experience listed below would be obtained through a combination of your schoolwork and/or classes and/or research and/or relevant previous job and/or internship experiences.
Minimum Qualifications:
The candidate must possess a Bachelor’s degree or Master’s degree in Computer Engineering, Computer Science, Information Systems, or a related field with 8+ years of relevant work experience.
5+ years of experience in the areas below:
- Python, Golang, or another modern programming language
- Linux-based operating systems such as CentOS, Ubuntu, SUSE, or Rocky
- Bash shell scripting and Linux command-line acumen
2+ years of experience in the areas below:
- Working on a software engineering team in a cloud or on-premises data center environment supporting critical services
- Linux containers and container runtimes (Docker, containerd, CRI-O)
- Kubernetes
- IP networking, load balancing, DNS
- Pod scheduling and node topology management
- Environment as Code via configuration management tools such as Ansible, Terraform, Salt, Chef, or Puppet
- Container Network Interface (CNI), Container Storage Interface (CSI), and Kubernetes schedulers
- Istio and/or other service meshes
- AI/ML workloads
- Performance benchmarking
- Hardware accelerators and specialized devices (GPU, HPU, HPC)
- Git development workflow
- Kubespray, kops, or kubeadm
Preferred Qualifications:
- Slurm, Volcano, MPI, PyTorch, TensorFlow or other schedulers and AI domain frameworks
- On-premises data center networking
- Cloud development or architecture (AWS, GCP, Azure, etc.)
- Secret vault integration with Kubernetes
- Identity provider configuration with SSO
- Ability to communicate detailed technical concepts in a clear and concise manner.
Inside this Business Group
The Data Center & Artificial Intelligence Group (DCAI) is at the heart of Intel’s transformation from a PC company to a company that runs the cloud and billions of smart, connected computing devices. The data center is the underpinning for every data-driven service, from artificial intelligence to 5G to high-performance computing, and DCAI delivers the products and technologies, spanning software, processors, storage, I/O, and networking solutions, that fuel cloud, communications, enterprise, and government data centers around the world.
Other Locations
US, OR, Hillsboro; US, AZ, Phoenix; US, CA, Folsom; US, CA, San Diego; US, CA, Santa Clara
Posting Statement
All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.
Benefits
We offer a total compensation package that ranks among the best in the industry. It consists of competitive pay, stock, and bonuses, as well as benefit programs that include health, retirement, and vacation. Find more information about all of our Amazing Benefits here.
Annual Salary Range for jobs which could be performed in US, California: $186,552.00-$279,772.00
*Salary range dependent on a number of factors including location and experience
Working Model
This role will be eligible for our hybrid work model which allows employees to split their time between working on-site at their assigned Intel site and off-site. In certain circumstances the work model may change to accommodate business needs.