eStaff Search Group is looking for an HPC/AI Engineer to oversee and optimize our High-Performance Computing (HPC) and AI infrastructure. This role requires expertise in HPC, AI, and system architecture, along with the ability to lead complex projects from design to implementation. The ideal candidate will ensure system scalability, efficiency, and peak performance while collaborating with cross-functional teams to drive innovation.
Job Details:
Contract-to-hire position. Salary is approximately $100,000 to $150,000.
Duties/Responsibilities:
Infrastructure Support & Leadership: Maintain and optimize HPC and AI infrastructure, assisting teams with the implementation, tuning, and optimization of tools for Generative AI models.
Performance Optimization: Analyze and enhance system performance, ensuring efficient execution of AI models and HPC applications. Manage and optimize GPU-enabled computing resources for parallel processing, distributed computing, and resource management.
Software Integration & Optimization: Develop, debug, and maintain software tools, libraries, and frameworks supporting HPC and AI workloads. Work closely with vendors to ensure AI models are optimized for scalability and performance.
NVIDIA Tools & Frameworks: Manage and utilize NVIDIA tools such as CUDA, cuDNN, and TensorRT to optimize AI and HPC workloads on NVIDIA GPUs.
HPC Systems Management: Deploy and manage on-premises HPC and AI systems, ensuring seamless integration with existing IT infrastructure. Handle installation, configuration, and maintenance of HPC environments in data centers and colocation (COLO) facilities.
Collaboration & Mentorship: Work with cross-functional teams, including alliance partners, data scientists, researchers, and software developers to address complex AI challenges. Provide mentorship and training to junior engineers.
Continuous Learning & Research: Stay updated with the latest advancements in HPC and AI technologies. Research new methodologies and integrate them into existing systems as needed.
Technical Support & Troubleshooting: Diagnose and resolve complex technical issues related to HPC and AI infrastructure, performing root cause analysis and implementing preventive measures.
Documentation & Reporting: Maintain comprehensive documentation for system designs, performance metrics, and project progress. Prepare detailed technical reports and presentations for stakeholders.
Education and Experience:
2+ years of professional experience in supporting and managing HPC and AI architectures, with a proven track record of successful project implementations.
3+ years of Python programming experience, along with expertise in at least one additional scripting or programming language. Experience with Python package management and dependency debugging is a plus.
1+ year of experience in Data Management, including data storage solutions, file systems, and data transfer protocols.
Bachelor’s degree in Artificial Intelligence, Data Science, or a related field.
APPLY TODAY!!
eStaff Search Group is focused on partnering with talented professionals throughout the Pittsburgh area in Architecture & Development, Database Design & Administration, Network & Systems Administration, Software QA & Test, Machine Learning, Mobile Apps, Big Data, and Digital/Interactive to find your proper career fit. Our niche focus helps us find you the right company that can help you grow in your career. We take the time necessary with every employee to nurture your passion as it becomes ours. Our extensive benefits package includes Matching 401K, Medical, Dental, Vision, Short Term Disability, Direct Deposit, and much more. Please check out our benefits at https://jobs.careermovesnow.com/us/en/benefits-yes-