Job Responsibilities
- Architect and lead the development of scalable, secure AI infrastructure on cloud-native platforms to support autonomous driving technologies
- Collaborate closely with ML teams to facilitate seamless integration and optimal performance of AI algorithms
- Identify and address system bottlenecks and instabilities, applying innovative solutions to enhance system reliability and efficiency
- Foster technological advancements through research and implementation of state-of-the-art AI tools and methodologies
- Act as a key technical leader and mentor, promoting a culture of technical excellence and collaborative innovation within the AI infrastructure team
Job Requirements
Minimum Skill Requirements:
- Bachelor's or Master's in Computer Science, Engineering, or related technical field
- 5+ years of experience designing, deploying, and managing GPU clusters for high-performance computing in AI applications, particularly within cloud environments
- Proficiency in cloud services (AWS, Azure, ALI Cloud) and in building containerized applications using Kubernetes and Docker
- Strong programming skills in Python and Golang, and experience with AI/ML frameworks (TensorFlow, PyTorch)
Preferred Skill Requirements:
- Expertise in designing and managing high-availability, high-throughput systems that support machine learning and deep learning workloads
- Demonstrable leadership skills with a track record of mentoring and leading technical teams
- In-depth understanding of data structures, algorithms, and software engineering principles relevant to AI and autonomous systems
Required Languages
English
Job Details
Position type
Other
Experience
5-10 years