ML Optimization/Deployment Researcher

Job Description

The Machine Learning optimization and deployment researcher is responsible for designing and implementing optimization strategies for the refinement and deployment of deep learning models. He/she will research and implement model compression techniques such as quantization, pruning and distillation. The job scope also includes developing custom deployment techniques for specific inference engines and optimizing model performance in-situ in a deployed product environment. The candidate will cooperate with other teams to integrate the optimized models into production systems and to refine and scale the methods used for assessing the functional performance of models.

Key Responsibilities:

  • Optimize AI models for deployment in a wide range of target architectures including desktop, cloud, browsers and mobile devices

  • Develop and implement algorithms and software for efficient real-time and offline inference

  • Monitor and evaluate the performance of models in a production context and optimize them for accuracy, speed and compute resource efficiency

  • Design and implement custom tooling and strategies for the development and deployment of optimized deep learning models

  • Research and implement appropriate techniques to optimize deployment for the product

  • Work closely with cross-functional teams, including product managers, engineers and researchers, to understand their workflows and design and implement optimized model deployment techniques

Desired Background

  • Ph.D. or master's with 4 years of experience in Computer Science or similar, with a focus on deep learning.

  • Strong publication record, with publications in major machine learning conferences (e.g. NeurIPS, ICLR, ICML).

  • Strong theoretical and practical background in AI technologies.

  • Experience exploring new technologies: researching and staying up to date with the latest machine learning technologies, frameworks and inference engines.

  • Experienced in optimizing algorithms and software architectures in constrained environments.

  • 3 years of experience with frameworks such as PyTorch, ONNX or TensorFlow

  • 3 years of experience with programming languages such as Python, C/C++ or MATLAB

  • Experience with embedded systems, computer architecture and high-performance computing

  • Experience optimizing ML models for inference using hardware acceleration is a plus

  • Experience working in a software development team and following good software practices, for example version control (VCS) and continuous integration (CI).

  • Desired: Background in web technologies for real-time processing. Basic knowledge of audio/video formats.

  • Strong analytical and problem-solving skills

  • Strong communication skills and ability to work well in a team environment

Key Skills

Computer vision; C++; Image processing; Analytical; Machine learning; Digital signal processing; MATLAB; Analytics; Python

About Company

We're the rain on the roof in a movie. The music flowing through your earbuds when you're at the gym. The footsteps lurking behind you in a video game. The voice of a colleague on a call who seems to be right next to you. The sight of a breathtakingly bright and vivid sunset on your TV. Making experiences come alive through technology is what we do. It's been our mission since day one.
