Principal Data Scientist

Boston, Massachusetts
Full time
One Door
I.T., digital & online media services
51-100 employees
We are looking for a skilled and proactive Data Scientist/GenAI LLM Engineer to drive innovation within our organization and transform retail visual merchandising. This role contributes to high-impact projects for leading organizations across specialty retail, logistics, supply chain, and beyond.

At the cutting edge of AI/ML technologies, we empower our clients to harness the value of unstructured data and uncover hidden insights within their enterprise information. As an LLM engineer, you’ll bring your deep expertise in LLM/GenAI technologies to the table. In collaboration with our R&D team, product managers, and engineering leads, you’ll prototype, build, test, and scale innovative products powered by GenAI/LLM technologies. You’ll also be instrumental in fine-tuning model hyperparameters, optimizing configurations, and ensuring the highest level of model performance to drive impactful outcomes for our clients.

RESPONSIBILITIES

  • Develop LLM solutions on customer data, such as RAG architectures on enterprise knowledge repos, querying structured data with natural language, agents, and content generation.
  • Develop end-to-end AI/ML solutions using Python, LLM/GenAI frameworks and tools.
  • Develop CI/CD pipelines, containerize LLMs, and deploy them in the cloud or on premises. Support and maintain every stage of the LLM/ML model lifecycle, including developing training datasets, fine-tuning, testing, deployment pipelines, and ongoing deviation monitoring.
  • Design prototypes and POCs to showcase feasibility and value; provide architectural solutions.
  • Research, design, build, and train innovative LLM applications to address complex real-world problems.
  • Offer technical guidance to clients implementing LLM technologies.

QUALIFICATIONS

  • Bachelor’s Degree (final-year students may apply) in Statistics, Applied Mathematics, Computer Science, or a related field
  • 3+ years of hands-on experience with Python; 2+ years of experience with command line scripting; 1+ years of experience building and maintaining scalable API solutions
  • 2+ years of professional experience with NLP; 1+ years of professional experience with Large Language Models (LLM)/GenAI technology (e.g., OpenAI API, GPT-4, Gemini, Llama, Claude, Amazon Bedrock, Langchain, HuggingFace Transformers, PyTorch); 1+ years of experience with prompt engineering and vector databases
  • 2+ years of experience with AWS, GCP, or Microsoft Azure; 2+ years of experience with MLOps and CI/CD pipeline development, containerization, and model deployment in test and production environments
  • Team player who can communicate complex LLM capabilities and limitations to non-technical stakeholders

PREFERRED

  • Master’s or Ph.D. in a relevant field
  • 7+ years of product engineering and/or data science experience
  • Experience with Ruby on Rails, JavaScript, or Flutter; 2+ years of experience with Snowflake or Databricks
  • Deep knowledge of the retail domain or industry, with a focus on NLP/LLM applications
  • In-depth understanding of Responsible AI standards and protocols
  • Applied research background using frameworks to build LLM prototypes; knowledge of best practices for production LLM development