We are seeking an experienced AI/LLM Data Engineer to build and maintain the data pipeline for our Generative AI platform. The ideal candidate will be well-versed in the latest Large Language Model (LLM) technologies and have a strong background in data engineering, with a focus on Retrieval-Augmented Generation (RAG) and knowledge-base techniques. This role sits in the AI COE within DX Tech & Digital. As an AI/LLM Data Engineer, you will report to the Director, AI Solutions & Development, who oversees the AI COE.
You will work on highly visible strategic projects, collaborating with cross-functional teams to define requirements and deliver high-quality AI solutions.
The ideal candidate will have a passion for Generative AI and LLMs, with a proven track record of delivering innovative AI applications.
Responsibilities
• Design, implement, and maintain an end-to-end multi-stage data pipeline for LLMs, including Supervised Fine Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) data processes
• Identify, evaluate, and integrate diverse data sources and domains to support the Generative AI platform
• Develop and optimize data processing workflows for chunking, indexing, ingestion, and vectorization for both text and non-text data
• Benchmark and implement various vector stores, embedding techniques, and retrieval methods
• Create a flexible pipeline supporting multiple embedding algorithms, vector stores, and search types (e.g., vector search, hybrid search)
• Implement and maintain auto-tagging systems and data preparation processes for LLMs
• Develop tools for text and image data crawling, cleaning, and refinement
• Collaborate with cross-functional teams to ensure data quality and relevance for AI/ML models
• Work with data lakehouse architectures to optimize data storage and processing
• Integrate and optimize workflows using Snowflake and various vector store technologies
Qualifications
• Master's degree in Computer Science, Data Science, or a related field
• 3-5 years of work experience in data engineering, preferably in AI/ML contexts
• Proficiency in Python, JSON, and related tools
• Strong understanding of LLM architectures, training processes, and data requirements
• Experience with RAG systems, knowledge base construction, and vector databases
• Familiarity with embedding techniques, similarity search algorithms, and information retrieval concepts
• Hands-on experience with data cleaning, tagging, and annotation processes (both manual and automated)
• Knowledge of data crawling techniques and associated ethical considerations
• Strong problem-solving skills and ability to work in a fast-paced, innovative environment
• Familiarity with Snowflake and its integration in AI/ML pipelines
• Experience with various vector store technologies and their applications in AI
• Understanding of data lakehouse concepts and architectures
• Excellent communication, collaboration, and problem-solving skills
• Ability to translate business needs into technical solutions
• Passion for innovation and a commitment to ethical AI development
• Experience building LLM pipelines using frameworks such as LangChain, LlamaIndex, Semantic Kernel, and OpenAI functions
• Familiarity with LLM parameters such as temperature, top-k, and repeat penalty, and with data science metrics and methodologies for evaluating LLM outputs
Preferred Skills