Job Description
We seek an innovative AI Engineer to join our team and lead the development of scalable solutions using open-source technologies, LLM APIs, and advanced AI techniques. The ideal candidate will excel at designing RAG, GraphRAG, and Agent Systems with function calling, and at fine-tuning and customizing LLMs. Proficiency in hosting open-source models (e.g., Llama 2, Mistral) and integrating APIs (OpenAI, Anthropic, etc.) is critical, along with experience in Python frameworks like FastAPI/Flask for production-grade deployments.
Key Responsibilities:
Architect, build, and optimize AI solutions using open-source models (e.g., from Hugging Face or served via Ollama) and third-party LLM APIs.
Design and implement advanced techniques including RAG, GraphRAG, Agent Systems with orchestration/function calling, and fine-tuning/prompt-tuning of LLMs.
Deploy and manage self-hosted open-source models (e.g., via vLLM, TensorRT-LLM) with scalable APIs.
Collaborate with teams to integrate AI/ML solutions into production systems using FastAPI, Flask, or similar frameworks.
Develop automation pipelines for data retrieval, preprocessing, and model evaluation, ensuring alignment with business use cases.
Stay ahead of AI trends (e.g., open-source LLM advancements, cost-efficient scaling) and drive strategic adoption.
Ensure robust monitoring, testing, and documentation of systems for reliability and reproducibility.