Role Objective
We are seeking a highly skilled and experienced Data Scientist to join our team. The ideal candidate will have 3-5 years of experience in data science, with a strong background in Deep Learning, Natural Language Processing (NLP), and Generative AI (GenAI). You will develop and implement advanced data models, algorithms, and solutions that enhance our Intelligent Document Processing (IDP) capabilities and provide actionable insights for our business.
Key Responsibilities
Data Analysis and Modeling
- Conduct exploratory data analysis to understand and summarize document data patterns.
- Develop and implement machine learning models to solve complex business problems.
- Build and optimize deep learning models for document processing applications.
Natural Language Processing (NLP)
- Develop NLP solutions for text classification and for data extraction, both extractive and abstractive.
- Work with large-scale datasets to extract meaningful insights from text data.
- Implement and fine-tune NLP models using state-of-the-art techniques specifically for document understanding.
- Knowledge of sequence-to-sequence or transformer-based models for text extraction tasks is good to have.
Deep Learning
- Design and implement deep learning models for various document processing tasks.
- Utilize techniques such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformer-based models.
- Optimize model performance through hyperparameter tuning and model architecture experimentation.
Generative AI
- Design and develop generative models for purposes such as data generation, synthesis, and augmentation.
- Experiment with and implement advanced GenAI techniques to enhance existing solutions.
- Experience with self-hosted (on-prem) models such as Mistral and with cloud-hosted models such as Claude and OpenAI's models.
Collaboration and Communication
- Collaborate with cross-functional teams to understand business requirements and deliver data-driven solutions.
- Communicate complex data science concepts and results to non-technical stakeholders.
- Provide technical mentorship to junior data scientists and data analysts.
Research and Innovation
- Stay current with the latest advancements in data science, machine learning, NLP, and GenAI.
- Participate in research and development projects to explore new methodologies and technologies.
- Contribute to the development of data science best practices and standards.
Required Qualifications
Education
- Bachelor’s or Master’s degree in Computer Science, Data Science, Statistics, Mathematics, or a related field.
Experience
- 3-5 years of hands-on experience in data science, with a focus on deep learning, NLP, and generative AI.
Technical Skills
- Proficiency in Python is a must; C++ coding experience is good to have.
- Experience with machine learning frameworks such as TensorFlow and PyTorch.
- Strong knowledge of NLP libraries and tools such as spaCy and Hugging Face Transformers.
- Familiarity with generative models and techniques.
- Strong understanding of data wrangling and preprocessing techniques.
Soft Skills
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration abilities.
- Ability to work independently and as part of a team.
Preferred Qualifications
- Experience with cloud platforms such as AWS, Azure, or Google Cloud.
- Prior experience in a similar industry or domain.
- Knowledge of OCR and image processing techniques is a plus.