About Deccan AI
Deccan AI is a high-growth, venture-backed AI model training and evaluation company headquartered in the Bay Area. Founded by alumni of IIT Bombay and IIM Ahmedabad and former Google employees, we partner with the world’s top frontier AI labs, including Google DeepMind, Snowflake, and several cutting-edge research groups. We are backed by Prosus Ventures, and our India office is based in Hyderabad.
We’re not just participating in the AI race; we’re building the infrastructure that powers it.
With 1M+ global experts, advanced automation, and vertically integrated platforms, we deliver the gold-standard data that world-class AI models rely on. The AI data annotation market is exploding, set to quadruple by 2032. The opportunity? Massive, and you can help define the future.
Job Description:
1. Background
The ML Researcher plays a critical role in keeping the organization at the forefront of AI innovation through deep, end-to-end research. This role is focused on identifying emerging research directions, designing novel AI benchmarks, and creating high-quality evaluation datasets aligned with both cutting-edge research and organizational priorities.
The ML Researcher bridges the gap between theoretical research and practical application by translating insights from the latest AI papers into actionable benchmarks and evaluation frameworks. The role emphasizes deep research, synthesis of academic and industry findings, and the design of solutions that meaningfully advance AI systems—particularly in agent frameworks, machine learning models, and emerging AI paradigms.
2. Purpose of the Role
The primary objective of the ML Researcher is to conduct comprehensive AI research and design novel, scalable, and real-world–relevant benchmarks and evaluation datasets. The role aims to push beyond existing evaluation practices by defining new criteria that better measure the capabilities and limitations of modern AI systems.
3. Key Responsibilities
1. Research & Literature Review
Continuously track the latest AI research, conferences, and publications.
Conduct in-depth literature reviews across emerging AI models, agent frameworks, and evaluation methodologies.
Identify gaps, limitations, and opportunities in existing benchmarks.
Assess current industry benchmarks for relevance to organizational and client goals.
2. Benchmark & Evaluation Design
Propose novel AI benchmarks informed by research insights and industry gaps.
Design evaluation datasets that assess AI systems across both coding and non-coding domains.
Ensure benchmarks and datasets are innovative, scalable, and practically applicable.
Define meaningful evaluation metrics that provide actionable insights into model performance.
3. Documentation & Research Deliverables
Create high-level requirement documents for each benchmark or dataset, covering:
Problem statement and motivation
Design overview and structure
Evaluation metrics and success criteria
Testing and validation guidelines
Ensure documentation is clear, comprehensive, and implementation-ready.
4. Cross-Functional Collaboration
Work closely with the ML Lead, pipeline-focused MLEs, and project managers to align research outputs with organizational priorities.
Collaborate with internal stakeholders to ensure benchmarks meet project goals and industry standards.
Provide feedback and iterative improvements based on implementation outcomes.
5. Continuous Innovation & Iteration
Refine and evolve benchmarks and datasets based on feedback, new research, and emerging use cases.
Propose enhancements to existing evaluation methods, including new metrics or benchmark variations.
Stay actively engaged with the AI research community through conferences, discussions, and ongoing learning.
4. Deliverables
The ML Researcher is expected to produce high-impact, actionable research outputs, including:
1. Benchmark Proposal Documents
Each proposal should include:
Clear problem definition and motivation
Detailed benchmark design and structure
Defined evaluation metrics
Testing and validation guidelines
Explanation of novelty compared to existing benchmarks
2. Evaluation Dataset Designs
Dataset overview, structure, and intended use
(Optional) Data collection, labeling, and cleaning methodology
Evaluation methodology and expected outcomes
Unique or differentiating features of the dataset
3. Research Reports & Whitepapers (Optional)
Periodic summaries of research findings or emerging trends
Internal documentation of best practices and benchmarking insights
4. Feedback & Iteration Reports
Post-implementation assessments of benchmark effectiveness
Recommendations for improvements based on team and client feedback
Iterative updates aligned with new research and usage insights
5. Timeline & Milestones
Weeks 1–2:
Complete initial research and first benchmark + dataset proposal
Present draft documentation for review
Weeks 3–4:
Establish a cadence of at least one new benchmark and one evaluation dataset per month
Submit finalized documentation for internal review
Weeks 5–6:
Incorporate feedback and finalize benchmark and dataset proposals
Complete testing and validation guidelines
Ongoing:
Regularly iterate on existing benchmarks
Provide monthly research summaries on emerging AI trends impacting evaluation
6. Expected Output & Impact
Innovative Benchmarks & Datasets: Continuous delivery of novel, high-quality evaluation frameworks.
Strong Documentation: Clear, actionable requirements enabling efficient implementation.
Industry Impact: Contributions that elevate evaluation standards and strengthen the organization’s research credibility.
Client Enablement: Research-driven inputs that support effective, client-facing AI solutions.
7. Seniority & Skillset Requirements
The ideal ML Researcher will have:
Deep expertise in AI/ML domains including agent frameworks, deep learning, NLP, computer vision, and emerging AI systems.
Proven experience designing benchmarks and evaluation datasets from the ground up.
Strong research and analytical skills with the ability to translate papers into practical solutions.
Excellent documentation and communication abilities.
Ability to work independently while collaborating effectively with cross-functional teams.
Interview Process:
| Round | Focus | What is evaluated |
| --- | --- | --- |
| Round 1 | Project Deep Dive | Real project experience and technical understanding |
| Round 2 | Execution & Technical Depth | Practical ML/LLM implementation ability |
| Round 3 | Culture & Team Fit | Communication, mindset, and team compatibility |