Job Description
We are looking for a skilled data professional to support a project focused on generating high-quality visualizations from code-based prompts for AI data training. The ideal candidate will have hands-on experience with data visualization tools, a solid understanding of statistical concepts, and the ability to evaluate code and graphical outputs effectively. This contract role is designed for someone who can contribute technical expertise toward improving AI systems through structured content creation and review.
What You’ll Do
Create Prompt-Based Content
Create Responses to Coding Prompts
- Design coding exercises and sample responses to teach models how to generate a range of plots and graphs from structured prompts.
- Develop examples across data analysis scenarios involving basic statistics, distributions, and experimental methods.
- Ensure clarity, correctness, and completeness in both code and the associated visual outputs.
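For illustration only, a prompt-and-response pair of the kind described above might look like the sketch below. The prompt wording, the function name `plot_histogram_with_mean`, and its parameters are hypothetical and are not taken from the project's actual materials.

```python
import random
import statistics

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt


def plot_histogram_with_mean(values, bins=20):
    """Draw a histogram of `values` and mark the sample mean with a dashed line."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)

    fig, ax = plt.subplots(figsize=(6, 4))
    ax.hist(values, bins=bins, color="steelblue", edgecolor="white")
    ax.axvline(mean, color="black", linestyle="--", label=f"mean = {mean:.2f}")

    # Label axes and title so the plot can be read without the code.
    ax.set_xlabel("Value")
    ax.set_ylabel("Count")
    ax.set_title(f"Sample distribution (n = {len(values)}, SD = {sd:.2f})")
    ax.legend()
    return fig, ax, mean


# Hypothetical prompt: "Plot a histogram of 500 draws from a normal
# distribution (mean 50, SD 10) and mark the sample mean."
rng = random.Random(0)  # fixed seed so the example is reproducible
sample = [rng.gauss(50, 10) for _ in range(500)]
fig, ax, sample_mean = plot_histogram_with_mean(sample)
```

A reviewer could then check that the axes are labeled, that the marked mean matches the data, and that the histogram shape is plausible for the stated distribution. Calling `fig.savefig(...)` would write the figure to disk.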
Assess Quality of Model Output
- Review responses generated by AI for correctness in code and visual presentation.
- Identify errors in statistical interpretation or visualization logic, and suggest improvements.
- Provide feedback on the model’s use of plotting libraries and its understanding of core data analysis workflows.
Support Evaluation Criteria
- Validate that visualizations convey appropriate insights based on statistical inputs like averages, variability, and distribution type.
- Check for proper usage of tools such as pandas, matplotlib, and seaborn, as well as structure and readability of code.
- Help refine evaluation guidelines by assessing performance across a range of prompt types.
We’re Looking For
- 7+ years of hands-on experience with data analysis and visualization tools.
- Background in statistics or data science with applied experience in probability, hypothesis testing, and experimental analysis.
- Proficiency in Python, with strong skills in libraries such as pandas, matplotlib, and seaborn.
- Ability to interpret and evaluate code written for data visualization.
- Deep knowledge of statistical terms and techniques, including mean, median, standard deviation, and confidence intervals.
Nice to Have
- Exposure to libraries such as scikit-learn, SciPy, or statsmodels.
- Experience working with R or MATLAB in a data analysis or visualization context.
- Understanding of basic experimental design (e.g. A/B testing) and its representation through visual data.