Data Analysis Portfolio Assessment
Code | School | Level | Credits | Semesters |
ZDAT1003 | Computer Science | 1 | 40 | Spring UK |
Summary
In this assessment, apprentices develop their probabilistic reasoning and their skills in collecting, analysing and organising data and information using relevant statistical techniques. Apprentices will apply their knowledge through practice by conveying their ideas clearly and fluently across three main components, each consisting of either a written assignment or a statistical report.
In the first two components, apprentices apply fundamental techniques in probability and statistics to a number of smaller data science problems. These components comprise two small written assignments.
The final component will consist of a more substantial written statistical report making use of a statistical software package (R). Apprentices will use real-life data to conduct basic qualitative and quantitative analyses. The focus of the analyses should be a grounded interpretation of the datasets, with a narrative of how these findings relate to the wider discipline of data science.
The portfolio assessments across this programme give apprentices an opportunity to understand how to start structuring a portfolio of evidence that demonstrates the acquisition and application of knowledge, skills and behaviours in line with the apprenticeship standard. One of the conditions for passing through the gateway and on to the end point assessment is a portfolio of evidence that can be used as the basis for the professional discussion in the end point assessment. The portfolio assessments therefore help apprentices understand the expectations and practice involved in preparing an assessed portfolio before reaching the gateway review.
Target Students
Only available to those studying towards the Data Scientist Degree apprenticeship programme.
Assessment
- 20% Coursework 1: Probability coursework
- 20% Coursework 2: Statistics coursework
- 60% Report: Statistical Report
Assessed by end of designated period
Learning Outcomes
Can calculate probabilities and moments using probability mass and density functions.
Can calculate simple properties of discrete bivariate random variables.
Can state and apply the central limit theorem.
Can conduct appropriate hypothesis tests for a variety of situations, and calculate p-values.
Can fit and analyse linear models with one covariate.
Can analyse categorical data by testing proportions and independence of two variables.
Can use statistical software to carry out calculations and interpret the output.
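As an illustration only, two of the outcomes above (calculating moments from a probability mass function, and fitting a linear model with one covariate) can be sketched in a few lines. This is a minimal sketch, not module material: the module itself uses R, Python is used here purely so that the example is self-contained, and the function names `pmf_moments` and `fit_line` and the toy data are hypothetical.

```python
from fractions import Fraction


def pmf_moments(pmf):
    """Return (mean, variance) of a discrete random variable.

    `pmf` maps each possible value to its probability.
    """
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in pmf.items())          # E[X]
    second = sum(x * x * p for x, p in pmf.items())    # E[X^2]
    return mean, second - mean ** 2                    # Var(X) = E[X^2] - E[X]^2


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical example: a fair six-sided die, using exact fractions.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance = pmf_moments(die)
print(mean, variance)  # 7/2 35/12

# Hypothetical toy data lying exactly on y = 1 + 2x.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))  # (1.0, 2.0)
```

In R, the second calculation corresponds roughly to calling `lm(y ~ x)` and reading off the fitted coefficients.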
KSBs
K2. How Data Science operates within the context of data governance, data security, and communications. How Data Science can be applied to improve an organisation’s processes, operations and outputs. How data and analysis may exhibit biases and prejudice. How ethics and compliance affect Data Science work, and the impact of international regulations (including the General Data Protection Regulation).
K3. How data can be used systematically, through an awareness of key platforms for data and analysis in an organisation, including:
1. Data processing and storage, including on-premise and cloud technologies.
2. Database systems including relational, data warehousing & online analytical processing, “NoSQL” and real-time approaches; the pros and cons of each approach.
3. Data-driven decision making and the good use of evidence and analytics in making choices and decisions.
K4. How to design, implement and optimise analytical algorithms – as prototypes and at production scale – using:
2. Advanced and predictive analytics, machine learning and artificial intelligence techniques, simulations, optimisation, and automation.
3. Applications such as computer vision and Natural Language Processing.
4. An awareness of the computing and organisational resource constraints and trade-offs involved in selecting models, algorithms and tools.
5. Development standards, including programming practice, testing, source control.
K5. The data landscape: how to critically analyse, interpret and evaluate complex information from diverse datasets:
1. Sources of data including but not exclusive to files, operational systems, databases, web services, open data, government data, news and social media.
2. Data formats, structures and data delivery methods including “unstructured” data.
3. Common patterns in real-world data.
S1. Identify and clarify problems an organisation faces, and reformulate them into Data Science problems. Devise solutions and make decisions in context by seeking feedback from stakeholders. Apply scientific methods through experiment design, measurement, hypothesis testing and delivery of results. Collaborate with colleagues to gather requirements.
S2. Perform data engineering: create and handle datasets for analysis. Use tools and techniques to source, access, explore, profile, pipeline, combine, transform and store data, and apply governance (quality control, security, privacy) to data.
S3. Identify and use an appropriate range of programming languages and tools for data manipulation, analysis, visualisation, and system integration. Select appropriate data structures and algorithms for the problem. Develop reproducible analysis and robust code, working in accordance with software development standards, including security, accessibility, code quality and version control.
S4. Use analysis and models to inform and improve organisational outcomes, building models and validating results with statistical testing: perform statistical analysis, correlation vs causation, feature selection and engineering, machine learning, optimisation, and simulations, using the appropriate techniques.
S5. Implement data solutions, using relevant software engineering architectures and design patterns. Evaluate Cloud vs. on-premise deployment. Determine the implicit and explicit value of data. Assess value for money and Return on Investment. Scale a system up/out. Evaluate emerging trends and new approaches. Compare the pros and cons of software applications and techniques.
S6. Find, present, communicate and disseminate outputs effectively and with high impact through creative storytelling, tailoring the message for the audience. Use the best medium for each audience, such as technical writing, reporting and dashboards. Visualise data to tell compelling and actionable narratives. Make recommendations to decision makers to contribute towards the achievement of organisation goals.
S7. Develop and maintain collaborative relationships at strategic and operational levels, using methods of organisational empathy (human, organisation and technical) and build relationships through active listening and trust development.
S8. Use project delivery techniques and tools appropriate to their Data Science project and organisation. Plan, organise and manage resources to successfully run a small Data Science project, achieve organisational goals and enable effective change.
B1. An inquisitive approach: the curiosity to explore new questions, opportunities, data, and techniques; tenacity to improve methods and maximise insights; and relentless creativity in their approach to solutions.
B2. Empathy and positive engagement to enable working and collaborating in multi-disciplinary teams, championing and highlighting ethics and diversity in data work.
B3. Adaptability and dynamism when responding to varied tasks and organisational timescales, and pragmatism in the face of real-world scenarios.
B4. Consideration of problems in the context of organisation goals.
B5. An impartial, scientific, hypothesis-driven approach to work, rigorous data analysis methods, and integrity in presenting data and conclusions in a truthful and appropriate manner.