Statistics and Probability Modelling Assessment
Code | School | Level | Credits | Semesters |
ZDAT2002 | Computer Science | 2 | 40 | Full Year UK |
Summary
In this assessment, apprentices will demonstrate their knowledge and skills in probabilistic and statistical modelling by conducting applied statistical research and applying forecasting methods. This will include analysing data, making predictions and interpreting statistical results using a statistical software package.
The assessment will comprise:
- A (continuous) R-portfolio.
- Group presentation – based on probability models and estimating parameters (this will include an individual reflective log).
- A Statistical report – demonstrating statistical modelling and reasoning using linear methods.
- A business case report based on the application of Time Series and Forecasting.
Apprentices will be encouraged to relate their outputs to areas of data science applicable to their current jobs. They will develop skills of scientific reasoning and analysis, and transferable skills of communication. The range of skills will be assessed through apprentices’ written work and oral presentations.
Target Students
Only available to those studying towards the Data Scientist Degree apprenticeship programme.
Co-requisites
Modules you must take in the same academic year, or have taken in a previous year, to enrol in this module:
Assessment
- 15% Presentation: Group oral presentation
- 30% Report 1: Statistical Report
- 30% Report 2: Business case report
- 25% Assignment: R Portfolio
Assessed in both autumn & spring semesters
Learning Outcomes
Can apply methods concerning standard statistical models; in particular the method of moments and the maximum likelihood method.
Can perform statistical hypothesis tests using data from studies (such as t- and F-tests, and comparisons of models and parameter values).
Can apply methods for interval estimation; in particular, exact and approximate confidence intervals based on asymptotic theory.
Can apply methods for analysing categorical data and methods that do not require distributional assumptions (non-parametric statistics).
Can fit a linear model to data, both manually and using statistical software.
Can check model fit, diagnose errors, and perform model selection amongst the class of linear models.
Can demonstrate and apply a deeper understanding of continuous random variables and their applications in the field of data science.
Can identify and formulate problems in terms of probability and solve them to build up a simple stochastic model.
Understands and can apply basic properties of discrete-time Markov chains.
Can carry out initial data analysis of time-series data, and can identify and remove simple trend and seasonality.
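As an illustration of the final outcome above, the following is a minimal sketch of removing a simple trend and seasonality from a series. It is written in Python rather than R (an assumption made for this example; the module's portfolio uses R). It assumes an additive model with a linear trend and a known seasonal period. The function name `detrend_deseasonalise` and the data are hypothetical and not part of the module materials.

```python
from statistics import mean


def detrend_deseasonalise(series, period):
    """Remove a least-squares linear trend, then per-season means (additive model)."""
    n = len(series)
    t = list(range(n))
    t_bar, y_bar = mean(t), mean(series)

    # Ordinary least-squares fit of y = intercept + slope * t
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar

    # Subtract the fitted trend from every observation
    detrended = [yi - (intercept + slope * ti) for ti, yi in zip(t, series)]

    # Seasonal effect: mean detrended value at each position in the cycle
    seasonal = [mean(detrended[k::period]) for k in range(period)]

    # What remains after removing both trend and seasonal effects
    residual = [d - seasonal[i % period] for i, d in enumerate(detrended)]
    return slope, seasonal, residual


# Hypothetical data: level 10, trend 0.5 per step, repeating pattern of period 4
pattern = [1.0, -1.0, -1.0, 1.0]
series = [10 + 0.5 * i + pattern[i % 4] for i in range(12)]

slope, seasonal, residual = detrend_deseasonalise(series, period=4)
print(round(slope, 6), [round(s, 6) for s in seasonal])
```

The example pattern was chosen so that it does not distort the fitted slope. With a general seasonal pattern the least-squares slope can be biased by the seasonality, and a moving average over one full period is a common alternative for estimating the trend.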
KSBs
K3. How data can be used systematically, through an awareness of key platforms for data and analysis in an organisation, including:
K4. How to design, implement and optimise analytical algorithms – as prototypes and at production scale – using:
1. Advanced and predictive analytics, machine learning and artificial intelligence techniques, simulations, optimisation, and automation.
2. Applications such as computer vision and Natural Language Processing.
3. An awareness of the computing and organisational resource constraints and trade-offs involved in selecting models, algorithms and tools.
4. Development standards, including programming practice, testing, source control.
S1. Identify and clarify problems an organisation faces, and reformulate them into Data Science problems. Devise solutions and make decisions in context by seeking feedback from stakeholders. Apply scientific methods through experiment design, measurement, hypothesis testing and delivery of results. Collaborate with colleagues to gather requirements.
S2. Perform data engineering: create and handle datasets for analysis. Use tools and techniques to source, access, explore, profile, pipeline, combine, transform and store data, and apply governance (quality control, security, privacy) to data.
S3. Identify and use an appropriate range of programming languages and tools for data manipulation, analysis, visualisation, and system integration. Select appropriate data structures and algorithms for the problem. Develop reproducible analysis and robust code, working in accordance with software development standards, including security, accessibility, code quality and version control.
S4. Use analysis and models to inform and improve organisational outcomes, building models and validating results with statistical testing: perform statistical analysis, correlation vs causation, feature selection and engineering, machine learning, optimisation, and simulations, using the appropriate techniques for the problem.
S5. Implement data solutions, using relevant software engineering architectures and design patterns. Evaluate Cloud vs. on-premise deployment. Determine the implicit and explicit value of data. Assess value for money and Return on Investment. Scale a system up/out. Evaluate emerging trends and new approaches. Compare the pros and cons of software applications and techniques.
B1. An inquisitive approach: the curiosity to explore new questions, opportunities, data, and techniques; tenacity to improve methods and maximise insights; and relentless creativity in their approach to solutions.
B3. Adaptability and dynamism when responding to varied tasks and organisational timescales, and pragmatism in the face of real-world scenarios.
B5. An impartial, scientific, hypothesis-driven approach to work, rigorous data analysis methods, and integrity in presenting data and conclusions in a truthful and appropriate manner.