Statistics Assignment Help

There are 3 parts to hand in.

1.      A PDF report compiled with R Markdown of no more than 20 pages, including figures (each of which must be at least 1/3 of a page in size) and key code blocks. Note that the expectation is that there is not much more than 3-4 pages of actual mathematics and text, and that almost all of the space is taken up by figures, code and other R output.

2.      An electronic R Markdown file, named with your anonymous student code followed by ‘.Rmd’, used to generate the report (a link and submission instructions will appear on ELE in due course). The R code in this file may be run to help assess your mark, so please ensure that all code blocks run in a blank R session and that the markdown knits in a folder containing only the data given to you and any Stan files you used.

3.      Any Stan files you have used.

1. The data USelection Data.rds contains the results of the 2020 US general election by state as well as the number of Covid-19 cases and deaths on election day in each state, and a range of population demographics. State breakdowns of sex and race of the population are given as percentages of the total population (TotalPop). So, for example, the first row states that Alabama is 48.46% male, 4.00% Hispanic, etc. The percentage of the state population that are US citizens is given by the variable Citizen and the percentage in poverty by Poverty. It is not clear how poverty is defined, nor whether the definition is state-dependent (e.g. tied to minimum wage and living costs, which are not given and vary by state). The number Employed and the income per capita per state are given. Finally, the variables Professional, Service, Office, Construction, Production give percentages of each state’s employed population that work in each of these areas. Professional represents, e.g., doctors, lawyers, teachers etc. Service represents service industries such as retail and hospitality. Office represents office workers, Construction those in the building industry (including plumbers, electricians etc.) and finally Production represents factory work.
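The description above implies two different denominators: demographic percentages are relative to TotalPop, while the occupation percentages are relative to Employed. A minimal sketch of converting each to a count is given below. The data frame `toy`, the column names `Men` and `Hispanic`, and the TotalPop, Employed and Professional values are illustrative placeholders, not taken from the real data; only the 48.46 and 4.00 figures are quoted from the text.

```r
# Toy one-row data frame standing in for the real data (placeholder values).
toy <- data.frame(
  State        = "Alabama",
  TotalPop     = 1000000,  # placeholder, NOT the real population
  Men          = 48.46,    # percentage of TotalPop (column name assumed)
  Hispanic     = 4.00,     # percentage of TotalPop (column name assumed)
  Employed     = 400000,   # placeholder count of employed people
  Professional = 30        # placeholder percentage of Employed
)

# Demographic percentages use the total population as the denominator.
toy$MenCount <- toy$TotalPop * toy$Men / 100

# Occupation percentages use the employed population as the denominator.
toy$ProfessionalCount <- toy$Employed * toy$Professional / 100
```

Whether counts or percentages are the more useful predictors is a modelling choice that would need its own justification.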

As the purpose of this coursework is not to test your R coding ability, the file SomeDataWrangling.R attempts to shape the data in some of the different ways that you may wish to in order to answer the question. There is no "one correct way" to conduct your investigation, and each attempt will be marked on its own merits, hence any of the manipulations presented may be useful. The code may also be amended if you think of some other way you would like to shape the data for analysis. If you are having issues with data shaping, I am willing to help, as long as I don’t cross the line into suggesting which models you should/should not fit.

Filtering the data to only consider the race between candidates Biden and Trump, conduct a Bayesian analysis to answer the question: How important might Covid-19 have been in influencing state results during the 2020 US presidential election?

Brief: Your analysis must

·         Select and justify appropriate Bayesian models to fit. Your model should predict some variable representing the results using some of the other variables including those related to Covid-19. Note that the small size of the data means that you cannot use all of the variables at once, so the choices you make require careful justification and sensitivity analysis (see later).

·         Fit any Bayesian models using Stan.

·         Establish the convergence of any models you fit.

·         Check the quality of your models using appropriate out of sample data.

·         Conduct a sensitivity analysis to show that your conclusions are not sensitive to your model.

·         Use Monte Carlo with accurate Monte Carlo error bounds to make probabilistic inferential statements. These must be used both to assess posterior probabilities for effects directly related to Covid-19, and to compare the size of these effects with others in your model.
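As a rough illustration of the last requirement, the sketch below estimates a posterior probability by Monte Carlo and attaches a Monte Carlo standard error. The vector `beta_covid` is a simulated stand-in for posterior draws of a Covid-related coefficient (in practice it would be extracted from a fitted Stan model), and the binomial standard error assumes the draws are independent. MCMC draws are typically autocorrelated, so an effective sample size would usually replace the raw number of draws.

```r
set.seed(1)

# Simulated stand-in for posterior draws of a coefficient (an assumption;
# real draws would come from the fitted Stan model).
beta_covid <- rnorm(4000, mean = 0.3, sd = 0.2)

# Monte Carlo estimate of P(beta_covid > 0 | data): mean of an indicator.
ind   <- beta_covid > 0
p_hat <- mean(ind)

# Monte Carlo standard error of that estimate, assuming independent draws.
mc_se <- sqrt(p_hat * (1 - p_hat) / length(ind))

# Approximate 95% Monte Carlo error bounds for the estimated probability.
ci <- p_hat + c(-1, 1) * qnorm(0.975) * mc_se
```

The same pattern applies to comparing effects: replacing the indicator with, say, `abs(beta_covid) > abs(beta_other)` (where `beta_other` is a hypothetical second vector of draws) gives the posterior probability that one effect is larger in magnitude than another.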

Important: It is possible to spend far too much time completing this coursework by trying to perfect your model. It is possible to achieve top marks by choosing just one model with careful justification and, as long as it converges and looks reasonable (it won’t look perfect), to explore just a very small number of alternatives when exploring sensitivity and to critically evaluate it well (why it might not look reasonable). There are no extra marks for finding a "best" model, or for exploring all possible alternatives to the choices you made in your sensitivity analysis, so please focus your efforts accordingly.
