Tuesday, 5 November 2024

Simulating the 2024 United States Electoral College

 

One of the biggest elections in modern times is today. The polls have Harris and Trump in a toss-up. I took the data from Real Clear Politics and 270towin.com and ran a Monte Carlo simulation of the results in MATLAB, with an error of 1 percentage point. I used the latest 2024 EC vote allocations for the states, which seem to benefit Trump: when I used the 2020 allocations with the same polling, Harris edged out Trump by 0.5.






Here I ran the simulation 100,000 times (20,000 more runs than Nate Silver's simulation). I've even simulated the seats from Nebraska and Maine, which are split by congressional district. For states that do not have public polling I used the results of the 2020 election. In all of those states the winner had a very comfortable margin.

Trump ends up winning by 2.7 EC votes taking the average from RCP and 270towin. 


However, adding the new Selzer poll for just Iowa I get a very different result: 


Now Harris wins by nearly 10 EC votes. Both of these simulations have an error of 1 percentage point at 2 standard deviations. 
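The Monte Carlo procedure described above can be sketched roughly as follows. This is a minimal Python sketch, not the MATLAB code used for the plots: the contests, vote counts and margins are made up for illustration, the polling error is assumed to be normal and independent between contests, and "1 percentage point at 2 standard deviations" is read as a standard deviation of 0.5 points.

```python
import random

def simulate(contests, sigma, runs, seed=0):
    """Return (mean Harris EV, mean Trump EV, fraction of runs Harris wins)."""
    rng = random.Random(seed)
    total_ev = sum(ev for _, ev, _ in contests)
    harris_sum = 0
    harris_wins = 0
    for _ in range(runs):
        harris_ev = 0
        for _, ev, margin in contests:
            # Assumption: independent normal polling error in each contest.
            if rng.gauss(margin, sigma) > 0:
                harris_ev += ev
        harris_sum += harris_ev
        if 2 * harris_ev > total_ev:
            harris_wins += 1
    mean_harris = harris_sum / runs
    return mean_harris, total_ev - mean_harris, harris_wins / runs

# Hypothetical contests: (name, electoral votes, Harris-minus-Trump margin in points).
# A split state such as Nebraska or Maine is entered as separate contests.
contests = [
    ("Safe D state", 10, 12.0),
    ("Safe R state", 9, -15.0),
    ("Toss-up 1", 6, 0.3),
    ("Toss-up 2", 4, -0.4),
    ("Single district", 1, 1.0),
]

sigma = 0.5  # 1 percentage point at 2 standard deviations
mean_h, mean_t, p_harris = simulate(contests, sigma, runs=20000)
print(f"Harris {mean_h:.1f} EV, Trump {mean_t:.1f} EV, Harris wins {p_harris:.1%}")
```

The average EC margin quoted in the post would correspond to the difference between the two means when the real polling averages and the full 538-vote map are used in place of the toy list.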


However, we know polls are wrong. Now I simulated errors in all the polls and ran each simulation 100,000 times for a total of 10,000,000 simulations. I get the following animated GIF: 





Here we see that Harris starts to pull away and wins very comfortably as the polls have ever-increasing errors. This makes sense, as Republicans usually win by just a few points, and a swing of a few points hurts Republicans more than Democrats.
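The effect of growing poll error can be illustrated with a toy sweep. This is only a sketch under stated assumptions: three made-up contests of 20 votes each, two with a narrow Trump lead and one with a wide Harris lead, and independent normal errors. It is not the data behind the GIF.

```python
import random

def mean_harris_ev(contests, sigma, runs, rng):
    """Average Harris electoral votes over `runs` simulated elections."""
    total = 0
    for _ in range(runs):
        for ev, margin in contests:
            if rng.gauss(margin, sigma) > 0:
                total += ev
    return total / runs

# Hypothetical contests: (electoral votes, Harris-minus-Trump margin in points).
contests = [(20, -1.0), (20, -1.0), (20, 5.0)]

rng = random.Random(1)
sweep = {}
for sigma in (0.5, 1.0, 2.0, 5.0):
    sweep[sigma] = mean_harris_ev(contests, sigma, 10000, rng)
    print(f"sigma = {sigma:3.1f} points -> Harris averages {sweep[sigma]:.1f} of 60 EV")
```

In this toy setup the narrow leads flip far more often than the wide one as the error grows, so the side holding the narrow leads loses expected votes first.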


Let's hope the wannabe dictator loses so he can finally be prosecuted for his crimes. 









Sunday, 22 March 2020

Growth Rates


Here I looked at the growth rates for several countries. After downloading the data, I plotted the number infected on a log plot, starting from each country's 100th infection.


This graph is a bit complex at first glance, but it can be easily distilled. The horizontal axis is the days since the 100th case. The vertical axis is a log scale ranging from 100 to 80,000. Also notice the diagonal white dashed lines. These lines mark fixed doubling times for the number infected in a country. So if a country is tracking between the 2-day and 3-day lines, it takes between 2 and 3 days for the number infected to double.
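The dashed reference lines and the doubling time of a country's curve follow from N(t) = 100 * 2^(t/Td), where Td is the doubling time in days. A small Python sketch of both calculations (the function names are mine, and the numbers in the example are illustrative rather than taken from the data):

```python
import math

def reference_line(doubling_days, day, start=100):
    """Cases on a constant-doubling-time line, `day` days after the 100th case."""
    return start * 2 ** (day / doubling_days)

def doubling_time(n_start, n_end, days):
    """Doubling time implied by growth from n_start to n_end over `days` days."""
    return days * math.log(2) / math.log(n_end / n_start)

# Illustrative numbers only: 100 cases growing to 800 in 9 days is three doublings.
print(doubling_time(100, 800, 9))
print(reference_line(2, 10))
```

On a log vertical axis each of these reference lines is straight, with a slope of log(2)/Td per day.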

Notice how the curves for China and South Korea have become horizontal. The data only goes to the 25th day after each country's 100th infection; the most recent data points are not plotted as they are off the graph.

If we remove some countries we can see a few countries of my personal interest.



Of note here is that the USA curve is crossing the Italy curve, which means infections in the USA are now growing faster than they were in Italy at the same point. Right now Canada appears to double every 2-3 days, whereas Australia doubles every 3-4 days.




Friday, 20 March 2020

Two Week Forecast (March 20 - April 4)



I downloaded data from here and made predictions for the next two weeks. The data was modelled in MATLAB with a fourth-order polynomial.
Here we can see that Canada will overtake Australia, with about 12,000 cases predicted two weeks from now. Australia will have about 7,700. The United States will pass 110,000 cases, and globally we can expect upwards of 900,000 by the fourth of April. Beyond these two weeks is when the effect of policy changes such as social distancing can be seen and the slope of these lines will start to decrease.

Monday, 16 March 2020

Short Term Predictions




It appears that the COVID-19 pandemic is unrelenting at the time of this writing. Today there are 170,000 infections globally. If I plot the data from the 17th of February until present day, the data fits a polynomial function of fourth order. I have not used data before this as around the 17th of February is when the number of cases around the world started to increase and cases in China started to plateau.


Over the next two weeks the total number of global cases is expected to exceed 700,000. If I take the derivative of the curve, I can see the number of new infections per day.
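The fit-and-extrapolate step can be sketched in Python as below. This is a pure-Python stand-in for the MATLAB fit, solving the least-squares normal equations directly. The case counts are synthetic (generated from a known quartic, not real data), and rescaling the day axis before fitting is my own choice to keep the linear system well conditioned.

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs

def polyval(coeffs, x):
    """Evaluate the polynomial with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def synthetic_cases(day):
    # Synthetic stand-in for the real counts; NOT actual COVID-19 data.
    return 70000 + 500 * day + 20 * day**2 + 1.5 * day**3 + 0.1 * day**4

SCALE = 28.0  # days; the fit is done in x = day / SCALE
days = list(range(28))
coeffs = polyfit([d / SCALE for d in days], [synthetic_cases(d) for d in days], 4)

def predict(day):
    """Total cases from the fitted quartic."""
    return polyval(coeffs, day / SCALE)

def new_per_day(day):
    """Derivative of the fitted quartic: new cases per day."""
    deriv = [k * c for k, c in enumerate(coeffs)][1:]
    return polyval(deriv, day / SCALE) / SCALE

print(f"day 41 (two weeks ahead): {predict(41):.0f} total, {new_per_day(41):.0f} per day")
```

Because a polynomial has no plateau, extrapolating it more than a couple of weeks past the data quickly stops being meaningful.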


Currently there are about 10,000 new infections per day; however, in just two weeks' time there will be nearly 70,000 new infections per day.

I'll come back in two weeks to revisit these predictions.


Sunday, 15 March 2020

COVID-19 Analysis


With SARS-CoV-2 causing the ongoing global COVID-19 pandemic, there is a lot of interesting data to be analysed. Getting hold of the data isn't very easy, but I managed to get it from the worldometers website, and using the Wayback Machine I was able to get data for a few countries of interest. I have made some interesting plots below, and they provide some details on the expected trajectory of the pandemic.



This is a basic plot of selected countries and the total number of infections tracing back from today. Of interest here is that China and South Korea have very similar curves, whereas the rest of the countries have clear exponential curves. I didn't fit the data in this post; perhaps I will later when there is more data available for these countries of interest. Notice the "S" shape of both the China and South Korea curves. This is a logistic curve, and we can see that the total number of infections has reached a plateau. China can do this because with its authoritarian government it can hold people against their will and force them into isolation. South Korea, unlike any other developed country, seems to have made progress via mass-scale testing alone. The other countries are either unable to have Chinese-style control or are several weeks to months away from mass-scale testing like South Korea.

Now take the derivative of the first set of data. This represents the number of new infections per day. Notice how China and South Korea are both trending down, which is due to their control of the virus. Norway appears to be trending down too; however, this is due to just a couple of data points and it could go back up.



What if we take another derivative? This is the rate of change of the number of new infections. A negative value here means the number of new infections per day is falling, which implies the effective reproduction number R is below 1 and the spread is starting to become contained. The point where the second derivative crosses zero is the inflection point covered in introductory calculus courses. China and South Korea both have second derivatives below zero. It does not look promising for everyone else.
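Both derivatives can be approximated from daily totals with simple differences. A minimal Python sketch, using a synthetic logistic curve in place of the real counts (the plateau, rate and midpoint below are made-up parameters):

```python
import math

def logistic(day, plateau=80000.0, rate=0.25, midpoint=30.5):
    """Synthetic S-shaped cumulative case curve; NOT real data."""
    return plateau / (1.0 + math.exp(-rate * (day - midpoint)))

def diff(values):
    """Day-over-day differences, a discrete stand-in for the derivative."""
    return [b - a for a, b in zip(values, values[1:])]

total = [logistic(day) for day in range(61)]
new_per_day = diff(total)    # first derivative: new infections per day
accel = diff(new_per_day)    # second derivative: change in new infections per day

# Index of the first negative second difference; it is centred on day turn + 1.
turn = next(i for i, a in enumerate(accel) if a < 0)
peak = new_per_day.index(max(new_per_day))
print(f"second difference turns negative around day {turn + 1}")
print(f"new infections per day peak between day {peak} and day {peak + 1}")
```

With real, noisy counts the second difference jumps around a lot, so some smoothing (for example a moving average) is usually needed before reading off a sign change.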


The last set of data I'll explore here is the number of infections per million. Here it looks like the United States is the best-off country and appears to be doing very well. The United States, Australia and Canada are all several weeks behind the other infected countries. I would expect Canada and Australia to have very similar numbers over the next few weeks owing to similar healthcare systems. Of all the countries mentioned here so far, the USA is the biggest wild card. There has been more testing at the hospital where I work over the last month than there has been in the entire United States, so the numbers for the United States are highly underreported. Also, the United States is the only country which does not have universal healthcare, and this could result in many people not being treated, which is something I have not heard about in any media post thus far.


Here is a zoomed-out view of the previous plot. Italy, Iran and Norway are some of the hardest hit countries. South Korea still has more cases per million; however, on their current trajectories Iran and Norway will surpass South Korea very soon.


I'll revisit this data over the next month or so and hopefully the world won't split into chaos and anarchy.

Now go wash your hands...

Wednesday, 12 February 2020

2019-nCOV Outbreak



Currently the 2019 novel coronavirus (2019-nCoV) is affecting mainland China along with 27 other countries and territories. It was first identified in Wuhan after 41 patients contracted pneumonia without a clear cause. I grabbed data from the Wiki page on the 2019-20 Wuhan coronavirus outbreak. The data so far, as of this blog post, fits a quadratic model. Here I make (albeit nonsensical) predictions about the future of this outbreak.

Here I have plotted data from January 16 2020 through the following four weeks, up to the writing of this blog post. The data fits a simple ax^2 + bx + c quadratic. The red dots represent the actual data for the number infected in mainland China, whereas the blue curves are the models. The hashed lines represent 10% errors from the model.



Here we can see how many would be infected at the end of 2020. The model predicts roughly 12 million infections. I've assumed nobody dies and that the rate of spread is consistent with current trends.


What if the virus was allowed to go unchallenged for a very long time? Well, sometime in 2029 the entire population of China would be infected. I've once again assumed nobody dies from the virus and that the infection rate stays the same as in the first 4 weeks for the next 10 years. To make the model slightly more accurate I have placed China's current and projected population as the red line. Even with this the model is going to be far from accurate, as this isn't Plague Inc where nobody cares about disease and doctors don't work.

Sunday, 3 November 2019

Mostly Closest Planet


It has been over 3.5 years since I last updated this dead blog. But I was inspired by a recent video by CGP Grey on YouTube. Here I answer his question using simulation and data. His video can be found here: https://www.youtube.com/watch?v=SumDHcnCRuU 

What is the closest planet to the Earth? The usual answers are Venus or Mars. It is true that these are the planets that get closest to the Earth, with Venus getting as close as 38M km at inferior conjunction and Mars getting as close as about 55M km at opposition. However, what planet is closest to the Earth on average? The answer is, surprisingly, Mercury. This is due to the fact that Mercury is never too far from the Earth. When on the far side of the Sun, at conjunction, both Mars and Venus can be as far from the Earth as their semi-major axis plus the semi-major axis of the Earth. Mercury, with its short semi-major axis and short orbital period, is never too far from the Earth.
Here is a plot of the inner planets. For simplicity's sake assume all planetary orbits are circular, with zero eccentricity and nil tilt in the orbital plane. Both are reasonable assumptions. Mercury does have a large precession due to the curvature of space in the Sun's vicinity; however, I'll ignore that here and it won't change the results dramatically. To plot these orbits I plotted them on the complex plane. The vertical axis is actually the imaginary component of the function I used to plot the orbit. For each orbit I used the equation r*e^(2*pi*i*t/T), where r is the orbital semi-major axis, T is the orbital period and t is time.



Here I took the differences between the orbit of Earth and the other inner planets. Notice that the curves are not sinusoidal. This is because the distance changes rapidly around closest approach and slowly when the planets are on opposite sides of the Sun. The dashed line is the average distance between the planets. We can see here that Mercury has the closest average distance from the Earth.
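The calculation can be sketched in a few lines of Python. This is a re-creation under the same circular, coplanar assumptions rather than the original script: the semi-major axes (AU) and periods (years) are rounded standard values, all planets are assumed to start lined up at t = 0, and the 100-year averaging window is my own choice.

```python
import cmath
import math

# (semi-major axis in AU, orbital period in years), rounded standard values.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
}

def position(name, t):
    """Position on the complex plane at time t (years): r*e^(2*pi*i*t/T)."""
    r, period = PLANETS[name]
    return r * cmath.exp(2j * math.pi * t / period)

def mean_distance(a, b, years=100.0, steps=20000):
    """Time-averaged distance in AU between planets a and b."""
    total = 0.0
    for k in range(steps):
        t = years * k / steps
        total += abs(position(a, t) - position(b, t))
    return total / steps

averages = {name: mean_distance("Earth", name) for name in PLANETS if name != "Earth"}
closest = min(averages, key=averages.get)
for name, d in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{name:8s} {d:.2f} AU on average")
print("closest on average:", closest)
```

Swapping "Earth" for another key in the dictionary gives the average distances from that planet instead.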

Unsurprisingly here is the distance between Mercury and the other inner planets. Earth does get closer than Venus to Mercury however there is nothing interesting here as the distance between the other planets and Mercury is as expected. 

Likewise for Venus: of the other planets, Mercury is the closest on average.


Now let's have a look at the outer planets. Here are the orbits of the gas giants, for scale.

Now we can plot the distance between Jupiter and the other planets. Unsurprisingly, Saturn, Uranus and Neptune follow each other in average distance from Jupiter. However, the inner planets are hard to see at this scale.

Zooming in and adjusting the time to a few years, we can see that once again Mercury is on average the closest planet to Jupiter. Now let's see what happens if we look at Neptune.
The time is now in centuries, to account for Neptune's 165 year orbital period. Uranus is the planet that makes the closest approach to Neptune, when the two planets are at opposition. The higher frequency curves of the other planets are due to their faster orbital speeds: each catches up with and passes Neptune once per orbit. Jupiter's curve has a period of roughly 12 years, which is close to its orbital period. Zooming in to look at the inner planets we have the following.

Once again the average distance is depicted by the dashed line. It is hard to see here which planet is the closest to Neptune on average. However, Mercury just edges out the others. Venus and Earth are not very far apart. The average distance between Mercury and Neptune is essentially Neptune's semi-major axis, which is almost the average distance between Venus and Neptune as well as between Earth and Neptune. This means that for a planet which is sufficiently far away, and which therefore has a much longer orbital period, the average distance is approximately the semi-major axis of the orbit of the outer planet. To determine if Mercury is in fact the closest planet to Neptune on average, a more robust simulation accounting for precession, orbital eccentricity and inclination should be done. I will not do such a calculation, as such an endeavour is quite complex.
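How narrowly Mercury edges out the others can be checked with a short Python sketch. Note that this swaps in a different technique from the time-stepped simulation: it averages the distance over the relative angle between the two planets, which assumes that over a long enough time every relative angle is equally likely. The semi-major axes are rounded standard values in AU.

```python
import math

def phase_average(r_inner, r_outer, samples=20000):
    """Mean distance between two circular coplanar orbits, averaged over angle."""
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * (k + 0.5) / samples
        # Law of cosines for the separation at relative angle theta.
        total += math.sqrt(r_inner**2 + r_outer**2
                           - 2 * r_inner * r_outer * math.cos(theta))
    return total / samples

NEPTUNE = 30.07  # AU
INNER = {"Mercury": 0.387, "Venus": 0.723, "Earth": 1.000}

avg = {name: phase_average(r, NEPTUNE) for name, r in INNER.items()}
for name, d in avg.items():
    approx = NEPTUNE + INNER[name] ** 2 / (4 * NEPTUNE)  # leading-order estimate
    print(f"{name:8s} {d:.4f} AU (R + r^2/(4R) gives {approx:.4f} AU)")
```

For r much smaller than R the average works out to roughly R + r^2/(4R), so the three inner planets differ by only a few thousandths of an AU. That is small enough for eccentricity and inclination to matter, which is why a fuller simulation would be needed to settle the question.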