We will recreate the charts from the NYT article "What Is the Real Coronavirus Death Toll in Each State?"

Initially, the charts looked like these, showing deaths above or below normal. They have since been corrected/changed to the following -

Sources of the datasets used -

OR

What's the purpose of this visualization?

Comparing recent totals of deaths from all causes can provide a more complete picture of the pandemic’s impact than tracking only deaths of people with confirmed diagnoses. Epidemiologists refer to fatalities in the gap between the observed and normal numbers of deaths as “excess deaths.”

Indeed, in nearly every state with an unusual number of deaths in recent weeks, that number is higher than the state’s reported number of deaths from Covid-19. On our charts, we have marked the number of official coronavirus deaths with red lines, so you can see how they match up with the total number of excess deaths.

Measuring excess deaths is crude because it does not capture all the details of how people died. But many epidemiologists believe it is the best way to measure the impact of the virus in real time. It shows how the virus is altering normal patterns of mortality where it strikes and undermines arguments that it is merely killing vulnerable people who would have died anyway.

Public health researchers use such methods to measure the impact of catastrophic events when official measures of mortality are flawed.

Measuring excess deaths does not tell us precisely how each person died. It is likely that most of the excess deaths in this period are because of the coronavirus itself, given the dangerousness of the virus and the well-documented problems with testing. But it is also possible that deaths from other causes have risen too, as hospitals have become stressed and people have been scared to seek care for ailments that are typically survivable. Some causes of death may be declining, as people stay inside more, drive less and limit their contact with others.

First we chart the excess deaths. Excess deaths are calculated as the difference between the observed all-cause deaths and the average expected deaths for each week. These data are available from the CDC, as mentioned in the Sources section above.
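As a quick illustration of that arithmetic, here is a minimal sketch on made-up numbers. The values are hypothetical and are not CDC figures; only the column names are taken from the CDC file loaded later in this post.

```python
import pandas as pd

# Hypothetical weekly counts, NOT real CDC data; column names mirror the CDC file.
toy = pd.DataFrame({
    "Week Ending Date": ["2020-03-28", "2020-04-04", "2020-04-11"],
    "Observed Number": [1500.0, 2300.0, 900.0],
    "Average Expected Count": [1000.0, 1050.0, 1020.0],
})

# Excess deaths = observed all-cause deaths minus the average expected count.
toy = toy.assign(excess=toy["Observed Number"] - toy["Average Expected Count"])
print(toy[["Week Ending Date", "excess"]])
```

A negative value, as in the last toy row, simply means fewer deaths were observed than expected for that week.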

import pandas as pd
import numpy as np
import altair as alt

alt.renderers.set_embed_options(actions=False)
RendererRegistry.enable('default')
uri = 'https://data.cdc.gov/api/views/xkkf-xrst/rows.csv?accessType=DOWNLOAD&bom=true&format=true'
data = pd.read_csv(uri)
data.head()
| | Week Ending Date | State | Observed Number | Upper Bound Threshold | Exceeds Threshold | Average Expected Count | Excess Lower Estimate | Excess Higher Estimate | Year | Total Excess Lower Estimate in 2020 | Total Excess Higher Estimate in 2020 | Percent Excess Lower Estimate | Percent Excess Higher Estimate | Type | Outcome | Suppress | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2017-01-14 | Alabama | 1130.0 | 1188.0 | False | 1029.0 | 0.0 | 101.0 | 2017 | 3582 | 5579 | 0.0 | 0.1 | Predicted (weighted) | All causes | NaN | NaN |
| 1 | 2017-01-21 | Alabama | 1048.0 | 1201.0 | False | 1042.0 | 0.0 | 6.0 | 2017 | 3582 | 5579 | 0.0 | 0.0 | Predicted (weighted) | All causes | NaN | NaN |
| 2 | 2017-01-28 | Alabama | 1026.0 | 1216.0 | False | 1057.0 | 0.0 | 0.0 | 2017 | 3582 | 5579 | 0.0 | 0.0 | Predicted (weighted) | All causes | NaN | NaN |
| 3 | 2017-02-04 | Alabama | 1036.0 | 1216.0 | False | 1057.0 | 0.0 | 0.0 | 2017 | 3582 | 5579 | 0.0 | 0.0 | Predicted (weighted) | All causes | NaN | NaN |
| 4 | 2017-02-11 | Alabama | 1058.0 | 1207.0 | False | 1053.0 | 0.0 | 5.0 | 2017 | 3582 | 5579 | 0.0 | 0.0 | Predicted (weighted) | All causes | NaN | NaN |

Extracting the data for NYC for the year 2020 -

nyc = data[(data['State'] == 'New York City') & (data['Year'] == 2020)]
nyc.Outcome.value_counts()
All causes                        82
All causes, excluding COVID-19    41
Name: Outcome, dtype: int64
nyc.Type.value_counts()
Predicted (weighted)    82
Unweighted              41
Name: Type, dtype: int64

We need the rows where Outcome is "All causes" and Type is "Predicted (weighted)". We can either filter them in pandas or do it from Altair itself. For now we are going with Altair.
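For comparison, the pandas route could look roughly like the sketch below. It uses a small invented DataFrame shaped like the NYC subset; the name `toy_nyc` and its counts are hypothetical, while the `Type` and `Outcome` labels are the ones shown in the value counts above.

```python
import pandas as pd

# Hypothetical rows shaped like the NYC subset; the counts are invented.
toy_nyc = pd.DataFrame({
    "Type": ["Predicted (weighted)", "Unweighted", "Predicted (weighted)"],
    "Outcome": ["All causes", "All causes", "All causes, excluding COVID-19"],
    "Observed Number": [1200.0, 1150.0, 1000.0],
})

# Keep only the weighted, all-cause rows using a boolean mask.
mask = (toy_nyc["Type"] == "Predicted (weighted)") & (toy_nyc["Outcome"] == "All causes")
all_causes_weighted = toy_nyc[mask]
print(all_causes_weighted)
```

Filtering in pandas shrinks the data passed to the chart, whereas filtering in Altair keeps the full DataFrame and applies the condition when the chart is rendered.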

Calculating excess deaths -

nyc = nyc.assign(excess = nyc['Observed Number'] - nyc['Average Expected Count'])
alt.Chart(nyc).mark_bar().transform_filter(alt.datum.Type=='Predicted (weighted)').transform_filter(alt.datum.Outcome=='All causes').encode(
    y='excess:Q',
    x='Week Ending Date:T'
)

Let's beautify it and color code the positive and negative numbers differently -

#collapse
bars = alt.Chart(nyc, height=600).mark_bar(width=9).transform_filter(alt.datum.Type=='Predicted (weighted)').transform_filter(alt.datum.Outcome=='All causes').encode(
    x=alt.X('Week Ending Date:T', title=None, axis=alt.Axis(grid=False, domain=False,format="%b")),
    y=alt.Y('excess:Q', title=None, axis=alt.Axis(domain=False, labelPadding=-50, position=-10, ticks=False, zindex=1, values=list(range(500,7500,500)))), 
    color = alt.condition(alt.datum.excess>0, alt.value('#ffab00'), alt.value('#8FB8BB'))
).properties(width=alt.Step(10)).configure_view(stroke=None)

bars

To add the highlighted grey rectangle, we build it as a separate rectangle chart and then layer it behind our bar chart.

source = pd.DataFrame([{'start': '2020-03-15', 'end': '2020-10-10', 'y2': 7000, 'y': -100}])

rect = alt.Chart(source).mark_rect(opacity=1, fill='#eee', xOffset=5, x2Offset=5).encode(
    x='start:T',
    x2='end:T',
    y2='y2:Q',
    y='y:Q'
)

# collapse
bars = alt.Chart(nyc, height=1600, width=225).mark_bar(width=5).transform_filter(alt.datum.Type=='Predicted (weighted)').transform_filter(alt.datum.Outcome=='All causes').encode(
    x=alt.X('Week Ending Date:T', title=None, axis=alt.Axis(grid=False, offset=23, domain=False, format="%b", tickCount=4)),
    y=alt.Y('excess:Q', title=None, scale=alt.Scale(domain=[0, 7000]), axis=alt.Axis(domain=False, labelPadding=-25, position=-10, ticks=False, zindex=1, values=list(range(500,7500,500)))), 
    color = alt.condition(alt.datum.excess>0, alt.value('#ffab00'), alt.value('#8FB8BB'))
)

(rect+bars).configure_view(stroke=None)