Medical spending per beneficiary data and data description was obtained from Data.Medicare.gov. The data displayed here shows average spending levels during hospitals’ Medicare Spending per Beneficiary (MSPB) episodes. An MSPB episode includes all Medicare Part A and Part B claims paid during the period from 3 days prior to a hospital admission through 30 days after discharge. These average Medicare payment amounts have been price-standardized to remove the effect of geographic payment differences and add-on payments for indirect medical education (IME) and disproportionate share hospitals (DSH).
The data included a total of 3,065 hospitals from 50 states with 7 claim types at 3 periods. The average spending per episode was given by hospital, by state and by nation. First we import the data and rename some of the origional variables names using the
The aim of this post is to explore whether
- The total average spending per complete episode vary across states
- The average spending per episode vary across states for agiven claim type and period
The average spending across providers by period and claim type within states are explored in another blog.
# Load R li aries library(dplyr) library(ggplot2) library(stargazer) library(maps)
# Call the data and rename variable names mdClm <- read.csv(paste("../Data/Medicare Hospital Spending by Claim.csv", sep="")) mdClm <- rename(mdClm, AvgSpndgHsptl=Avg_Spending_Per_Episode_Hospital, AvgSpndgState=Avg_Spending_Per_Episode_State, AvgSpndgNation=Avg_Spending_Per_Episode_Nation, PcntSpndgHsptl=Percent_of_Spending_Hospital, PctSpndgState=Percent_of_Spending_State, PctSpndgNation=Percent_of_Spending_Nation) head(mdClm, n=3)
|1||HELEN KELLER HOSPITAL||10019||AL||During Index Hospital Admission||Skilled Nursing Facility||0||0||0||0.00%||0.00%||0.00%||01/01/2015||12/31/2015|
|2||HELEN KELLER HOSPITAL||10019||AL||During Index Hospital Admission||Durable Medical Equipment||18||31||24||0.10%||0.16%||0.12%||01/01/2015||12/31/2015|
|3||HELEN KELLER HOSPITAL||10019||AL||During Index Hospital Admission||Carrier||1062||1480||1540||6.01%||7.71%||7.52%||01/01/2015||12/31/2015|
To explore the total average spending per period and claim type, first agregate the data using the package/function
dplyr/select. Then generate a bar plot for each period and claim type combination. Regardless of states, the highest average spending was for inpatient claim during index hospital admission followed by skilled nursing facility 1 through 30 days after discharge from index hospital admission (see the bar plot below). There was no spending for the claim types Home Health Agency, Hospice, Outpatient and Skilled Nursing Facility during hospital admission, and for the hospice claim 1 to 3 days Prior to Index Hospital Admission.
# Aggregate the data by Period and Claim Type to explore where the most average spending is: periodClmType <- select(mdClm, Period, Claim_Type, AvgSpndgHsptl) %>% filter(Claim_Type != "Total") %>% group_by(Period, Claim_Type) %>% summarize(AvgSpndng = mean(AvgSpndgHsptl , na.rm = TRUE), MedSpndng = median(AvgSpndgHsptl , na.rm = TRUE)) # bar plot of average spending by Claimy Type and Period ggplot(periodClmType, aes(fill=Period, y=MedSpndng, x=Claim_Type)) + geom_bar(position="dodge", stat="identity") + theme(axis.text.x=element_text(size=7, angle=45, vjust=0.5, colour="black"), panel.background = element_rect(fill = "white", colour = "gray"), panel.grid.major = element_line(colour = "lightgray"), legend.position = "top", aspect.ratio=4/10) + xlab("Claim Type") + ylab("Average Spending")+ guides(fill=guide_legend(nrow=3,byrow=TRUE))
First call distinct state and average spending data to avoid duplicate data values; i.e. one average spending per state, using
dplyr/filer and distinct function or any other appropriate package/function.
# Select distinct state and average spending totalStateClm <- mdClm %>% filter(Claim_Type == "Total") %>% distinct(State, AvgSpndgState) %>% arrange(AvgSpndgState)
Then summarize the average spending data. I used the
stargazer package to generate a well formatted summary statistics. As shown in the table below, the range of total average spending was less than $4,000, and the inter-quartile range is about \$1,000. The total spending across states is shown in the US map that was filled with the total average spending.
# Summary statistis (Mean, Std, Median, Quartiles, Min and Max) totalStateClmSpndng <- data.frame(AvgSpending=totalStateClm$AvgSpndgState) stargazer(totalStateClmSpndng, type = "text", title="Summary of Medicare Average Spending per Episode State", digits=0, median=TRUE, iqr=TRUE, min.max=TRUE)
Summary of Medicare Average Spending per Episode State ===================================================================== Statistic N Mean St. Dev. Min Pctl(25) Median Pctl(75) Max --------------------------------------------------------------------- AvgSpending 50 20,233 818 18,991 19,661.5 20,062 20,707 22,432 ---------------------------------------------------------------------
# Call the data required to plot USA map (altitude, latitude, region / state) # And merge with medicare spending data map <- map_data("state") states <- read.csv(paste("../Data/states.csv", sep="")) dataClm <- merge(totalStateClm, states, by.x="State", by.y="Abbreviation") dataClm$state <- tolower(dataClm$state)# Convert all state chracter into lower case
# Plot total average spending across USA states m <- ggplot(dataClm, aes(fill = AvgSpndgState)) m + geom_map(aes(map_id = state), map = map) + expand_limits(x = map$long, y = map$lat) + ggtitle("Total average spending for complete episode across states")+ theme(axis.title=element_blank(), axis.text=element_blank(), axis.ticks=element_blank(), legend.position="right", legend.title=element_blank()) + coord_fixed(ratio = 1.75)
As shown in the map the total average spending was highest in Nevada, Texas and Utah, and lowest in West Virginia, New Mexico and Iowa.
cat("States with highest average spending") head(totalStateClm, n=5) cat("States with lowest average spending") tail(totalStateClm, n=5)
States with highest average spending
States with lowest average spending
Here I generated a bar plot for each state per period and claim type. There is a variation across states for most of period and claim types.
# Aggregate the data by Period and Claim Type to explore where the most average spending is: agrgtStPrdClm <- mdClm %>% select(State, Period, Claim_Type, AvgSpndgHsptl) %>% filter(Claim_Type != "Total") %>% group_by(State, Period, Claim_Type) %>% summarize(AvgSpndng = mean(AvgSpndgHsptl , na.rm = TRUE), MedSpndng = median(AvgSpndgHsptl , na.rm = TRUE)) ggplot(agrgtStPrdClm, aes(x=State, y=MedSpndng)) + geom_bar(aes(fill = Period), position = "dodge", stat="identity") + facet_wrap(~Claim_Type, nrow=7, scales = "free_y") + theme(legend.position = "top", axis.text.x=element_text(size=6, angle=45, vjust=0.5, colour="black")) + xlab("States") + ylab("Median Spending") + guides(fill=guide_legend(nrow=3,byrow=TRUE))
This post helps to visualize average medicare spending distribution and whether there were huge differences across states per complete episode as well as per period and claim type. As expected, there were huge differences due to period and claim type, but not across states for the complete episode.