Natral Experiments using R
Authors: Andri O. Gerber1, Stefan Durrer2
# Load importent libararies
library(tidyverse)
library(DataExplorer)
library(ggplot2)
library(psych)
library(Synth)
base_path <- "./data/"
file_name <- "andersson_2019.csv"
dat <- read.csv(file = paste0(base_path,file_name),
sep = ";", header = T)
head(dat, 2)
## Countryno country year CO2_transport_capita GDP_per_capita gas_cons_capita
## 1 1 Australia 1960 2.092157 12290.40 399.4560
## 2 1 Australia 1961 2.047124 12370.51 407.4215
## vehicles_capita urban_pop pop_density
## 1 265.7101 81.529 1.337682
## 2 275.6872 81.941 1.364565
str(dat)
## 'data.frame': 690 obs. of 9 variables:
## $ Countryno : int 1 1 1 1 1 1 1 1 1 1 ...
## $ country : chr "Australia" "Australia" "Australia" "Australia" ...
## $ year : int 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 ...
## $ CO2_transport_capita: num 2.09 2.05 2.09 2.23 2.29 ...
## $ GDP_per_capita : num 12290 12371 12829 13668 13974 ...
## $ gas_cons_capita : num 399 407 427 449 476 ...
## $ vehicles_capita : num 266 276 284 322 335 ...
## $ urban_pop : num 81.5 81.9 82.3 82.7 83.1 ...
## $ pop_density : num 1.34 1.36 1.4 1.43 1.45 ...
## 'data.frame': 690 obs. of 9 variables:
## $ Countryno : int 1 1 1 1 1 1 1 1 1 1 ...
## $ country : chr "Australia" "Australia" "Australia" "Australia" ...
## $ year : int 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 ...
## $ CO2_transport_capita: num 2.09 2.05 2.09 2.23 2.29 ...
## $ GDP_per_capita : num 12290 12371 12829 13668 13974 ...
## $ gas_cons_capita : num 399 407 427 449 476 ...
## $ vehicles_capita : num 266 276 284 322 335 ...
## $ urban_pop : num 81.5 81.9 82.3 82.7 83.1 ...
## $ pop_density : num 1.34 1.36 1.4 1.43 1.45 ...
## vars n mean sd median trimmed mad
## Countryno 1 690 8.00 4.32 8.00 8.00 5.93
## country* 2 690 8.00 4.32 8.00 8.00 5.93
## year 3 690 1982.50 13.29 1982.50 1982.50 17.05
## CO2_transport_capita 4 690 2.07 1.34 1.81 1.88 0.95
## GDP_per_capita 5 680 19291.11 8337.65 18672.16 19074.90 9084.41
## gas_cons_capita 6 690 399.45 314.14 304.24 351.80 247.41
## vehicles_capita 7 690 374.81 193.64 379.94 376.36 213.39
## urban_pop 8 690 74.62 13.12 76.05 75.63 11.50
## pop_density 9 690 95.83 97.39 79.39 79.83 87.60
## min max range skew kurtosis se
## Countryno 1.00 15.00 14.00 0.00 -1.22 0.16
## country* 1.00 15.00 14.00 0.00 -1.22 0.16
## year 1960.00 2005.00 45.00 0.00 -1.21 0.51
## CO2_transport_capita 0.20 6.06 5.86 1.23 1.04 0.05
## GDP_per_capita 3656.60 43212.31 39555.71 0.26 -0.57 319.73
## gas_cons_capita 17.40 1405.09 1387.69 1.27 1.00 11.96
## vehicles_capita 8.03 825.02 816.98 -0.05 -0.67 7.37
## urban_pop 34.95 97.40 62.44 -0.78 0.40 0.50
## pop_density 1.34 350.54 349.21 1.18 0.50 3.71
DataExplorer::plot_missing(dat)
The whole data set consists of 690 observations from 15 European countries, covering the time span from 1960 to 2005. Column 1 represents a country code, column 2 the actual name of the country. Column 3 represents the year of the observation. Column 5 represents the observed carbon dioxide emissions per capita in the given year. This is used as the dependent variable in this analysis. Column 6 to 9 are used as predictors.
Firstly, the whole data set is inspected as a whole. It is apparent, that for the GDP predictor, 10 entries are missing. These can all be attributed to Poland, from the years 1960 to 1969. Other than that, no missings were identified.
library(dplyr)
library(knitr)
library(kableExtra)
library(gridExtra)
desc_stats <- psych::describe(dat)
desc_stats_rounded <- desc_stats
desc_stats_rounded[] <- lapply(desc_stats_rounded, function(x) if(is.numeric(x)) round(x, 2) else x)
kable(desc_stats_rounded, format = "html", caption = "Descriptive Statistics (Rounded to 2 Decimal Points)") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), full_width = F) %>%
column_spec(1, width = "3em") %>%
column_spec(2:NCOL(desc_stats_rounded), width = "5em")
vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Countryno | 1 | 690 | 8.00 | 4.32 | 8.00 | 8.00 | 5.93 | 1.00 | 15.00 | 14.00 | 0.00 | -1.22 | 0.16 |
country* | 2 | 690 | 8.00 | 4.32 | 8.00 | 8.00 | 5.93 | 1.00 | 15.00 | 14.00 | 0.00 | -1.22 | 0.16 |
year | 3 | 690 | 1982.50 | 13.29 | 1982.50 | 1982.50 | 17.05 | 1960.00 | 2005.00 | 45.00 | 0.00 | -1.21 | 0.51 |
CO2_transport_capita | 4 | 690 | 2.07 | 1.34 | 1.81 | 1.88 | 0.95 | 0.20 | 6.06 | 5.86 | 1.23 | 1.04 | 0.05 |
GDP_per_capita | 5 | 680 | 19291.11 | 8337.65 | 18672.16 | 19074.90 | 9084.41 | 3656.60 | 43212.31 | 39555.71 | 0.26 | -0.57 | 319.73 |
gas_cons_capita | 6 | 690 | 399.45 | 314.14 | 304.24 | 351.80 | 247.41 | 17.40 | 1405.09 | 1387.69 | 1.27 | 1.00 | 11.96 |
vehicles_capita | 7 | 690 | 374.81 | 193.64 | 379.94 | 376.36 | 213.39 | 8.03 | 825.02 | 816.98 | -0.05 | -0.67 | 7.37 |
urban_pop | 8 | 690 | 74.62 | 13.12 | 76.05 | 75.63 | 11.50 | 34.95 | 97.40 | 62.44 | -0.78 | 0.40 | 0.50 |
pop_density | 9 | 690 | 95.83 | 97.39 | 79.39 | 79.83 | 87.60 | 1.34 | 350.54 | 349.21 | 1.18 | 0.50 | 3.71 |
This first plot shows an overview of the GDP predictor. The brown line indicates the evolution of Sweden’s GDP in US-Dollars, whereas the blue line represents the mean of all OECD countries. It is apparent that the GDP more or less triples in the observed time frame of 45 years.
# Calculate the mean GDP per capita for all countries except Sweden
mean_gdp <- dat %>%
filter(country != "Sweden") %>%
group_by(year) %>%
summarize(mean_GDP = mean(GDP_per_capita, na.rm = TRUE))
# Create the plot
gdp <- ggplot(dat, aes(x = year, y = GDP_per_capita)) +
# Lines for all countries except Sweden
geom_line(data = subset(dat, country != "Sweden"), aes(group = country), color = "bisque3", linetype = "dashed") +
# Line for Sweden
geom_line(data = subset(dat, country == "Sweden"), color = "brown", linewidth = 1.5) +
# Line for the mean of all countries except Sweden
geom_line(data = mean_gdp, aes(x = year, y = mean_GDP), color = "blue", linewidth = 1.2) +
# Vertical lines for 1970, 1980, and 1989
geom_vline(xintercept = c(1970, 1980, 1989), color = "red", linetype = "solid") +
labs(
title = "GDP per capita over time",
x = "Year",
y = "GDP per capita (USD)"
) +
theme_minimal(base_size = 15) +
theme(
panel.background = element_rect(fill = "white", color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
plot.background = element_rect(color = "black", fill = NA)
)
gdp
The carbon dioxide emissions in kilotons per capita are shown below. The drop after 1980 indicates that the world’s energy crisis, which happened during the years between 1973 to 1979, clearly made an impact on the world’s energy consumption. Another drop is seen after 1991 for Sweden, whereas the mean emission level of the other OECD countries continued to rise. This drop is further investigated in this analysis, since this marks the year where Sweden implemented it’s carbon taxation reform.
# Calculate the mean emissions for all countries except Sweden
mean_emissions <- dat %>%
filter(country != "Sweden") %>%
group_by(year) %>%
summarize(mean_CO2 = mean(CO2_transport_capita, na.rm = TRUE))
# Create the plot
emissions <- ggplot(dat, aes(x = year, y = CO2_transport_capita)) +
# Lines for all countries except Sweden
geom_line(data = subset(dat, country != "Sweden"), aes(group = country), color = "bisque3", linetype = "dashed") +
# Line for Sweden
geom_line(data = subset(dat, country == "Sweden"), color = "brown", linewidth = 1.5) +
# Line for the mean of all countries except Sweden
geom_line(data = mean_emissions, aes(x = year, y = mean_CO2), color = "blue", linewidth = 1.2) +
# Vertical lines for 1970, 1980, and 1989
geom_vline(xintercept = c(1970, 1980, 1989), color = "red", linetype = "solid") +
labs(
title = expression(CO[2]~"emissions over time"), # Subscript in title
x = "Year",
y = expression(CO[2]~"emissions (kt)") # Subscript in y-axis label
) +
theme_minimal(base_size = 15) +
theme(
panel.background = element_rect(fill = "white", color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
plot.background = element_rect(color = "black", fill = NA)
)
emissions
The amount of fossil fuel powered vehicles per 1000 capita tends to continuously increase for the average OECD country, whereas for Sweden, the amount of vehicles stagnates after the implementation of the carbon taxation in 1991. It is assumed that Sweden’s population started to use alternatives for transportation to avoid higher prices for fossil fuels. Comparable to the GDP, the amount of vehicles per 1000 capita triples in the years from 1960 to 2005.
# Calculate the mean vehicles per capita for all countries except Sweden
mean_vehicles <- dat %>%
filter(country != "Sweden") %>%
group_by(year) %>%
summarize(mean_vehicles = mean(vehicles_capita, na.rm = TRUE))
# Create the plot
vehicles <- ggplot(dat, aes(x = year, y = vehicles_capita)) +
# Lines for all countries except Sweden
geom_line(data = subset(dat, country != "Sweden"), aes(group = country), color = "bisque3", linetype = "dashed") +
# Line for Sweden
geom_line(data = subset(dat, country == "Sweden"), color = "brown", linewidth = 1.5) +
# Line for the mean of all countries except Sweden
geom_line(data = mean_vehicles, aes(x = year, y = mean_vehicles), color = "blue", linewidth = 1.2) +
# Vertical lines for 1970, 1980, and 1989
geom_vline(xintercept = c(1970, 1980, 1989), color = "red", linetype = "solid") +
labs(
title = "Vehicles per 1000 capita over time",
x = "Year",
y = "Vehicles per 1000 capita"
) +
theme_minimal(base_size = 15) +
theme(
panel.background = element_rect(fill = "white", color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
plot.background = element_rect(color = "black", fill = NA)
)
vehicles
In the following plot, the amount of urban population is shown for each country. With about 80% from 1970 onwards, Sweden has a highly urbanized population, which is higher than in an average OECD country, where a mean of 74.6% is observed. The highest percentage of urbanization can be attributed to Belgium, with 97%, whereas the lowest urbanization level is observed in Portugal, with 34%.
mean_urban <- dat %>%
filter(country != "Sweden") %>%
group_by(year) %>%
summarize(mean_urban_pop = mean(urban_pop, na.rm = TRUE))
# Create the plot
urban <- ggplot(dat, aes(x = year, y = urban_pop)) +
# Lines for all countries except Sweden
geom_line(data = subset(dat, country != "Sweden"), aes(group = country), color = "bisque3", linetype = "dashed") +
# Line for Sweden
geom_line(data = subset(dat, country == "Sweden"), color = "brown", linewidth = 1.5) +
# Line for the mean of all countries except Sweden
geom_line(data = mean_urban, aes(x = year, y = mean_urban_pop), color = "blue", linewidth = 1.2) +
# Vertical lines for 1970, 1980, and 1989
geom_vline(xintercept = c(1970, 1980, 1989), color = "red", linetype = "solid") +
labs(
title = "Urban population over time",
x = "Year",
y = "Urban population in %"
) +
theme_minimal(base_size = 15) +
theme(
panel.background = element_rect(fill = "white", color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
plot.background = element_rect(color = "black", fill = NA)
)
urban
Lastly, the gasoline consumption per capita in oil equivalents is explored. Here, a significant drop in gasoline consumption can be observed for Sweden, after the taxation got implemented. The average of the OECD Countries does not show such a prominent decline in consumption of fossil fuels. Nevertheless, the fuel consumption in transportation seems to have only slightly risen and stagnated since 1980.
# Calculate the mean gasoline consumption per capita for all countries except Sweden
mean_gasoline <- dat %>%
filter(country != "Sweden") %>%
group_by(year) %>%
summarize(mean_gas_cons = mean(gas_cons_capita, na.rm = TRUE))
# Create the plot
gasoline <- ggplot(dat, aes(x = year, y = gas_cons_capita)) +
# Lines for all countries except Sweden
geom_line(data = subset(dat, country != "Sweden"), aes(group = country), color = "bisque3", linetype = "dashed") +
# Line for Sweden
geom_line(data = subset(dat, country == "Sweden"), color = "brown", linewidth = 1.5) +
# Line for the mean of all countries except Sweden
geom_line(data = mean_gasoline, aes(x = year, y = mean_gas_cons), color = "blue", linewidth = 1.2) +
# Vertical lines for 1970, 1980, and 1989
geom_vline(xintercept = c(1970, 1980, 1989), color = "red", linetype = "solid") +
labs(
title = "Gasoline consumption per capita over time",
x = "Year",
y = "Gasoline consumption in kg of oil equivalent"
) +
theme_minimal(base_size = 15) +
theme(
panel.background = element_rect(fill = "white", color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
plot.background = element_rect(color = "black", fill = NA)
)
gasoline
# load package
library("Synth")
attach(dat)
# Data preparation for synthetic control method
dataprep.out <- dataprep(
foo = dat, # Input dataset
# Select predictors
predictors = c(
"GDP_per_capita",
"gas_cons_capita",
"vehicles_capita",
"urban_pop"
),
predictors.op = "mean", # Use the mean of each predictor during the pre-treatment period
time.predictors.prior = 1980:1989, # Time period before treatment (pre-1990)
# Special predictors: Specific years of CO2 emissions to ensure the pre-treatment match
special.predictors = list( # Average CO2 emissions:
list("CO2_transport_capita", 1989, "mean"), # Emissions right before the taxation
list("CO2_transport_capita", 1980, "mean"), # Environmental policies from 1970 impact
list("CO2_transport_capita", 1970, "mean") # Start of world’s energy crisis
),
dependent = "CO2_transport_capita", # Dependent variable (outcome of interest)
unit.variable = "Countryno", # Variable that identifies the units (countries)
unit.names.variable = "country", # Variable that contains country names
time.variable = "year", # Time variable indicating the year
treatment.identifier = 13, # Sweden
controls.identifier = c(1:12, 14:15), # The control units (donor states)
time.optimize.ssr = 1960:1989, # Time periods used to optimize the fit between Sweden and synthetic control
time.plot = 1960:2005 # Time period for plotting (pre- and post-treatment)
)
# Running the synthetic control method
synth.out <- synth(data.prep.obj = dataprep.out,
method = "All")
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.001222443
##
## solution.v:
## 0.2188493 0.009719276 0.07826532 0.2126716 0.1832151 0.2838626 0.01341685
##
## solution.w:
## 0.0007695733 0.1954709 5.61912e-05 0.3842992 0.0003365576 0.09042864 0.001226448 0.0003818197 0.1765464 0.0006891624 0.0003987502 0.0004105377 0.06109071 0.08789508
# Generating summary tables of the results
synth.tables <- synth.tab(dataprep.res = dataprep.out,
synth.res = synth.out
)
Table 1 has been replicated from Andersson et al. (2019). All attributed weights by the synthetic control method were replicated identically. The special predictor time.optimize.ssr is not further specified by Andersson. Without any information, a time frame from 1960 to 1989 is used, to cover the whole period before the treatment takes place.
It appears that Andersson interchanged rows 2 and 3 of Table 1. In the original data set, the predictor Gasoline Consumption per Capita is listed before Motor vehicles (per 1,000 people). Therefore, it is assumed that the corresponding values for synthetic Sweden have not been switched correctly and can be considered as an artifact of the alternation of these rows. In the replication, the value for the predictor Motor vehicles (per 1’000 people is 406.8, which matches Andersson’s value for Gasoline consumption per Capita. The same applies for the replicated value 406.2 for Gasoline Consumption per Capita, which matches Anderssons value for Motor vehicles (per 1’000 people).
Even though the weights for each country as well as the predictors match Andersson’s approach, column 3 OECD Sample could not be matched correctly. The values for the replicated OECD Sample are considerably lower than the values Andersson specified. This can be attributed to the missing information about the special predictor time.optimization.ssr. Trial and Error with different time frames unfortunately did not yield a close replication of Andersson’s results. Since Andersson clearly specified the settings for each other predictor, they were not changed to obtain a better result for column 3 in Table 1.
library(knitr)
###################################################
### Table 1: CO2 Emissions From Transport Predictor Means Before Tax Reform
###################################################
synth.tables$tab.pred[1:7, ]
## Treated Synthetic Sample Mean
## GDP_per_capita 20121.479 20121.207 18706.663
## gas_cons_capita 456.178 406.220 425.510
## vehicles_capita 405.562 406.773 402.368
## urban_pop 83.099 83.082 75.238
## special.CO2_transport_capita.1989 2.506 2.478 2.371
## special.CO2_transport_capita.1980 2.005 2.033 2.121
## special.CO2_transport_capita.1970 1.729 1.672 1.637
# Re-Style the table such that it matches the one from the paper
# Changing column names
colnames(synth.tables$tab.pred)[colnames(synth.tables$tab.pred) %in%
c("Treated", "Synthetic", "Sample Mean")] <-
c("Sweden", "Synth. Sweden", "OECD Sample")
# interchanging variables and renaming them
synth.tables$tab.pred <- synth.tables$tab.pred[c(1, 3, 2, 4:7), ]
rownames(synth.tables$tab.pred)[1:7] <-
c("GDP per Capita",
"Motor vehicles (per 1,000 poeople)",
"Gasoline Consumption per Capita",
"Urban population",
"CO 2 from transport per capita 1989",
"CO 2 from transport per capita 1980",
"CO 2 from transport per capita 1970")
# round all values to .1
synth.tables$tab.pred <- round(synth.tables$tab.pred, 1)
kable(synth.tables$tab.pred[1:7, ],
caption = "Table 1: CO2 Emissions From Transport Predictor Means Before Tax Reform",
format = "markdown")
Sweden | Synth. Sweden | OECD Sample | |
---|---|---|---|
GDP per Capita | 20121.5 | 20121.2 | 18706.7 |
Motor vehicles (per 1,000 poeople) | 405.6 | 406.8 | 402.4 |
Gasoline Consumption per Capita | 456.2 | 406.2 | 425.5 |
Urban population | 83.1 | 83.1 | 75.2 |
CO 2 from transport per capita 1989 | 2.5 | 2.5 | 2.4 |
CO 2 from transport per capita 1980 | 2.0 | 2.0 | 2.1 |
CO 2 from transport per capita 1970 | 1.7 | 1.7 | 1.6 |
# Table 1: CO2 Emissions From Transport Predictor Means Before Tax Reform
round(synth.tables$tab.pred[1:7, ],1)
## Sweden Synth. Sweden OECD Sample
## GDP per Capita 20121.5 20121.2 18706.7
## Motor vehicles (per 1,000 poeople) 405.6 406.8 402.4
## Gasoline Consumption per Capita 456.2 406.2 425.5
## Urban population 83.1 83.1 75.2
## CO 2 from transport per capita 1989 2.5 2.5 2.4
## CO 2 from transport per capita 1980 2.0 2.0 2.1
## CO 2 from transport per capita 1970 1.7 1.7 1.6
We have replicated Figure 4 from Andersson et al. (2019), which compares per capita \(CO_2\) emissions from transport in Sweden against a constructed Synthetic Sweden using data from OECD countries that did not implement carbon taxes.
# Figure 4: Path Plot of per capita CO2 Emissions from Transport
path.plot(
synth.res = synth.out, # Synthetic control results
dataprep.res = dataprep.out, # Data prepared for plotting
Ylab = "Metric tons per capita (CO2 from transport)",
Xlab = "Year",
Ylim = c(0,3), # Set y-axis limits (CO2 emissions in metric tons)
Legend = c("Sweden", "synthetic Sweden"), # Labels
Legend.position = "bottomright" # Position of the legend
)
# Vertical dotted line at 1990 to indicate the introduction of the carbon tax and VAT
abline(v=1990, lty="dotted", lwd=2)
# Adding an arrow and text label to mark the carbon tax and VAT introduction
arrows(1987, 1.0, 1989, 1.0, col="black", length=.1) # Arrow indicating year 1990
Cex.set <- 1 # Text size for annotation
text(1981, 1.0, "VAT + Carbon tax", cex=Cex.set) # Add text annotation on the plot
The plot shows that before 1990, actual \(CO_2\) emissions in Sweden closely track those of Synthetic Sweden, indicating a good fit. This suggests that the synthetic control model accurately captures what Sweden’s emissions would have looked like without the carbon tax, validating the model’s pre-treatment performance.
Post-1990, following the introduction of the carbon tax in 1991, we observe a clear divergence: Sweden’s emissions begin to decline significantly, while emissions in Synthetic Sweden remain stable or increase slightly. This confirms the causal effect of the carbon tax in reducing \(CO_2\) emissions from transport in Sweden.
Andersson et al. (2019) conducted several robustness checks in his study, including leave-one-out, alternative predictor combinations, and weighting variations. We focus on placebo tests because they validate that Sweden’s \(CO_2\) reduction isn’t random, allow comparison with non-treated countries, provide clear visual validation, and are a standard robustness check in synthetic control studies.
- Placebo Tests: Applied to non-treated countries to ensure results are not random. - Leave-One-Out Test: Excluded one control country at a time to check if results are driven by a single country. - Alternative Predictor Combinations: Tested various combinations of lagged \(CO_2\) and GDP predictors for consistency. - Weighting Variations: Adjusted predictor weights (e.g., GDP, vehicles) for robustness across different configurations. - Regression Analysis: Checked the impact of GDP, unemployment, and fuel taxes to confirm the carbon tax’s role.
Now, we run placebo tests in-time for 1980 and 1970 to simulate an earlier carbon tax application. These tests check whether the synthetic control method still detects a significant treatment effect before the actual 1990 implementation (as shown in Figure 6: Placebo in-time tests 1980/1970). The aim is to verify that no significant treatment effect is detected before the real tax introduction in 1990. If a significant reduction in emissions is observed in these placebo tests, it would cast doubt on the robustness of the real treatment effect.
# Note: Main modification for placebo: Pre-treatment Years and Optimization Method (BFGS for efficiency)
# Placebo test in time for 1980
dataprep.out <-
dataprep(foo = dat,
predictors = c("GDP_per_capita" , "gas_cons_capita" , "vehicles_capita" ,
"urban_pop") ,
predictors.op = "mean" ,
# Pre-treatment period (prior to the placebo year 1980)
time.predictors.prior = 1970:1979 ,
special.predictors = list(
list("CO2_transport_capita" , 1979 , "mean"),
list("CO2_transport_capita" , 1970 , "mean"),
list("CO2_transport_capita" , 1965 , "mean")
),
dependent = "CO2_transport_capita",
unit.variable = "Countryno",
unit.names.variable = "country",
time.variable = "year",
# Sweden is treated as if the tax was introduced in 1980 (placebo intervention year)
treatment.identifier = 13,
controls.identifier = c(1:12,14:15),
# Optimize fit over the pre-treatment period (1960-1979)
time.optimize.ssr = 1960:1979,
time.plot = 1960:1990
)
synth.out <- synth(
data.prep.obj = dataprep.out,
method = "BFGS" # Optimization method for efficiency
)
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.001500508
##
## solution.v:
## 0.1562492 0.5783013 0.01223948 0.02842847 0.1830067 0.04177432 4.254e-07
##
## solution.w:
## 0.04915065 0.1607624 0.00286956 0.5193108 0.001615248 0.003992957 0.0009714243 0.01298892 0.1447179 0.03245229 0.004764193 0.007168443 0.001224774 0.05801042
path.plot(synth.res = synth.out,
dataprep.res = dataprep.out,
Ylab = "Metric tons per capita (CO2 from transport)",
Xlab = "Year",
Ylim = c(0,3),
Legend = c("Sweden","synthetic Sweden"),
Legend.position = "bottomright"
)
# Add line
abline(v=1980,lty="dotted",lwd=2)
arrows(1977,1.0,1979,1.0,col="black",length=.1)
Cex.set <- 1
text(1974,1.0,"Placebo tax",cex=Cex.set)
# Placebo test in time for 1970 (excluding Poland due to missing GDP data 1960-69)
dataprep.out <-
dataprep(foo = dat,
predictors = c("GDP_per_capita" , "gas_cons_capita" , "vehicles_capita" ,
"urban_pop") ,
predictors.op = "mean" ,
# Pre-treatment period (before 1970, for the placebo test)
time.predictors.prior = 1960:1969 ,
special.predictors = list(
list("CO2_transport_capita" , 1960:1970 , "mean")
),
dependent = "CO2_transport_capita",
unit.variable = "Countryno",
unit.names.variable = "country",
time.variable = "year",
# Sweden is the treated unit for this placebo test (as if the tax was introduced in 1970)
treatment.identifier = 13,
# Control units (excluding Poland due to missing data for 1960-69)
controls.identifier = c(1:9, 11:12, 14:15),
# Optimize fit over the pre-treatment period (1960-1969)
time.optimize.ssr = 1960:1969,
time.plot = 1960:1990
)
synth.out <- synth(
data.prep.obj = dataprep.out,
method = "All" # Use multiple optimization methods and choose the best one
)
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.001084686
##
## solution.v:
## 0.3688566 0.3596506 0.1663558 0.005073374 0.1000635
##
## solution.w:
## 0.1965028 0.05889596 0.001715361 0.3885827 0.1689035 0.002543625 0.005559365 0.002896784 0.101867 0.002251227 0.002979535 0.06649599 0.0008062062
path.plot(synth.res = synth.out,
dataprep.res = dataprep.out,
Ylab = "Metric tons per capita (CO2 from transport)",
Xlab = "Year",
Ylim = c(0,3),
Legend = c("Sweden","synthetic Sweden"),
Legend.position = "bottomright"
)
# Add line
abline(v=1970,lty="dotted",lwd=2)
arrows(1968,2.0,1969.5,2.0,col="black",length=.1)
Cex.set <- 1
text(1965,2.0,"Placebo tax",cex=Cex.set)
In both tests, synthetic Sweden closely follows actual Sweden, with no significant drop in emissions during the placebo periods. This confirms that the post-1990 \(CO_2\) reduction was not driven by pre-existing trends or random fluctuations, but by the introduction of the carbon tax.
The lack of a significant treatment effect in the placebo years validates the robustness of the real emission reductions observed after 1990, linking them directly to the carbon tax. This ensures that the reduction is not coincidental but a clear result of the policy.
Next, we conduct placebo tests in-space, where each control country is treated as if it had implemented the carbon tax. This comparison helps to evaluate whether Sweden’s observed \(CO_2\) reduction is larger than those seen by chance in non-treated countries. If Sweden’s effect is significantly larger, it strengthens the validity of the carbon tax effect (see Figure 7: Placebo in-space tests). These tests help confirm that the treatment effect seen in Sweden is robust and not a random outcome.
# Figure 7: Placebo in-space tests
# Initialize an empty matrix to store the gaps for all countries
store <- matrix(NA, length(1960:2005), 15)
colnames(store) <- unique(dat$country) # Assign country names as column names for the matrix
# Run placebo test for each country (control units) by treating them as if they were Sweden
for (iter in 1:15) {
# Prepare data for synthetic control method, treating each country as the "treated" unit (as before for Sweden)
dataprep.out <- dataprep(
foo = dat,
predictors = c("GDP_per_capita", "gas_cons_capita", "vehicles_capita", "urban_pop"),
predictors.op = "mean",
time.predictors.prior = 1980:1989, # Pre-treatment period
special.predictors = list(
list("CO2_transport_capita", 1989, "mean"), # Emissions right before the taxation
list("CO2_transport_capita", 1980, "mean"), # Environmental policies from 1970 impact
list("CO2_transport_capita", 1970, "mean") # Start of world’s energy crisis
),
dependent = "CO2_transport_capita",
unit.variable = "Countryno",
unit.names.variable = "country",
time.variable = "year",
treatment.identifier = iter, # Iteratively treat each country as if it had Sweden’s carbon tax
controls.identifier = c(1:15)[-iter], # All other countries as controls
time.optimize.ssr = 1960:1989,
time.plot = 1960:2005
)
synth.out <- synth(
data.prep.obj = dataprep.out,
method = "BFGS" # Optimization method for efficiency
)
# Calculate and store the gap (difference between actual and synthetic outcomes) for each country
store[, iter] <- dataprep.out$Y1plot - (dataprep.out$Y0plot %*% synth.out$solution.w)
}
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.02050707
##
## solution.v:
## 9.3128e-06 0.0923376 0.1222009 0.1665115 0.08936357 0.001468168 0.528109
##
## solution.w:
## 0.1620515 0.1736771 3.567e-07 1.705e-07 2.234e-07 0.4338658 1.858e-07 3.036e-07 1.522e-07 1.16e-07 2.091e-07 5.334e-07 1.504e-07 0.2304032
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.004687469
##
## solution.v:
## 0.1786289 0.2452266 0.1751274 2.6747e-06 0.05897566 0.1804502 0.1615886
##
## solution.w:
## 0.0008232382 0.0007222787 0.4163812 0.2735356 2.2884e-06 0.07497473 0.04407866 4.9257e-06 0.0003196672 0.0004211639 0.1875344 0.0006955736 4.95e-08 0.0005062649
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.06083566
##
## solution.v:
## 0.02030858 0.0858473 0.001873214 0.04612575 0.2590561 0.173567 0.4132221
##
## solution.w:
## 0.4131704 9.12e-08 2.827e-07 1.1334e-06 7.052e-07 2.951e-07 7.607e-07 2.646e-07 1.2644e-06 0.02209183 8.978e-07 2.453e-07 6.003e-07 0.5647312
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.007688681
##
## solution.v:
## 0.1246051 1.0164e-06 1.0216e-06 0.1464709 0.2170061 0.2492796 0.2626362
##
## solution.w:
## 8.37328e-05 0.3810731 6.28352e-05 0.0001366259 9.85466e-05 6.13448e-05 0.0002588917 0.0001395473 0.01887912 0.0001402585 0.000107282 0.4239469 0.1748305 0.0001813468
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.001882147
##
## solution.v:
## 0.1646675 0.4494074 0.06010378 0.00269735 0.2196895 0.03733842 0.06609608
##
## solution.w:
## 7.21474e-05 2.56016e-05 5.13796e-05 2.64834e-05 4.3381e-05 0.1565694 0.5954916 0.1532704 1.97969e-05 0.01088114 0.0001201818 4.24361e-05 0.08333645 4.96542e-05
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.01922805
##
## solution.v:
## 0.08295382 0.3454311 0.002240606 1.0337e-06 0.04102503 0.06933517 0.4590133
##
## solution.w:
## 0.0002543148 3.0399e-06 5.16028e-05 2.6747e-06 1.56208e-05 3.61e-08 1.081e-06 0.0004376103 4.3e-09 0.6141264 0.2726892 4.68527e-05 0.1123576 1.40317e-05
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.03539263
##
## solution.v:
## 0.03784595 0.000768615 0.2954628 0.3793174 1.25625e-05 4.3152e-06 0.2865884
##
## solution.w:
## 0.187154 0.4626867 3.9598e-06 7.1117e-06 2.63525e-05 3.1566e-06 1.0305e-05 0.289258 2.2749e-06 3.3937e-06 3.8578e-06 8.327e-06 0.06081844 1.42189e-05
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.008483476
##
## solution.v:
## 0.003527111 0.4440337 0.008294592 0.1980022 0.06134266 0.1240346 0.1607651
##
## solution.w:
## 0.0001788944 0.5245748 0.0001693645 0.0004462959 0.1695207 1.00429e-05 0.000420551 0.0002474942 3.27902e-05 0.3036448 2.84654e-05 0.000236314 0.0003594166 0.0001300287
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.003938095
##
## solution.v:
## 0.06037982 0.006379539 0.001403702 0.001306186 0.2544515 0.2201463 0.4559329
##
## solution.w:
## 0.123983 0.0006357359 1.6435e-06 0.000121653 7.36865e-05 2.70586e-05 6.68011e-05 7.47572e-05 1.69e-08 2.32501e-05 0.560372 0.1703673 5.83153e-05 0.1441948
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.09916474
##
## solution.v:
## 0.01743644 0.09699938 0.009606564 3.03911e-05 0.08555013 0.1961192 0.5942579
##
## solution.w:
## 4.29e-08 5.41e-08 4.7e-08 1.708e-07 3.99e-08 2.6e-09 4.64e-08 4.75e-08 2.26e-08 0.9432516 2.3e-09 6.34e-08 5.88e-08 0.05674779
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.05638189
##
## solution.v:
## 0.002178725 0.04691905 0.03373661 0.0001555206 0.2074219 0.1320455 0.5775427
##
## solution.w:
## 1.9e-09 5.8e-09 1e-09 3.5e-09 1.36e-08 0.7335169 5.9e-09 3.29e-08 4.3e-09 0.266483 5.29e-08 3.2e-09 6.4e-09 6e-10
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.004197226
##
## solution.v:
## 0.4551697 0.03818898 0.2861802 0.007204225 0.0076159 0.07857643 0.1270646
##
## solution.w:
## 1.8877e-05 0.1530591 1.11658e-05 2.05297e-05 2.11094e-05 0.223438 1.16554e-05 6.476e-06 0.1665131 0.0947701 0.3620982 1.94457e-05 6.7588e-06 5.5139e-06
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.001222443
##
## solution.v:
## 0.2188493 0.009719276 0.07826532 0.2126716 0.1832151 0.2838626 0.01341685
##
## solution.w:
## 0.0007695733 0.1954709 5.61912e-05 0.3842992 0.0003365576 0.09042864 0.001226448 0.0003818197 0.1765464 0.0006891624 0.0003987502 0.0004105377 0.06109071 0.08789508
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 0.006485386
##
## solution.v:
## 0.0001945692 0.001505231 1.26679e-05 0.00271703 0.3063433 0.2283888 0.4608384
##
## solution.w:
## 0.0007217353 0.0002035562 0.0001712312 8.83637e-05 0.003553432 0.1001119 0.003784691 0.0002558063 0.0004863681 6.29e-08 0.3857844 0.0002456163 0.4297522 0.07484062
##
##
## X1, X0, Z1, Z0 all come directly from dataprep object.
##
##
## ****************
## searching for synthetic control unit
##
##
## ****************
## ****************
## ****************
##
## MSPE (LOSS V): 1.085804
##
## solution.v:
## 0.01865466 0.0863511 0.01732622 0.005106409 0.286594 0.1123574 0.4736102
##
## solution.w:
## 7.1e-09 2.1e-09 1 2.5e-09 1.9e-09 1.2e-09 2.6e-09 1.6e-09 3.4e-09 1.2e-09 1.1e-09 1.4e-09 3.2e-09 2.5e-09
# Plotting the placebo results for comparison with Sweden
data <- store
rownames(data) <- 1960:2005
gap.start <- 1 # Start year for the gap
gap.end <- nrow(data) # End year for the gap
years <- 1960:2005 # Range of years to be plotted
gap.end.pre <- which(rownames(data) == "1989") # End of the pre-treatment period (1989)
# Calculate the Mean Squared Prediction Error (MSPE) for each country in the pre-treatment period
mse <- apply(data[gap.start:gap.end.pre, ]^2, 2, mean) # Pre-treatment error for all countries
sweden.mse <- as.numeric(mse[13]) # MSPE for Sweden
# Exclude countries with MSPE more than 20 times higher than Sweden’s (set to 1000 to include all)
data_ <- data # All countries
data <- data[, mse < 20 * sweden.mse] # Filter out countries with excessive MSPEs
# Countries with excessive MSPEs
colnames(data_)[!colnames(data_) %in% colnames(data)]
## [1] "Canada" "Iceland" "Poland" "Portugal"
## [5] "United States"
Cex.set <- 1 # Size for text elements
# Plot the gap for Sweden and the control countries
plot(years, data_[gap.start:gap.end, which(colnames(data_) == "Sweden")],
ylim = c(-1, 1),
xlab = "Year",
xlim = c(1960, 2005),
ylab = "Gap in metric tons per capita (CO2 from transport)",
type = "l",
lwd = 2,
col = "black",
xaxs = "i",
yaxs = "i")
# Add lines for each control country
for (i in 1:ncol(data_)) {
lines(years, data_[gap.start:gap.end, i], col = "lightblue", lty = "dotted")
}
# Recolor the lines for the data without excessive MSPEs
for (i in 1:ncol(data)) {
lines(years, data[gap.start:gap.end, i], lwd = 1, col = "grey")
}
# Add the Sweden line again in black
lines(years, data_[gap.start:gap.end, which(colnames(data_) == "Sweden")], lwd = 2, col = "black")
# Add vertical dotted line at the year 1990 (introduction of Sweden's VAT and carbon tax)
abline(v = 1990, lty = "dotted", lwd = 2)
# Add a horizontal dashed line at 0 to represent no difference
abline(h = 0, lty = "dashed", lwd = 2)
# Add a legend
legend("bottomleft", legend = c("Sweden", "control countries", "filtered countries (Canada, Iceland, Poland, Portugal, USA)"),
lty = c(1, 1), col = c("black", "gray", "lightblue"), lwd = c(2, 1, 1), cex = .8)
# Add an arrow
arrows(1987, -0.48, 1989, -0.48, col = "black", length = .1)
# Add text to label the VAT + Carbon Tax introduction
text(1981, -0.48, "VAT + Carbon Tax", cex = Cex.set)
# Add boundary lines for the years 1960 and 2005
abline(v = 1960)
abline(v = 2005)
# Add boundary lines for the gap range (-1 to 1)
abline(h = -1)
abline(h = 1)
After excluding the countries with poor pre-treatment fit, we see that Sweden’s post-1990 \(CO_2\) emissions trend shows a clear downward trend compared to the control countries. This indicates a significant reduction in \(CO_2\) emissions following the introduction of the carbon tax, with Sweden performing distinctly better than the countries that did not implement similar measures.
The placebo tests strengthen the argument that Sweden’s \(CO_2\) reduction is not due to random factors, as the control countries do not exhibit similar reductions. The gap between Sweden and the synthetic controls after 1990 demonstrates the specific impact of the carbon tax.
Email: andri.gerber@stud.hslu.ch. Department of Business, Lucerne University of Applied Sciences and Arts, Lucerne, Switzerland. HSLU. ORCiD ID.↩︎
Email: stefan.durrer@stud.hslu.ch. Department of Business, Lucerne University of Applied Sciences and Arts, Lucerne, Switzerland. HSLU.↩︎