Team members responsible for this notebook:
List the team members contributing to this notebook, along with their responsibilities:
- Tiffany Wong: Writing the topic sections and the challenges section
- Timothy Yau: Writing the findings section, interpreting maps we got
- Biying Li: Writing the findings section, explaining the regression results, loading graphs into the notebooks
- Cynthia Wu: Writing and editing the findings section
- Daniel Zezula: Writing and editing the findings section, running the slideshow function
Topic
As technology grew in importance from the later part of the 20th century onward, many new jobs were created that fall under the category of the high-technology industry. Our team wanted to analyze the spillover effect of employment in high-technology industries on employment in other industries in the United States. We decided to look specifically at the manufacturing and retail industries in order to assess the spillover effect.
To give a brief description of this topic, the technology spillover effect is the way growth in the technology sector impacts employment levels in other industries, such as manufacturing and retail. Part of this impact may arise because new technology opens up job opportunities in non-tech industries. Growth in the technology industry may also hurt these industries by creating technology that automates certain jobs and makes them obsolete. An increase in high-tech employment therefore goes along with an increase in new technology, which may either benefit or adversely affect the dependent industries.
To learn more about the technology spillover effect, some of our team members talked to Professor Moretti, an Economics professor at Cal who has done research and written articles on this topic. Besides giving us valuable advice on how to approach our analysis, he also graciously provided us with a data set that he compiled for one of his earlier studies. This data set contains the number of people employed in certain industries in specific cities and states for the years 1980, 1990, and 2000. He gathered his data from https://usa.ipums.org/usa/.
In our project, the high-technology industries are defined as those that work with computers and related equipment, computer and data processing services, engineering, architectural, and surveying services, and other similar jobs. We define the manufacturing industry as jobs that produce goods such as furniture and plastic products, as well as jobs in iron and steel foundries. The retail industry is defined as including jobs such as those in department stores, shoe stores, and food stores.
Some of the questions that we want to address in the project include:
- How does high-technology sector employment affect other industries?
- Which industry has been influenced most by the growth in the high-technology sector?
- Does the spillover effect affect states differently?
Challenges
It was difficult for us to find the data that specifically showed the number of people employed in certain industries in different states. We tried to look for employment data by state, but that proved to be a tedious process since we wanted to analyze the effect over the entire United States. Eventually this problem was solved after talking to Professor Moretti.
Our first difficulty was opening the raw data set in R. The original data set was so large that R crashed every time we tried to load it. We tried loading the data in Stata and SQL several times, and even opened it in Excel, before eventually loading it successfully in Stata, where we had to trim the data first. Since none of us really knew how to use Stata, it was a challenge to learn the right techniques for the initial trimming of the data.
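The team did this trimming in Stata. Purely as an illustration of the same idea, the sketch below streams a CSV row by row in Python and keeps only the columns and years needed, so the full file never has to fit in memory. The column names (`location`, `year`, `industry`, `employment`) and the sample rows are assumptions made for the example, not the layout of the actual IPUMS extract.

```python
import csv
import io

# Hypothetical column names and years; the real extract will differ.
KEEP_COLUMNS = ["location", "year", "industry", "employment"]
KEEP_YEARS = {"1980", "1990", "2000"}

def trim_rows(source, keep_columns=KEEP_COLUMNS, keep_years=KEEP_YEARS):
    """Yield trimmed rows one at a time instead of loading the whole file."""
    reader = csv.DictReader(source)
    for row in reader:
        if row["year"] in keep_years:
            # Drop every column that is not needed for the analysis.
            yield {col: row[col] for col in keep_columns}

# Tiny in-memory stand-in for the raw file (made-up values).
raw = io.StringIO(
    "location,year,industry,employment,unused\n"
    "Austin,1980,hightech,1200,x\n"
    "Austin,1985,hightech,1500,x\n"
    "Austin,1990,retail,9000,x\n"
)
trimmed = list(trim_rows(raw))
```

With a real file, `source` would be an open file handle, and the trimmed rows could be written straight back out with `csv.DictWriter` rather than collected in a list.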
Another difficulty was figuring out how to use the maps function to create graphs with our data. We needed a separate regression for each state, so we wrote a for loop that ran the regression 50 times, once per state, and saved the coefficients. Getting these loops right was a challenge in itself. We also had trouble writing the code to replace each city name with its corresponding state, since the maps function only accepts state names and not city names.
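The per-state loop can be sketched roughly as follows. The original analysis was written in R; this is a minimal Python illustration, not the team's code. The records are made up, the helper `ols_slope` is a hypothetical name, and the sketch assumes a simple regression of log manufacturing employment on log high-tech employment within each state.

```python
import math
from collections import defaultdict

def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx  # undefined if every x in the state is identical

# Made-up records: (state, high-tech employment, manufacturing employment).
records = [
    ("california", 100, 1000), ("california", 200, 2000), ("california", 400, 4000),
    ("texas", 100, 4000), ("texas", 200, 2000), ("texas", 400, 1000),
]

# Group the log-transformed observations by state.
by_state = defaultdict(lambda: ([], []))
for state, hightech, manuf in records:
    logs_x, logs_y = by_state[state]
    logs_x.append(math.log(hightech))
    logs_y.append(math.log(manuf))

# One regression per state; keep only the slope coefficient.
coefficients = {}
for state, (logs_x, logs_y) in by_state.items():
    coefficients[state] = ols_slope(logs_x, logs_y)
```

In this toy data the first state gets a positive slope and the second a negative one, which is the kind of per-state value a color-coded map would then display.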
Presentation of Findings
As described in the Data Analysis Notebook, we ran log regressions of employment in the retail and manufacturing industries on employment in the high-tech industry. With data for 1980, 1990, and 2000, we generated six regression results: LogRegM1980 (the log regression of manufacturing employment in 1980), LogRegM1990, LogRegM2000, LogRegR1980, LogRegR1990, and LogRegR2000.
The first graph below shows the regression results for the manufacturing industry. For all three years we consider, the coefficient for manufacturing employment is smaller than 1. Assuming both variables enter the regression in logs, the coefficient reads as an elasticity: a 1% increase in high-tech employment is associated with an increase of about 0.8% in manufacturing employment. Manufacturing employment therefore grows less than proportionally with high-tech employment, so the spillover effect on the manufacturing industry is small, and high-tech employment shows no obvious effect in stimulating manufacturing job growth.
For the retail industry, the findings are quite different. The second graph below shows the regression results for the retail industry. For all three years we consider, the regression coefficients for retail are above 1. Read as an elasticity from the log regression, this means that a 1% increase in high-tech employment is associated with an increase of about 1.5% in retail jobs. We can therefore say that high-tech employment has a positive spillover that stimulates growth in retail jobs.
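To make the two coefficients concrete, the sketch below converts a slope into the implied percentage change in the other industry's employment for a given percentage change in high-tech employment. It assumes both sides of the regression are in logs, and the function name is hypothetical. The values 0.8 and 1.5 are the approximate coefficients quoted above.

```python
def implied_percent_change(beta, percent_increase):
    """Percent change in y implied by a log-log slope `beta`
    when x rises by `percent_increase` percent."""
    return ((1 + percent_increase / 100) ** beta - 1) * 100

# Approximate coefficients reported in the text.
manufacturing_effect = implied_percent_change(0.8, 1)   # slightly under 0.8 percent
retail_effect = implied_percent_change(1.5, 1)          # slightly over 1.5 percent
```

For small changes the result is close to the coefficient itself, which is why a log-log slope is usually read directly as an elasticity.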
# Summary of the manufacturing regressions
from IPython.display import Image
Image(filename='../graphs/LogRegM_summ.png')
# Summary of the retail regressions
from IPython.display import Image
Image(filename='../graphs/LogRegR_summ.png')
from IPython.display import Image
Image(filename='../graphs/h8m8.jpg')
from IPython.display import Image
Image(filename='../graphs/h9m9.jpg')
from IPython.display import Image
Image(filename='../graphs/h0m0.jpg')
from IPython.display import Image
Image(filename='../graphs/h8r8.jpg')
from IPython.display import Image
Image(filename='../graphs/h9r9.jpg')
from IPython.display import Image
Image(filename='../graphs/h0r0.jpg')
We also performed a separate regression by state so that we could visualize the results on a color-coded map of the US. To do this, we first relabeled the location column, replacing every city with its corresponding state, so that we could analyze the differences in coefficients among states rather than among cities.
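The relabeling step amounts to a lookup from city to state. A minimal Python sketch follows (the original work was done in R). The lookup table, the row layout, and the choice of lowercase state names are all assumptions made for the example.

```python
# Hypothetical lookup table; a real one would cover every city in the data set.
CITY_TO_STATE = {
    "san francisco": "california",
    "san jose": "california",
    "austin": "texas",
}

def relabel_location(city, lookup=CITY_TO_STATE):
    """Return the state name for a city, or None if the city is unknown."""
    return lookup.get(city.strip().lower())

# Made-up rows standing in for the location column of the data set.
rows = [
    {"location": "San Jose", "hightech": 500},
    {"location": "Austin ", "hightech": 300},
    {"location": "Springfield", "hightech": 50},
]

relabeled = []
unmatched = []
for row in rows:
    state = relabel_location(row["location"])
    if state is None:
        unmatched.append(row["location"])  # record gaps instead of dropping them silently
    else:
        relabeled.append({**row, "location": state})
```

Collecting the unmatched names separately makes it easy to see which cities still need an entry in the lookup table before running the state-level regressions.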
After regressing the state data and visualizing the results on the US map, we found a contrast between the hightech-manufacturing regression and the hightech-retail regression. For the hightech-manufacturing regression, a majority of the states had negative regression coefficients, which indicates that increases in high-tech employment lead to a decrease in manufacturing employment. In the hightech-retail regression, on the other hand, a majority of the states had positive regression coefficients, which implies that increases in high-tech employment lead to an increase in retail employment. This makes sense: an increase in high-tech employment naturally leads to more innovation, with tech workers creating new technologies that may automate jobs in the manufacturing sector and make certain manufacturing jobs obsolete. It also increases employment in the retail industry, because new technology workers increase the demand for other services, such as food.
Based on these findings, we conclude that high-tech growth tends to curb employment in the manufacturing industry, while the same growth tends to spur employment growth in the retail industry. This supports our prediction that high-tech employment growth would adversely affect manufacturing employment while benefiting retail employment.
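# State-level map of the hightech-manufacturing coefficients
from IPython.display import Image
Image(filename='../graphs/mapplot-manu.png')
# State-level map of the hightech-retail coefficients
from IPython.display import Image
Image(filename='../graphs/mapplot-retail.png')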