The first part shows some basic statistics on the Cultural Data Project. To get better insight into how these organizations make their money and the different ways they monetize their services, I looked at the correlation of their revenue with other metrics. This could be quite important: if organizations knew which marketing channels matter most for generating revenue, they could allocate both their effort and their spending more efficiently to increase their revenue and, hopefully, their profits.
Advertising vs. Revenue
As the graph above shows, there is a correlation between advertising and revenue. However, many data points have zero for advertising. This could be because those organizations do not spend any money on advertising, or because the data are incomplete. The latter seems more plausible, especially since values close to zero are absent from the dataset. If we exclude the incomplete data points, the correlation is strong. Let's look at the correlation more closely with a linear regression.
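The effect of the zero-advertising records on the correlation can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the `records` list holds made-up numbers rather than Cultural Data Project values, and the helper names `pearson_r` and `drop_zero_advertising` are hypothetical. It uses only the standard library so that the formula for the coefficient stays visible.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def drop_zero_advertising(records):
    """Keep only (advertising, revenue) pairs with nonzero advertising."""
    return [(adv, rev) for adv, rev in records if adv > 0]


# Made-up illustrative numbers, NOT values from the Cultural Data Project.
records = [(0, 500), (0, 90), (0, 1200),
           (10, 110), (20, 190), (30, 320), (40, 390)]

adv_all, rev_all = zip(*records)
adv_pos, rev_pos = zip(*drop_zero_advertising(records))

r_all = pearson_r(adv_all, rev_all)
r_pos = pearson_r(adv_pos, rev_pos)
print(f"r with zeros: {r_all:.3f}, r without zeros: {r_pos:.3f}")
```

Dropping the zero rows is only justified if they really are missing values; if some organizations genuinely spend nothing on advertising, removing them would overstate the correlation.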
Linear Regression on Advertising vs. Revenue
Now, that is much better. The correlation is much clearer now; the correlation coefficient shown is the Pearson correlation coefficient. But the data contain more than total revenue: they also include earned revenue and contributed revenue. Earned revenue is income from selling the organization's goods and services, whereas contributed revenue is the money an organization collects from contributors. Let's look at earned revenue first:
Linear Regression on Advertising vs. Earned Revenue
Linear Regression on Advertising vs. Contributed Revenue
To my surprise, the correlation between advertising and contributed revenue is actually higher than the correlation with earned revenue. Does this imply that advertising attracts more contributors? Maybe, but not necessarily. Most likely it is because most of the organizations depend heavily on contributions, since their earned revenues are quite modest, as the graphs above show.
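The comparison between the two regressions can be reproduced with a short sketch like the one below. It assumes an ordinary least-squares fit of revenue on advertising, and every number in it is made up for illustration; the function name `linear_fit` and the three lists are hypothetical, not taken from the dataset.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("fit is undefined for constant input")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Made-up illustrative numbers, NOT values from the Cultural Data Project.
advertising = [10, 20, 30, 40, 50]
revenue_by_type = {
    "earned": [12, 15, 30, 28, 45],
    "contributed": [100, 210, 290, 410, 500],
}

fits = {name: linear_fit(advertising, values)
        for name, values in revenue_by_type.items()}

for name, (slope, intercept, r) in fits.items():
    print(f"{name}: slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")

# Share of total revenue that comes from contributions in this toy data.
total_earned = sum(revenue_by_type["earned"])
total_contributed = sum(revenue_by_type["contributed"])
contributed_share = total_contributed / (total_earned + total_contributed)
print(f"contributed share of revenue: {contributed_share:.1%}")
```

Printing the contributed share next to the two coefficients makes it easier to judge whether a higher correlation reflects the effect of advertising or simply the fact that contributions dominate revenue.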