About The Rankings

The Data Science Ranking ranks data science tools by popularity. The ranking is recalculated monthly using metrics pulled from a variety of sources. It is created and managed by the team behind Dremio, an open-source data tool. If you have feedback or suggestions for tools to add to the ranking, email us at contact@dremio.com.

Rank  Change  Name          Type      Score
1     0       Python        Language  927
2     0       SQL           Language  428
3     0       R             Language  234
4     0       MATLAB        Language  119
5     0       TensorFlow    Library   83
6     0       Scala         Language  75
7     0       Pandas        Library   70
8     0       NumPy         Library   53
9     0       SciPy         Library   42
10    3       SAS           Software  42
11    3       Anaconda      Software  39
12    1       Spark         Software  34
13    3       scikit-learn  Library   34
14    2       Keras         Library   31
15    0       Scrapy        Library   19
16    3       KNIME         Software  12
17    1       RapidMiner    Software  9
18    1       Statsmodels   Library   7
19    1       Theano        Library   7
20    4       Dremio        Software  6
21    0       Arrow         Library   5


Our popularity score is a weighted average of a handful of metrics. The score is a relative value: the score of one tool is only meaningful when compared with the score of another. The metrics we use, in rough order of importance, are:

Technical Interest - Calculated from each tool's total number of tagged questions and question views within the last year on Stack Overflow.

Search Presence - Calculated using the total number of results returned by search engines as well as the monthly search volume for keywords. For the count of search results we use each term in conjunction with its category to increase the relevance of the search - e.g. "Tableau Software". For monthly search volume we use each term in conjunction with an action-oriented qualifier - e.g. "Install NumPy".

Job Interest - Calculated using the number of people on LinkedIn who have shown interest in each topic and the number of LinkedIn job postings that include the term.

Search Trend - Calculated with Google Trends data. We compare the tool's current search volume with its search volume over the past year.

Domain Strength - Calculated from the total number of external domains linking to the tool's website.
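
As an illustration only, a weighted average over the five metrics above could be sketched in Python as follows. The weights, the max-based normalization, and every function and key name here are assumptions made for this sketch; the actual weights and scaling behind the ranking are not stated on this page.

```python
from typing import Dict, List

# Hypothetical weights that follow the stated rough order of importance.
# They sum to 1.0 so the result is a weighted average. Not the real values.
WEIGHTS: Dict[str, float] = {
    "technical_interest": 0.30,
    "search_presence": 0.25,
    "job_interest": 0.20,
    "search_trend": 0.15,
    "domain_strength": 0.10,
}


def normalize(raw: Dict[str, float]) -> Dict[str, float]:
    """Scale one metric across all tools so the largest value becomes 1.0.

    Assumed scaling step: it makes metrics with different units comparable.
    """
    top = max(raw.values())
    if top <= 0:
        return {tool: 0.0 for tool in raw}
    return {tool: value / top for tool, value in raw.items()}


def popularity_scores(metrics: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Combine per-metric values into one relative score per tool.

    `metrics` maps a metric name to {tool name: raw value}.
    """
    scores: Dict[str, float] = {}
    for name, weight in WEIGHTS.items():
        for tool, value in normalize(metrics[name]).items():
            scores[tool] = scores.get(tool, 0.0) + weight * value
    return scores


def rank(scores: Dict[str, float]) -> List[str]:
    """Return tool names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)
```

Because each metric is scaled against the best-performing tool, the resulting scores are relative, which matches the note above that one tool's score is only meaningful next to another's.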