Public data dump download numbers seem inconsistent with actual

chester.burbidge

07 Jan, 2018 11:38 AM

I'm trying to get the top 1000 downloaded gems for a personal project.

I've downloaded the data from https://rubygems.org/pages/data and restored it into a PostgreSQL database. When I join the versions to get the download counts with the command:

COPY (
    SELECT r.name, v.full_name, v.authors, d.count
    FROM versions v
    JOIN rubygems r ON v.rubygem_id = r.id
    JOIN gem_downloads d ON v.id = d.id
) TO '/tmp/gem_stats.csv' DELIMITER ',' CSV HEADER;

and then analyse and sort the results by most downloaded, I get wildly different results to the page https://rubygems.org/stats?page=1

Anyone know why this might be?

  1. Support Staff 1 Posted by Aditya Prakash on 22 Apr, 2018 01:13 PM

    Hi chester,

    Sorry about the delay in our response. I am not sure how we can help you without knowing your complete process. The page you mentioned uses this method to get the top gems.
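
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the query in the question joins gem_downloads to versions on d.id, which is the row id of gem_downloads itself. If the dump links downloads to versions through a separate column (assumed here to be called version_id, which this thread does not confirm), that join would pair each version with some unrelated row's count. The stats page also ranks whole gems, while the query returns one row per version, so the counts would need to be summed per gem before comparing. The sketch below uses an in-memory SQLite database with made-up rows and assumed column names to show how the two join conditions diverge.

```python
import sqlite3

# Toy stand-in for the restored dump. The gem_downloads columns
# (rubygem_id, version_id, count) are ASSUMPTIONS for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rubygems (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE versions (id INTEGER PRIMARY KEY, rubygem_id INTEGER, full_name TEXT);
CREATE TABLE gem_downloads (
    id INTEGER PRIMARY KEY, rubygem_id INTEGER, version_id INTEGER, count INTEGER
);

INSERT INTO rubygems VALUES (1, 'alpha'), (2, 'beta');
INSERT INTO versions VALUES
    (10, 1, 'alpha-1.0'), (11, 1, 'alpha-2.0'), (12, 2, 'beta-1.0');
-- gem_downloads.id happens to overlap versions.id but refers to other versions.
INSERT INTO gem_downloads VALUES
    (10, 2, 12, 500),
    (11, 1, 10, 300),
    (12, 1, 11, 400);
""")


def per_gem_totals(conn, join_condition):
    """Sum per-version counts into one total per gem, highest first."""
    query = f"""
        SELECT r.name, SUM(d.count) AS total
        FROM versions v
        JOIN rubygems r ON v.rubygem_id = r.id
        JOIN gem_downloads d ON {join_condition}
        GROUP BY r.name
        ORDER BY total DESC, r.name
    """
    return conn.execute(query).fetchall()


# Join used in the question: matches on the gem_downloads row id.
by_row_id = per_gem_totals(conn, "v.id = d.id")

# Join on the assumed foreign key to versions.
by_version_id = per_gem_totals(conn, "d.version_id = v.id")

print(by_row_id)
print(by_version_id)
```

In this toy data both joins run without error but give different totals per gem, so a mismatch of this kind would not be obvious from the query alone. Checking the actual columns of gem_downloads in the restored database (for example with \d gem_downloads in psql) would confirm or rule this out. It may also be worth checking whether the table contains rows that match no version, such as per-gem totals.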
