Removed dependency on csv gem for load_movielens
ankane committed Jun 10, 2024
1 parent 5bf3a58 commit cb9ba53
Showing 4 changed files with 9 additions and 9 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
+## 0.4.2 (unreleased)
+
+- Removed dependency on `csv` gem for `load_movielens`
+
 ## 0.4.1 (2024-05-23)

 - Reduced memory for `item_recs` and `similar_users`
1 change: 0 additions & 1 deletion Gemfile
@@ -11,4 +11,3 @@ gem "matrix" # for daru
 gem "rover-df"
 gem "ngt", ">= 0.3.0"
 gem "faiss"
-gem "csv"
1 change: 0 additions & 1 deletion gemfiles/activerecord72.gemfile
@@ -11,4 +11,3 @@ gem "matrix" # for daru
 gem "rover-df"
 gem "ngt", ">= 0.3.0"
 gem "faiss"
-gem "csv"
12 changes: 5 additions & 7 deletions lib/disco/data.rb
@@ -1,23 +1,21 @@
 module Disco
   module Data
     def load_movielens
-      require "csv"
-
       item_path = download_file("ml-100k/u.item", "https://files.grouplens.org/datasets/movielens/ml-100k/u.item",
         file_hash: "553841ebc7de3a0fd0d6b62a204ea30c1e651aacfb2814c7a6584ac52f2c5701")
       data_path = download_file("ml-100k/u.data", "https://files.grouplens.org/datasets/movielens/ml-100k/u.data",
         file_hash: "06416e597f82b7342361e41163890c81036900f418ad91315590814211dca490")

-      # convert u.item to utf-8
-      movies_str = File.read(item_path).encode("UTF-8", "ISO-8859-1")
-
       movies = {}
-      CSV.parse(movies_str, col_sep: "|") do |row|
+      File.foreach(item_path) do |line|
+        # convert u.item to utf-8
+        row = line.encode("UTF-8", "ISO-8859-1").split("|")
         movies[row[0]] = row[1]
       end

       data = []
-      CSV.foreach(data_path, col_sep: "\t") do |row|
+      File.foreach(data_path) do |line|
+        row = line.split("\t")
         data << {
           user_id: row[0].to_i,
           item_id: movies[row[1]],
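For readers who want to try the CSV-free approach in isolation, here is a minimal sketch of the same pattern: read line by line with `File.foreach`, transcode each line from ISO-8859-1 to UTF-8, and split on the delimiter. The helper names `read_titles` and `read_ratings` and the sample rows are hypothetical and not part of Disco; only the `File.foreach`, `encode`, and `split` calls mirror the diff above.

```ruby
require "tempfile"

# Hypothetical helper (not part of Disco): builds an id => title hash from a
# pipe-delimited, ISO-8859-1 encoded file shaped like u.item.
def read_titles(path)
  titles = {}
  File.foreach(path) do |line|
    # reinterpret the line's bytes as ISO-8859-1 and transcode to UTF-8
    row = line.encode("UTF-8", "ISO-8859-1").split("|")
    titles[row[0]] = row[1]
  end
  titles
end

# Hypothetical helper (not part of Disco): reads a tab-delimited file shaped
# like u.data and resolves each item id to its title.
def read_ratings(path, titles)
  ratings = []
  File.foreach(path) do |line|
    row = line.split("\t")
    ratings << { user_id: row[0].to_i, item_id: titles[row[1]] }
  end
  ratings
end

# Made-up sample rows, written as raw bytes so the first file really is
# ISO-8859-1 on disk.
Tempfile.create("items") do |items|
  Tempfile.create("ratings") do |ratings|
    File.binwrite(items.path, "1|Café Society (1995)|01-Jan-1995\n".encode("ISO-8859-1"))
    File.binwrite(ratings.path, "196\t1\t3\t881250949\n")

    titles = read_titles(items.path)
    p read_ratings(ratings.path, titles)
  end
end
```

Two behaviors are worth keeping in mind with plain `split`. The last field of each line keeps its trailing newline (`"a\tb\n".split("\t")` returns `["a", "b\n"]`), which is harmless when that field goes through `to_i` but needs a `chomp` if it is used as a string. Also, unlike the `csv` library, `split` does not interpret quoted fields, so this approach only fits files whose fields never contain the delimiter.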