Ruby on Rails — Importing Data from an Excel File

Photo by Rich Tervet on Unsplash

Importing data from an Excel spreadsheet doesn’t need to be difficult. While Rails doesn’t include a native utility to handle these file types, there are several gems that make it quite easy to read/write Excel spreadsheets.

In this post, we will create a rake task and use the roogem to read from an Excel file and import data into the database.

Setup

We’ll start by adding the roo gem to our project. If you’re using Bundler 1.15 or higher, you can use the command.

This will add roo to the and install it. Alternatively, you can update the and install it manually.

Add this line to your .

And install.

Creating the Rake Task

Now that we’ve added roo, we can start working on the actual feature.

Generate the rake task.

This will generate the file in where “import" is the namespace and “data" is the task name.

Now that our task is created, let’s update the description. We’ll also add a simple puts command and ensure that it runs.

Run this command in your console.

Importing the Data

To keep it simple, I have a very basic user schema. Where a user has two fields, name and email. I’ve added the spreadsheet file at lib/data.xlsx which contains several rows, each representing a new user to be created.

As you can see, the first row of the spreadsheet represents the headers. Everything else is the actual data that we need. Let’s start implementing roo so we can map over this data and create the users.

First, we will require roo and open the spreadsheet within the task.

Next, lets grab the first row of the spreadsheet since we know this is the header row. We’ll use this later to create a hash when mapping over the rows with the data we need.

Now we can map over the spreadsheet rows and extract the user data. Here is what the code looks like.

Let’ walk through this.

We map over each row in the spreadsheet using .

If we’re on the first row (), we want to continue to the next iteration because this is the header row.

Next, we create a new hash that contains the data from the current row. We build this hash using the array that we grabbed earlier and the current that we are on in the loop. So, what is this line doing?

Let’s start with Array#transpose. I’m not going to dive into the transpose method but it essentially turns columns into rows when you have a multi-dimensional array. Here’s a quick visual.

This is what the array looks like before calling .

And this is what is returned when calling .

When passing the above array to , we get a back a new hash where the is the first item in each nested array and its value is the second. In our case, the hash matches the user fields we need to create a new database entry.

Then we check to see if a user already exists with the current email address. If it does, we print some text to the console and move on to the next iteration without saving.

And finally, if the user doesn’t already exist, we can create a new user instance with the fields generated from the current row and save the new user in the database.

The complete code should look something like this.

That’s it! Now you can run this task and import users from the spreadsheet.

While this approach uses a very simple data structure, it’s a good starting point even for more complex situations.

More about:

Gabriel Martin