LOAD DATA INFILE vs. INSERT in MySQL: Why, How, When?

This article will demonstrate the differences between LOAD DATA INFILE and INSERT and explain when you should use both for data entry in MySQL.

a month ago   •   5 min read

By Lukas Vileikis
Table of contents

Introduction

If you have been working with MySQL for a while, you probably already know that MySQL offers multiple methods for importing data. One of those – the INSERT INTO statement – could be considered a standard way to do so. However, MySQL also has another way that dumps can be imported – LOAD DATA INFILE. Such a statement might be helpful if we find ourselves working with a lot of data inside our database instances. However, when we are facing those two choices, we probably are not sure which one of them is better – that's what we are looking into today.

INSERTing Data

To begin with, the INSERT statement, as its name suggests allows us to insert data into a table. In its most basic form, the statement looks like so:

INSERT INTO demo_table (column_1) VALUES ('Demo Data');

As you can see, the statement is comprised of a few parts:

  1. The INSERT INTO statement tells MySQL we want to INSERT data instead of SELECTing, deleting, or updating it.
  2. demo_table is the name of the table into which we want to insert the data.
  3. column_1 refers to the column name of the column to which we want to insert data.
  4. VALUES specifies that values will be set after this parameter.
  5. 'Demo Data' inserts a row with the text 'Demo Data' inside a column column_1.

INSERT Examples

After we would run such a SQL statement, some demo data will be added to our table. We can also run INSERT statements like so:

INSERT INTO arctype (column, column_2) VALUES ('Demo', 'Demo 2');

In this case, a value of Demo would be inserted into a column called column and a value of Demo 2 would be inserted into column_2.

Also, keep in mind that when we're adding values for every column in the table, we don't need to specify columns:

INSERT INTO arctype VALUES ('Demo', Demo 2');

It would work just the same as an earlier example.

Also, we could make use of inserting multiple rows at once, a query like so:

INSERT INTO arctype (demo_1, demo_2) 
VALUES ('Demo Data', 'Demo 2'),('Demo Data Again', 'Data123'),('Data here', 'Data here too');

It would flawlessly execute too! The upside of this is that we also reduce calls made to the database itself.

When INSERT Doesn't Work

However, the INSERT query also comes with some caveats: the data inserted into the table must conform with the data type of the column set in the table. What that means is that if, for example, we would run a query like so:

INSERT INTO arctype (number_column) values ('Demo');

Where number_column is of type INT, MySQL would not execute the query and instead would return an error like so:

#1366 - Incorrect integer value: 'Demo' for column 'number_column' at row 1

The error is pretty self-explanatory: the number_column column accepts numbers and not text values: we have fed it something else, and MySQL did not like it.

Importing Large Amounts of Data

Inserting data into your MySQL instances gets you pretty far. Still, at some point in your career as a MySQL DBA, you might need to look into INSERT statements and import rather large amounts of data (data comprised of 100,000 rows or more) into your database instances. A couple of big data-based projects could make use of this feature (for example, did you know that one of the most significant data breach search engines in the world was built on MySQL?). Still, to fully understand it, we must realize our options.

Using LOAD DATA INFILE

To import large amounts of data into MySQL, we can either use the technique for inserting multiple rows (outlined above) or use LOAD DATA INFILE instead. As the name suggests, LOAD DATA INFILE loads the data into a table from a file.  The basic syntax of the LOAD DATA INFILE query looks like so:

LOAD DATA INFILE '/path/to/file.csv' INTO TABLE demo_table;

Where /path/to/file.csv depicts a path to the file we want to load data from and demo_table represents the table's name.

LOAD DATA INFILE also has other options that can be specified. For example, we can specify columns that we want to load data into at the end of the statement as well:

LOAD DATA INFILE '/path/to/file.csv' INTO TABLE demo_table (demo_column);

What is IGNORE?

A statement like the one above would only insert data into one column called demo_column. We can also specify an IGNORE keyword if we so desire:

LOAD DATA INFILE '/path/to/file.csv' IGNORE INTO TABLE demo_table;

This statement will IGNORE any errors it encounters and loads the data regardless. The IGNORE statement can also be used in combination with specific columns:

LOAD DATA INFILE '/path/to/file.csv' IGNORE INTO TABLE demo_table (demo_column);

This statement is very powerful for one simple reason: if our given dataset is vast and we don't even need half of the information in it (say, we only need the values of the first column, but not the second, third, etc.), this statement would conveniently "forget" to load data into those columns and ignore any errors that MySQL might encounter on the way. Talk about saving time.

Also, keep in mind that LOAD DATA INFILE can also be LOCAL meaning that you can quickly execute a query like LOAD LOCAL DATA INFILE as well: that would work too. The LOCAL statement changes the location where MySQL can expect the data: if the LOCAL statement is in use, MySQL would expect the data to be located somewhere on the client host; otherwise, it should reside on the server host.

Why is LOAD DATA INFILE So Fast?

When working with MySQL and comparing the import time between INSERT INTO and LOAD DATA INFILE statements in MySQL, you are likely to notice that the LOAD DATA INFILE statement loads data into a database way, way faster than INSERT does. The reason why is pretty simple: LOAD DATA INFILE bothers MySQL less than INSERT statements do: for example, the LOAD DATA INFILE statement can skip lines, columns, or, if we wish, load data only into specific columns, skipping all others (refer to the example above.) The INSERT statement offers no such features; although it can easily insert a couple of rows at once, it's ill-equipped to handle data sets more extensive than, say, a million rows. That's not to say that LOAD DATA INFILE is without its flaws either, though; for example, run LOAD DATA INFILE with 100 million or more rows, and you will notice that it will probably slow down after a while: to avoid that, you might want to break up the files you use to load data into the database and instead of loading data with, say, 100 or 200 million rows at once, load it in smaller chunks: have, for example, ten files with 10 million rows each, get your LOAD DATA INFILE statements from a loop and run them one by one. You're done!

Modifying Data with Arctype

If you frequently find yourself INSERTing data into MySQL, chances are you are using an editor. A SQL editor like Arctype will let you create and edit tables with ease. It will allow you to share queries with your team as well – all with blazing fast performance. If you need a SQL editor, definitely give Arctype a try. For example, here's how the process of editing tables might look like when Arctype is in use:

Editing Tables with Arctype

And that's not even it! If you so desire, you can edit constraints or foreign keys as well: all without worrying about running queries that modify them for you!

Modifying Constraints with Arctype

If you find yourself a frequent reader of the Arctype blog, you have probably noticed that this blog covers an extensive range of topics ranging from transactions in PostgreSQL to making big data with MySQL work. If you enjoy the topics, make sure to stick around for a while: we have many other things coming up!

JOIN the Arctype Newsletter
Programming stories, tutorials, and database tips every 2 weeks

Spread the word

Keep reading