PgBulkInsert is a Java library for Bulk Inserts to PostgreSQL using the Binary COPY Protocol.
It provides a wrapper around the PostgreSQL COPY command:
The COPY command is a PostgreSQL specific feature, which allows efficient bulk import or export of data to and from a table. This is a much faster way of getting data in and out of a table than using INSERT and SELECT.
This project wouldn't be possible without the great Npgsql library, which has a beautiful implementation of the Postgres protocol.
PgBulkInsert is available in the Central Maven Repository.
You can add the following dependencies to your pom.xml to include PgBulkInsert in your project.
<dependency>
<groupId>de.bytefish</groupId>
<artifactId>pgbulkinsert</artifactId>
<version>3.3</version>
</dependency>
- Numeric Types
- smallint
- integer
- bigint
- real
- double precision
- numeric
- Date/Time Types
- timestamp
- date
- Character Types
- text
- Character Types
- text
- JSON Types
- jsonb
- Boolean Type
- boolean
- Binary Data Types
- bytea
- Network Address Types
- inet (IPv4, IPv6)
- macaddr
- UUID Type
- uuid
- Array Type
- One-Dimensional Arrays
- hstore
- hstore
- Geometric Types
- point
- line
- lseg
- box
- path
- polygon
- circle
PgBulkInsert is released with under terms of the MIT License:
Imagine we want to bulk insert a large amount of persons into a PostgreSQL database. Each Person
has a first name, a last name and a birthdate.
The table in the PostgreSQL database might look like this:
CREATE TABLE sample.unit_test
(
first_name text,
last_name text,
birth_date date
);
The domain model in the application might look like this:
private class Person {
private String firstName;
private String lastName;
private LocalDate birthDate;
public Person() {}
public String getFirstName() {
return firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
public LocalDate getBirthDate() {
return birthDate;
}
public void setBirthDate(LocalDate birthDate) {
this.birthDate = birthDate;
}
}
Then you have to implement the AbstractMapping<Person>
, which defines the mapping between the table and the domain model.
public class PersonMapping extends AbstractMapping<Person>
{
public PersonMapping() {
super("sample", "unit_test");
mapText("first_name", Person::getFirstName);
mapText("last_name", Person::getLastName);
mapDate("birth_date", Person::getBirthDate);
}
}
This mapping is used to create the PgBulkInsert<Person>
:
PgBulkInsert<Person> bulkInsert = new PgBulkInsert<Person>(new PersonMapping());
And finally we can write a Unit Test to insert 100000
Persons into the database. You can find the entire Unit Test on GitHub as IntegrationTest.java.
@Test
public void bulkInsertPersonDataTest() throws SQLException {
// Create a large list of Persons:
List<Person> persons = getPersonList(100000);
// Create the BulkInserter:
PgBulkInsert<Person> bulkInsert = new PgBulkInsert<Person>(new PersonMapping());
// Now save all entities of a given stream:
bulkInsert.saveAll(PostgreSqlUtils.getPGConnection(connection), persons.stream());
// And assert all have been written to the database:
Assert.assertEquals(100000, getRowCount());
}
private List<Person> getPersonList(int numPersons) {
List<Person> persons = new ArrayList<>();
for (int pos = 0; pos < numPersons; pos++) {
Person p = new Person();
p.setFirstName("Philipp");
p.setLastName("Wagner");
p.setBirthDate(LocalDate.of(1986, 5, 12));
persons.add(p);
}
return persons;
}