
I have a function which converts my large XML file to a byte array using FileInputStream. It runs fine within my IDE, but when run independently via the executable jar it throws Exception in thread "main" java.lang.OutOfMemoryError: Java heap space. I'm reading this large file into a byte array to store it as a Blob in the target DB. I don't have control over how the Blob is stored; I just have access to the stored procedure that inserts it. Is there a way to read and write chunks of data without loading the entire file into memory?

The function which converts the file to a byte array:

private byte[] getBytesFromFile(Path path) throws IOException {
    // Allocating a buffer for the entire file is what exhausts the heap
    byte[] bytes = new byte[(int) path.toFile().length()];
    try (FileInputStream fis = new FileInputStream(path.toFile())) {
        int read;
        int offset = 0;
        while (offset < bytes.length && (read = fis.read(bytes, offset, bytes.length - offset)) >= 0) {
            offset += read;
        }
    }
    return bytes;
}

And here's the code which stores the byte array in the DB using the stored procedure call:

private void storeFileToDb(Connection connection, int fileId, String fileName, String fileType, byte[] fileBytes) throws SQLException {
    String storedProcedure = "{call SP(?,?,?,?,?) }";
    CallableStatement callableStatement = connection.prepareCall(storedProcedure);
    callableStatement.setInt(1, fileId);
    callableStatement.setString(2, fileName);
    callableStatement.setString(3, fileType);
    Blob fileBlob = connection.createBlob();
    fileBlob.setBytes(1, fileBytes);
    callableStatement.setBlob(4, fileBlob);
    callableStatement.registerOutParameter(5, OracleTypes.NUMBER);
    callableStatement.execute();
    fileBlob.free(); // not entirely sure how this helps
    //callableStatement.close();
}
  • How big can those files be? Do you really need to do this in chunks? If you know they can't get too big, you could just increase the Java heap space (see the example after these comments). As you've said it works in the IDE, I assume they aren't that large. Commented Jul 12, 2019 at 14:14
  • The largest we have encountered is a 195 MB raw XML file. In the future we might expect files over 200 MB. And I have no clue what the server config will be, so I'm trying to optimize wherever I can. Commented Jul 12, 2019 at 14:53
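
For reference, the heap ceiling can be raised when launching the jar; the 512 MB value and the jar name below are placeholders:

java -Xmx512m -jar yourapp.jar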

2 Answers


Use either CallableStatement.setBlob(int, InputStream) or Blob.setBinaryStream(long). Both methods let you work with InputStream or OutputStream objects instead of creating a byte[] for the whole file in memory. An example is shown in the Adding Large Object Type Object to Database docs.

This should work as long as the JDBC driver is smart enough not to create a byte[] for the entire blob somewhere internally.
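
As a rough sketch (the stored procedure call is taken from the question; the streaming variant itself is illustrative), storeFileToDb could accept the Path and hand the driver an InputStream:

private void storeFileToDb(Connection connection, int fileId, String fileName,
        String fileType, Path path) throws SQLException, IOException {
    String storedProcedure = "{call SP(?,?,?,?,?) }";
    try (CallableStatement callableStatement = connection.prepareCall(storedProcedure);
         InputStream fileIn = Files.newInputStream(path)) {
        callableStatement.setInt(1, fileId);
        callableStatement.setString(2, fileName);
        callableStatement.setString(3, fileType);
        // The driver reads the stream in chunks; no file-sized byte[] is allocated
        callableStatement.setBlob(4, fileIn);
        callableStatement.registerOutParameter(5, OracleTypes.NUMBER);
        callableStatement.execute();
    }
}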


5 Comments

I tried using callableStatement.setBlob(int, new ByteArrayInputStream(bytes)), but the jar was just stuck for half an hour. Not sure why. How would I go about using Blob.setBinaryStream(long)? (A sketch follows after these comments.)
Don't create a byte[] or ByteArrayInputStream; use a FileInputStream or something else that doesn't allocate everything in memory.
Thanks Karol Dowbecki, I changed the input stream to a FileInputStream and it worked. It does take a lot of time though, especially when storing the data to the database.
You can try compressing with GZip or another encoding, like Joop suggested. You could also profile the INSERT with TKPROF and see why it takes so long.
Unfortunately the design of the database and how the file is inserted are out of my control. If we do decide to change the way we store XML files in the DB, then I'll definitely consider that. Thanks again!
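
Regarding the Blob.setBinaryStream(long) question in the first comment: the idea is to write into the blob through its OutputStream in small chunks. A minimal sketch, assuming the path and callableStatement variables from the question's code:

Blob fileBlob = connection.createBlob();
try (OutputStream blobOut = fileBlob.setBinaryStream(1); // positions are 1-based
     InputStream fileIn = Files.newInputStream(path)) {
    byte[] buffer = new byte[8192];
    int read;
    while ((read = fileIn.read(buffer)) >= 0) {
        blobOut.write(buffer, 0, read); // copy 8 KB at a time
    }
}
callableStatement.setBlob(4, fileBlob);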

It might be that the server was configured too restrictively. Now is a good time to check the memory parameters.

Blobs can be filled by just providing an InputStream.

It is also a good idea to compress XML data. Try it out: compress some test.xml to test.xml.gz to see the size gain.
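
The compression can also be done in plain Java; a hypothetical helper (the source + ".gz" target path is an assumption):

private Path compressToGzip(Path source) throws IOException {
    Path target = Paths.get(source.toString() + ".gz");
    try (InputStream in = Files.newInputStream(source);
         OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) >= 0) {
            out.write(buffer, 0, read); // stream-compress without loading the whole file
        }
    }
    return target;
}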

Note that this already exists in standard Java:

private byte[] getBytesFromFile(Path path) throws IOException {
    return Files.readAllBytes(path);
}

So:

private void storeFileToDb(Connection connection, int fileId, String fileName,
        String fileType) throws SQLException, IOException {
    Path path = Paths.get(fileName); // Or parameter
    // Stream the pre-compressed .xml.gz file as-is; wrapping it in a
    // GZIPInputStream here would decompress it again before storage.
    try (CallableStatement callableStatement = connection.prepareCall(storedProcedure);
         InputStream fileIn = new BufferedInputStream(Files.newInputStream(path))) {
        ...
        callableStatement.setBlob(4, fileIn);
        ...
    }
}

The try-with-resources ensures closing in case of a thrown exception or an early return. It is also useful for the statement.

You did not close the statement, which still holds a Blob. That is not advisable, as the data may hang around for a while. A CallableStatement is a PreparedStatement too, whose use-case is repeatedly executing the SQL, possibly with other parameter values.

And for decompressing the data when you read it back, use GZIPInputStream.
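
A sketch of the read-back side; the query, column name, and output file are made up for illustration:

try (ResultSet rs = statement.executeQuery("SELECT file_blob FROM files WHERE file_id = 1")) {
    if (rs.next()) {
        try (InputStream in = new GZIPInputStream(rs.getBlob(1).getBinaryStream());
             OutputStream out = Files.newOutputStream(Paths.get("restored.xml"))) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) >= 0) {
                out.write(buffer, 0, read); // inflate back to the original XML
            }
        }
    }
}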

1 Comment

The fileName parameter is just the name; the file exists only temporarily in the file system under some random name. But I understand that I need to close the statement if I want to free up resources. I didn't realize it could make a difference. Thanks for pointing it out! I'll try this out and see if it works.
