Handling Exception when bulkWrite with ordered as false in MongoDB Java
I am using the MongoDB Java driver:
collection.bulkWrite(documents);
I have 1,000,000 records to insert. If the insertion of one of the records fails, the remaining records after the first failure won't be inserted. To avoid this, I found that there is a BulkWriteOptions setting, ordered(false):
collection.bulkWrite(documents, new BulkWriteOptions().ordered(false));
If an exception occurs during the above operation, can I get the list of records for which the bulk write failed, so that I can try inserting those records again?
All bulk write operations that fail are reported in arrays in the returned document, along with the full document that failed to insert. For example, this code via the Java driver:
MongoCollection<Document> collection = database.getCollection("bulk");
List<WriteModel<Document>> writes = new ArrayList<WriteModel<Document>>();
writes.add(new InsertOneModel<Document>(new Document("_id", 1)));
writes.add(new InsertOneModel<Document>(new Document("_id", 2)));
writes.add(new InsertOneModel<Document>(new Document("_id", 2)));
writes.add(new InsertOneModel<Document>(new Document("_id", 4)));
BulkWriteResult bulkWriteResult = collection.bulkWrite(writes);
System.out.println(bulkWriteResult.toString());
Returns this:
Exception in thread "main" com.mongodb.MongoBulkWriteException: Bulk write operation error on server localhost:27017. Write errors: [BulkWriteError{index=2, code=11000, message='E11000 duplicate key error collection: test.bulk index: _id_ dup key: { : 2 }', details={ }}].
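In the Java driver that exception is a MongoBulkWriteException, and with ordered(false) it carries one BulkWriteError per failed operation, each holding the index of the failing model in the original writes list. A minimal sketch of collecting the failed models for a retry (assuming writes is the same list passed to bulkWrite) might look like this:

import com.mongodb.MongoBulkWriteException;
import com.mongodb.bulk.BulkWriteError;

try {
    collection.bulkWrite(writes, new BulkWriteOptions().ordered(false));
} catch (MongoBulkWriteException e) {
    // Collect the write models that failed so they can be fixed and retried.
    List<WriteModel<Document>> failed = new ArrayList<WriteModel<Document>>();
    for (BulkWriteError error : e.getWriteErrors()) {
        failed.add(writes.get(error.getIndex()));
    }
    // The partial result still reports what did succeed.
    System.out.println("Inserted: " + e.getWriteResult().getInsertedCount()
            + ", failed: " + failed.size());
    // 'failed' can now be corrected and passed to another bulkWrite call.
}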
You didn't mention which release of MongoDB you are using, but here is the version 3.2 manual entry on unordered bulk writes if you want to read further: https://docs.mongodb.org/manual/core/bulk-write-operations/
However, if by "exception" you meant that the client or mongod process crashed in the middle of the load operation, or there was a hardware or network incident, then there won't be a clean exit from bulkWrite and no BulkWriteResult output as above. Unordered bulk writes execute operations in parallel, not in the order they were entered, and things are even more complicated with sharded collections and distributed clusters, so it is impossible to know which operations completed before the crash. The only solution is to repeat the whole job, which requires removing the documents that were inserted the first time.
If you are loading into a new or empty collection, you can simply drop and recreate it. If there is a unique key index, you can repeat the load: documents that were inserted OK the first time will just be rejected as duplicates. Otherwise you have to run a cleanup job to remove the documents that could/should have been inserted before starting the reload attempt, and if they aren't easily identified that can be problematic.
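With a unique key in place, a repeat run can treat duplicate key errors (code 11000) as "already inserted on the first attempt" and only worry about anything else; a minimal sketch of that check:

try {
    collection.bulkWrite(writes, new BulkWriteOptions().ordered(false));
} catch (MongoBulkWriteException e) {
    for (BulkWriteError error : e.getWriteErrors()) {
        // 11000 = duplicate key: the document made it in on the first run.
        if (error.getCode() != 11000) {
            System.out.println("Genuine failure at index " + error.getIndex()
                    + ": " + error.getMessage());
        }
    }
}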
The best approach for this scenario is to break large load operations into smaller pieces, e.g. four load jobs each with one quarter of the data (or twenty with 5% each, and so on). It takes more effort to design the load process this way, but it is much faster to repeat one job with 5% of the total data than to repeat a single load that failed at the 95% mark.
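As a rough sketch (assuming the write models are already held in a writes list in memory), the load can be split into fixed-size chunks, with each chunk submitted as its own bulkWrite, so a failure only forces a repeat of that chunk; the chunk size here is just an illustrative value:

int chunkSize = 250000;
for (int start = 0; start < writes.size(); start += chunkSize) {
    int end = Math.min(start + chunkSize, writes.size());
    // subList gives a view of one chunk of the full writes list.
    List<WriteModel<Document>> chunk = writes.subList(start, end);
    collection.bulkWrite(chunk, new BulkWriteOptions().ordered(false));
}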
If you add a loadNum field with a different value for each of the jobs, then a count on {"loadNum": n} can be used to confirm all of a job's documents were loaded, and a remove on {"loadNum": n} will clear out the documents from a job that was only partially successful.
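A sketch of that idea with the Java driver (loadNum, the value 3, and docsForThisJob are illustrative names, not part of the original question):

import com.mongodb.client.model.Filters;

int loadNum = 3; // hypothetical number of this load job

// Tag each document with the job number as it is added to the batch.
List<WriteModel<Document>> writes = new ArrayList<WriteModel<Document>>();
for (Document doc : docsForThisJob) {
    writes.add(new InsertOneModel<Document>(doc.append("loadNum", loadNum)));
}
collection.bulkWrite(writes, new BulkWriteOptions().ordered(false));

// Confirm how many documents from this job made it into the collection.
long loaded = collection.count(Filters.eq("loadNum", loadNum)); // countDocuments() in newer drivers

// If the job was only partially successful, remove its documents and rerun it.
collection.deleteMany(Filters.eq("loadNum", loadNum));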