DynamoDB - How to do incremental backup?
I am using DynamoDB tables with keys and throughput optimized for my application's use cases. To support other ad hoc administrative and reporting use cases, I want to keep a complete backup in S3 (a day-old backup is OK). However, I cannot afford to scan the entire DynamoDB tables to do the backup, and the keys are not sufficient to find out what is "new". How do I do incremental backups? Do I have to modify my DynamoDB schema, or add extra tables, to do this? Any best practices?
Update: DynamoDB Streams solves this problem.
From the documentation: "DynamoDB Streams captures a time-ordered sequence of item-level modifications in any DynamoDB table, and stores this information in a log for up to 24 hours. Applications can access this log and view the data items as they appeared before and after they were modified, in near real time."
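As a minimal sketch of using Streams for this, here is how a table's stream can be read with boto3. The table name is a placeholder, and the table is assumed to have Streams enabled with the NEW_AND_OLD_IMAGES view type so both before and after images are available:

    # Sketch: read item-level changes from a DynamoDB stream.
    # "my-table" is a placeholder; Streams must be enabled on it.
    import boto3

    streams = boto3.client("dynamodbstreams")

    # Find the stream ARN for the table.
    stream_arn = streams.list_streams(TableName="my-table")["Streams"][0]["StreamArn"]

    # Walk every shard and read its records.
    description = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]
    for shard in description["Shards"]:
        iterator = streams.get_shard_iterator(
            StreamArn=stream_arn,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",  # start at the oldest record kept (24 h)
        )["ShardIterator"]
        while iterator:
            page = streams.get_records(ShardIterator=iterator)
            for record in page["Records"]:
                # record["dynamodb"] holds Keys plus OldImage/NewImage; append
                # these to your backup change log in S3, then replay them onto
                # the last snapshot.
                print(record["eventName"], record["dynamodb"]["Keys"])
            if not page["Records"]:
                break  # simplification: open shards return iterators forever
            iterator = page.get("NextShardIterator")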
I see two options:
1. Generate the current snapshot. You'll have to read the table to do this, which you can do at a slow rate to stay under your capacity limits (Scan operation). Then, keep an in-memory list of the updates performed over some period of time. You could put these in another table, but then you'd have to read those too, which would cost just as much. The time interval could be a minute, 10 minutes, an hour, whatever you're comfortable losing if your application exits. Then, periodically grab the snapshot from S3, replay these changes on the snapshot, and upload the new snapshot (see the first sketch after this list). I don't know how large your data set is, so this may not be practical, but I've seen it done with great success for data sets of 1-2GB.
2. Add read throughput and back up the data using a full scan every day. You say you can't afford it, but it isn't clear whether you mean paying for the extra capacity, or that the scan would use up your capacity and your application would begin failing. The only way to pull data out of DynamoDB is to read it, either eventually or strongly consistent. If the backup is part of your business requirements, I think you have to determine whether it's worth it. You can self-throttle your reads by examining the ConsumedCapacityUnits property on the results. The Scan operation has a Limit property you can use to limit the amount of data read in each operation. Scan uses eventually consistent reads, which are half the price of strongly consistent reads. (See the second sketch after this list.)
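A minimal sketch of option 1's replay step, under the assumption that the snapshot is stored as a single JSON object map in S3 and that the application has buffered its changes as (event, key, item) tuples; the bucket and key names are placeholders:

    # Sketch: replay buffered changes onto the last snapshot in S3.
    # Assumes the snapshot fits in memory (reasonable for the 1-2GB
    # data sets mentioned above).
    import json
    import boto3

    s3 = boto3.client("s3")
    BUCKET, SNAPSHOT_KEY = "my-backup-bucket", "snapshots/my-table.json"

    def replay(changes):
        """Apply buffered changes to the last snapshot and upload the result."""
        body = s3.get_object(Bucket=BUCKET, Key=SNAPSHOT_KEY)["Body"].read()
        snapshot = json.loads(body)          # {primary_key: item}
        for event, key, item in changes:
            if event == "REMOVE":
                snapshot.pop(key, None)      # deletion
            else:
                snapshot[key] = item         # insert or modify
        s3.put_object(Bucket=BUCKET, Key=SNAPSHOT_KEY,
                      Body=json.dumps(snapshot).encode())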
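And a minimal sketch of option 2, a paginated, self-throttled Scan. The table name, page size, and capacity budget are assumptions; note that the current boto3 API exposes consumed capacity via the ReturnConsumedCapacity parameter and the ConsumedCapacity field of the response, rather than a ConsumedCapacityUnits property:

    # Sketch: scan the whole table in pages, sleeping between pages to
    # keep average consumption under a read-capacity budget.
    import time
    import boto3

    dynamodb = boto3.client("dynamodb")

    def backup_scan(table="my-table", page_size=100, max_rcu_per_sec=10):
        items, kwargs = [], {"TableName": table, "Limit": page_size,
                             "ReturnConsumedCapacity": "TOTAL"}
        while True:
            page = dynamodb.scan(**kwargs)
            items.extend(page["Items"])  # in practice, stream these to S3 instead
            # Sleep long enough that consumption averages under the budget.
            time.sleep(page["ConsumedCapacity"]["CapacityUnits"] / max_rcu_per_sec)
            if "LastEvaluatedKey" not in page:
                return items
            kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]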