Berkeley DB Java Edition Cleaner and Checkpointer Notes
My colleague Mark Hayes posted an excellent writeup of the JE Cleaner and how it relates to the checkpointer. It is cut-and-pasted below.
The JE cleaner daemon thread(s) are enabled by default. Normally this should not be changed. Possible reasons for disabling the JE cleaner threads are:
1) You may wish to disable the JE cleaner threads during heavy application usage periods and only run the log cleaner when application usage is light (e.g., at 2 am). This can increase throughput during heavy usage periods. However, caution is strongly advised. If the write rate is high during the heavy usage period, filling the disk is a possibility and must be avoided. You must also ensure that there is enough time during light usage periods for the log cleaner to catch up with the backlog created during the heavy usage periods. In addition, random reads may be negatively impacted during the heavy usage periods if the JE log grows very large, because there may be fewer hits in the file system cache.
2) You may wish to disable the JE cleaner and checkpointer threads when performing a “bulk load”. A bulk load is a large set of writes, usually inserts but sometimes also updates and deletions, that is performed in a batch mode while all other application functions are disabled. It is used to initialize a large data set. The objective is to complete the load as quickly as possible and to use as little disk space as possible. Note that deferred write mode (see DatabaseConfig.setDeferredWrite) is often used for a bulk load to minimize writing.
Checkpointing can be disabled to avoid wasting disk space with multiple, redundant checkpoints during the load. Instead a single checkpoint is performed after the load is complete. This is acceptable because recovery time does not need to be bounded by checkpoints: if a crash occurs during the load, the load can be restarted from scratch. Log cleaning can also be disabled to speed up the load. If only insertions are performed, then log cleaning will not be needed anyway. But even if updates and deletions are performed, log cleaning is not productive while the checkpointer is disabled, since log files will not be deleted. Log cleaning may be performed efficiently by calling Environment.cleanLog at the end of the load, followed by a checkpoint.
3) You may wish to implement your own log cleaning threads for administrative reasons. Perhaps you have a special thread pool you wish to use, or you’re sharing a thread pool with other components. In this case, your threads take on the same role as the JE daemon threads. Your threads should call Environment.cleanLog periodically. The number of threads calling cleanLog should be increased when the EnvironmentStats cleanerBacklog value grows. A checkpoint is not normally necessary, since checkpoints should occur independently on their own schedule. But if you also disable the JE checkpointer thread, then you should call Environment.checkpoint periodically from your own thread.
4) Using a NAS (e.g., NFS) for JE storage can be problematic for several reasons. For one, the EnvironmentConfig.LOG_USE_ODSYNC parameter must be set. In addition, if the NAS does not support the file locking needed by JE, then running multiple processes is problematic. JE cannot use file locking, and therefore cannot coordinate multiple processes accessing the same environment. It is then up to the application to ensure that only one process is writing to the environment, and that log cleaning is disabled when any read-only processes are open. The log cleaner threads may need to be disabled by the application in such situations.
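The settings mentioned in (2) and (4) can be written down as JE property strings. The following is only a minimal sketch using java.util.Properties, so it runs without JE on the classpath. The class and method names (JeThreadSettings, bulkLoad, nasStorage) are hypothetical, and the property names are assumed to be the string forms of EnvironmentConfig.ENV_RUN_CLEANER, ENV_RUN_CHECKPOINTER, and LOG_USE_ODSYNC; check them against the javadoc for your JE version.

```java
import java.util.Properties;

// Hypothetical helper: builds JE settings as plain property strings.
// Only java.util is used, so the sketch runs without JE on the classpath.
class JeThreadSettings {

    // Scenario (2): bulk load with both daemon threads turned off.
    static Properties bulkLoad() {
        Properties p = new Properties();
        p.setProperty("je.env.runCleaner", "false");      // assumed name of ENV_RUN_CLEANER
        p.setProperty("je.env.runCheckpointer", "false"); // assumed name of ENV_RUN_CHECKPOINTER
        return p;
    }

    // Scenario (4): NAS storage. The flag is supplied by the application,
    // which knows whether read-only processes have the environment open
    // while file locking is unavailable.
    static Properties nasStorage(boolean readOnlyProcessesOpen) {
        Properties p = new Properties();
        p.setProperty("je.log.useODSYNC", "true"); // assumed name of LOG_USE_ODSYNC
        if (readOnlyProcessesOpen) {
            p.setProperty("je.env.runCleaner", "false");
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(bulkLoad());
        System.out.println(nasStorage(true));
    }
}
```

With JE available, a Properties object like this could be passed to the EnvironmentConfig(Properties) constructor, or the same keys could be placed in the je.properties file in the environment home directory.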
Below are some example use cases where calling Environment.cleanLog is needed.
A) If you implement your own log cleaning threads (3) then you should call cleanLog periodically. The JE daemon threads effectively call cleanLog after each N bytes of log is written, where by default N is 0.25 times the maximum size of a log file, and may be configured using EnvironmentConfig.CLEANER_BYTES_INTERVAL. For simplicity your log cleaning threads may call cleanLog based on a configured time interval. As mentioned above (3), a checkpoint is not normally necessary after calling cleanLog.
B) The JE cleaner threads are triggered by write activity. You may wish to call cleanLog in order to force cleaning to occur when no other write activity is occurring. For example, you may wish to do this at the end of a bulk load (2), or as a utility function. After calling cleanLog, a checkpoint should be performed to cause cleaned log files to be deleted.
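As a rough illustration of case (A), the sketch below runs cleaning passes from an application-owned thread pool on a fixed time interval. It is not JE code: LogCleaning is a hypothetical stand-in for the single Environment.cleanLog call, and PeriodicCleaner, cleaningTask, and start are made-up names. With JE on the classpath, something like env::cleanLog could be passed where a LogCleaning is expected.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for Environment.cleanLog(), which returns the
// number of log files cleaned by the call.
interface LogCleaning {
    int cleanLog();
}

class PeriodicCleaner {

    // Wraps one cleaning pass in a Runnable. Failures are reported rather
    // than rethrown, because an exception escaping a scheduled task
    // suppresses all of its later runs.
    static Runnable cleaningTask(LogCleaning env) {
        return () -> {
            try {
                int cleaned = env.cleanLog();
                if (cleaned > 0) {
                    System.out.println("cleaned " + cleaned + " log file(s)");
                }
            } catch (RuntimeException e) {
                System.err.println("cleanLog failed: " + e);
            }
        };
    }

    // Starts 'threads' cleaning tasks, each waiting 'intervalSeconds'
    // between the end of one pass and the start of the next.
    static ScheduledExecutorService start(LogCleaning env, int threads, long intervalSeconds) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.scheduleWithFixedDelay(cleaningTask(env),
                    intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        }
        return pool;
    }
}
```

The thread count is a parameter so that the application can restart the pool with more threads when the cleanerBacklog value in EnvironmentStats grows. The caller is responsible for shutting the returned pool down before closing the environment.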
For completeness, I’d like to say a little more about checkpoints and log cleaning. As mentioned above, a checkpoint is necessary after the log cleaner has “processed” a log file, and before the file can be deleted. The log cleaner (the JE cleaner threads and the cleanLog method) processes a log file by migrating all active data from that file to the end of the log. The checkpoint is necessary before deleting the file, to ensure that no references to that log file remain active.
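The ordering described here (process files first, then checkpoint so they can be deleted) can be sketched as a small utility routine. CleanableEnv is a hypothetical stand-in for the two Environment calls involved, and CleanThenCheckpoint is a made-up name. Against the real API, the assumption is that the loop would call Environment.cleanLog() and the final step would call Environment.checkpoint with a CheckpointConfig on which setForce(true) has been called.

```java
// Hypothetical stand-in for the two Environment calls this sketch needs.
interface CleanableEnv {
    int cleanLog();                 // mirrors Environment.cleanLog()
    void checkpoint(boolean force); // mirrors Environment.checkpoint(CheckpointConfig)
}

class CleanThenCheckpoint {

    // Cleans until a pass reports no more files, then forces one
    // checkpoint so that the processed files become eligible for deletion.
    // Returns the total number of log files cleaned.
    static int run(CleanableEnv env) {
        int total = 0;
        int cleaned;
        while ((cleaned = env.cleanLog()) > 0) {
            total += cleaned;
        }
        // The checkpoint is forced so that it runs even if little has
        // been written since the previous checkpoint.
        env.checkpoint(true);
        return total;
    }
}
```

This is the shape of the call sequence one would use at the end of a bulk load (2) or in a standalone utility, where no other write activity is present to trigger the cleaner threads.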
In addition, the checkpoint does a lot of the work, the heavy lifting, of log cleaning. When a log file is processed, the active data is placed in memory. But it is left to the checkpoint to write the active data to the end of the log. This has several advantages:
1) It offloads some of the work from the log cleaner threads, so they can make better progress and keep up with the application threads.
2) It reduces the total amount of writing by deferring it for as long as possible. Multiple updates to the active data are consolidated when writing is deferred until the next checkpoint.
3) Data is clustered naturally when writing is deferred. Data is written by the checkpointer in groups of records, where the records in a group have key values in close proximity to each other. For applications having locality of reference by key value, but where the records are initially written in a different order in the log, read performance may be improved.
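Advantages (2) and (3) can be illustrated with a toy model. This is emphatically not how JE is implemented internally; it is a minimal sketch in which a sorted map plays the role of the in-memory dirty data and a list plays the role of the log. All names (DeferredWriteModel, update, checkpoint) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only, not JE internals: updates are held in memory, keyed and
// sorted, and are appended to the "log" only when checkpoint() is called.
class DeferredWriteModel {
    private final TreeMap<String, String> dirty = new TreeMap<>();
    private final List<String> log = new ArrayList<>();

    // A later update to the same key replaces the earlier one in memory,
    // so only the latest value is ever written.
    void update(String key, String value) {
        dirty.put(key, value);
    }

    // Writes each pending entry once, in key order, and returns how many
    // entries were written.
    int checkpoint() {
        int written = dirty.size();
        dirty.forEach((k, v) -> log.add(k + "=" + v));
        dirty.clear();
        return written;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        DeferredWriteModel m = new DeferredWriteModel();
        m.update("k3", "a");
        m.update("k1", "b");
        m.update("k3", "c"); // consolidated with the first k3 update
        m.update("k2", "d");
        System.out.println(m.checkpoint()); // prints 3
        System.out.println(m.log());        // prints [k1=b, k2=d, k3=c]
    }
}
```

Four updates result in three writes, and the writes come out in key order even though the updates arrived in a different order.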
In some applications, however, this approach can cause very long checkpoints, with negative repercussions. In particular, this can occur when the JE cache is very large (e.g., multiple GB) and the write rate is high. Because of the large cache, write activity and related log cleaner activity can queue up a large amount of work that must be done during each checkpoint. If the checkpoint takes too long (if it spans many log files) then the recovery interval may be very long also, and recovery after a crash may take a very long time. Long checkpoints also prevent cleaned log files from being deleted promptly.
For such applications, the EnvironmentConfig.CHECKPOINTER_HIGH_PRIORITY configuration parameter should be set to true. This causes two changes in behavior:
a) The log cleaner threads (and the cleanLog method) will write active data to the end of the log, rather than leaving this work to be done by the checkpointer.
b) The checkpointer will log multiple Btree nodes at a time, reducing contention with other threads.
Both of these changes cause the checkpoint to complete in much less time. This can have a significant positive impact on overall performance. If your application has long checkpoints (as usual, watch the EnvironmentStats), you should consider this option.
If you use this option, it is very likely that you’ll also need to increase the number of log cleaner threads. The checkpointer will be doing less work, but the log cleaner thread(s) will be doing more work. Therefore more log cleaner threads will probably be needed to prevent the cleaner backlog from growing.
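A minimal sketch of this combination, again expressed as property strings so that it runs without JE on the classpath. The class and method names are hypothetical. The property names are assumed to be the string forms of EnvironmentConfig.CHECKPOINTER_HIGH_PRIORITY and EnvironmentConfig.CLEANER_THREADS; the latter is not named in the text above and is brought in here as the assumed way to raise the number of cleaner threads. The right thread count depends on the application and has to be found by watching the cleaner backlog.

```java
import java.util.Properties;

// Hypothetical helper: settings for an application with long checkpoints.
class LongCheckpointTuning {

    static Properties settings(int cleanerThreads) {
        if (cleanerThreads < 1) {
            throw new IllegalArgumentException("cleanerThreads must be at least 1");
        }
        Properties p = new Properties();
        // Assumed string name of EnvironmentConfig.CHECKPOINTER_HIGH_PRIORITY.
        p.setProperty("je.checkpointer.highPriority", "true");
        // Assumed string name of EnvironmentConfig.CLEANER_THREADS.
        p.setProperty("je.cleaner.threads", Integer.toString(cleanerThreads));
        return p;
    }

    public static void main(String[] args) {
        System.out.println(settings(2));
    }
}
```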