Misc Tools

The misc package provides assorted tools, including utilities for splitting and merging indices, changing norms, and finding high-frequency terms.

NativeUnixDirectory

NOTE: This uses C++ sources (accessible via JNI), which you'll have to compile on your platform.

{@link org.apache.lucene.store.NativeUnixDirectory} is a Directory implementation that uses direct IO to bypass the OS's buffer cache for any IndexInput and IndexOutput used while merging segments larger than a specified size (default 10 MB). This avoids evicting hot pages that are still in use for searching, which keeps search more responsive while large merges run.
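The routing rule described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the actual NativeUnixDirectory source: the class DirectIoRouting, the enum IoPath, and the method choose are hypothetical names, and treating a merge of exactly the threshold size as buffered is an assumption. Only the 10 MB default comes from the text.

```java
// Hypothetical sketch of the size-based routing described above.
// None of these names are part of the Lucene API.
class DirectIoRouting {

    /** Default threshold from the text above: 10 MB. */
    static final long DEFAULT_MIN_BYTES_DIRECT = 10L * 1024 * 1024;

    /** The two IO paths a file could be opened with. */
    enum IoPath { DIRECT, BUFFERED }

    /**
     * Picks the IO path for a file being opened.
     *
     * @param isMerge        true if the file is read or written as part of a segment merge
     * @param mergeBytes     estimated size of the merge in bytes
     * @param minBytesDirect threshold above which direct IO is used
     */
    static IoPath choose(boolean isMerge, long mergeBytes, long minBytesDirect) {
        // Only large merges bypass the buffer cache; searches, flushes and
        // small merges keep using ordinary buffered IO.
        // Assumption: a merge of exactly minBytesDirect stays buffered.
        if (isMerge && mergeBytes > minBytesDirect) {
            return IoPath.DIRECT;
        }
        return IoPath.BUFFERED;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // A 64 MB merge exceeds the 10 MB default, so it goes direct.
        System.out.println(choose(true, 64 * mb, DEFAULT_MIN_BYTES_DIRECT));
        // A 2 MB merge is below the threshold and stays buffered.
        System.out.println(choose(true, 2 * mb, DEFAULT_MIN_BYTES_DIRECT));
        // Non-merge IO (for example, searching) always stays buffered.
        System.out.println(choose(false, 64 * mb, DEFAULT_MIN_BYTES_DIRECT));
    }
}
```

Keeping the decision per file, rather than per directory, is what lets search traffic continue to benefit from the buffer cache while only the large merge streams go around it.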

See this blog post for details. Steps to build:

NativePosixUtil.cpp/java also expose the posix_madvise, madvise, and posix_fadvise functions, which are somewhat more cross-platform than O_DIRECT. However, in testing (see the link above), these APIs did not seem to help prevent buffer cache eviction.