Fast Notes view reading via C API Part 2: How to read views 99.7% faster than Notes.jar
Karsten Lehmann 25 April 2010 23:52:56
This is a follow-up article to the first one about using the C API to read Notes view data instead of the Notes Java API provided by the Notes.jar file.
Disclaimer
You will see some incredible numbers in this article. Rest assured that we double-checked them. This is no fake, although it looks like one. :-)
Testing our approach on a server
In the first article, we compared the execution times of the C code with the Notes.jar performance in a test case of reading 15,000 view entries from a local database. The result was 0.93 seconds for the C version and 2.8 seconds for the Java API. Pretty nice. That's a third of the execution time of Notes.jar, but we discussed that this may only make sense in cases where you really have a lot to read, e.g. several thousand view entries, because there is some initialization overhead for creating a second Notes session and opening the database and view.
So far, we had only run tests on the local machine, and we were curious how well our code would work on a server database.
Our test server is one of our development machines, which is hosted at an internet service provider and has a 100 Mbit/s connection to the internet. Our home office internet connection speed is 15224 kbit/s downstream and 1164 kbit/s upstream.
Reading all the view entries using Notes.jar gives us a good opportunity to go for some coffee. It takes 856735 ms, which is more than 14 minutes. There is no passthru server in between. The results were even slower when we were testing with a passthru server and a 1.5 Mbit/s server connection.
Our code looks like this:
Session session=NotesFactory.createSession();
Database db=session.getDatabase(serverName, dbPath);
View v=db.getView(viewName);
v.setAutoUpdate(false);
ViewEntryCollection entryCol=v.getAllEntries();
ViewEntry entry=entryCol.getFirstEntry();
while (entry!=null) {
    if (entry.isValid()) {
        Vector colValues=entry.getColumnValues();
        //
        //... code to dump the values to stdout
        //
    }
    ViewEntry nextEntry=entryCol.getNextEntry();
    entry.recycle();
    entry=nextEntry;
}
session.recycle();
So let's see how long our C approach takes... With the implementation from our first article, it takes 433337 ms, about 7 minutes, which is half the time. Not bad, but still enough time for coffee.
Reading more than one view entry at a time
We were really surprised when we saw that result. We had expected that IBM might have added some caching technologies to read more than one view entry at a time, to optimize throughput on the slow network connection. But our C code was still a lot faster than their approach, and it only read one entry, reported it to our Java code via JNI and then it read the next one.
But there was still some room for improvement in our solution.
The C API function NIFReadEntries that we are using to read the data has a parameter to tell the API how many rows should be returned, as long as the provided buffer size is sufficient (we let Notes choose the buffer size for now). Until now, we were using a fixed value of 1. In the next version, this parameter is configurable.
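To illustrate why this parameter matters, here is a minimal Java sketch of a batched read loop. It does not call the C API and does not mirror the real NIFReadEntries signature: `EntrySource`, `Batch`, `CountingSource` and `readAll` are hypothetical names made up for this example. The in-memory source simply counts how many requests are needed to read 15,000 entries for a given maximum number of entries per call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of batched reading; NOT the real NIFReadEntries API.
class BatchedReadSketch {

    /** One batch of entries plus a flag telling whether more entries remain. */
    static final class Batch {
        final List<String> entries;
        final boolean more;

        Batch(List<String> entries, boolean more) {
            this.entries = entries;
            this.more = more;
        }
    }

    /** Stand-in for the native reader; each call represents one request. */
    interface EntrySource {
        Batch read(int startIndex, int maxEntries);
    }

    /** In-memory source that counts how many requests were made. */
    static final class CountingSource implements EntrySource {
        private final int total;
        int calls = 0;

        CountingSource(int total) {
            this.total = total;
        }

        @Override
        public Batch read(int startIndex, int maxEntries) {
            calls++;
            int end = Math.min(total, startIndex + maxEntries);
            List<String> entries = new ArrayList<>();
            for (int i = startIndex; i < end; i++) {
                entries.add("entry " + i);
            }
            return new Batch(entries, end < total);
        }
    }

    /** Reads all entries in batches of at most maxEntries; returns the entry count. */
    static int readAll(EntrySource source, int maxEntries) {
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be >= 1");
        }
        int count = 0;
        boolean more = true;
        while (more) {
            Batch batch = source.read(count, maxEntries);
            count += batch.entries.size();
            more = batch.more;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int maxEntries : new int[] {1, 10, 100, 1000}) {
            CountingSource source = new CountingSource(15000);
            int read = readAll(source, maxEntries);
            System.out.println(maxEntries + " per call: " + read
                    + " entries in " + source.calls + " calls");
        }
    }
}
```

In this model, going from 1 to 1000 entries per call cuts the number of requests from 15000 to 15. If each request to a remote server costs roughly one network round trip (an assumption, not something we measured separately), the number of requests is what dominates the total time.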
No time for coffee anymore :-)
So here are the results, all for reading 15,000 view entries via the C API on a server database with different NIFReadEntries parameter values.
Again, this is no fake. The numbers are absolutely reproducible:
Maximum entries to read in NIFReadEntries | Duration |
1 | 433337 ms |
10 | 44801 ms |
100 | 5960 ms |
1000 | 2552 ms |
We are down from 7 minutes to 2.5 seconds just by changing one single parameter in the NIFReadEntries call!
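For readers who want to check the arithmetic, the following small Java sketch compares each measured C API duration from the table with the 856735 ms measured for Notes.jar on the same server. The class and method names (`SpeedupCheck`, `factor`, `percentSaved`) are only illustrative; the numbers are the measurements quoted in this article.

```java
import java.util.Locale;

// Illustrative helper; the durations are the measurements quoted in the article.
class SpeedupCheck {

    /** How many times faster the run taking fasterMs is than the baseline. */
    static double factor(double baselineMs, double fasterMs) {
        return baselineMs / fasterMs;
    }

    /** Percentage of the baseline duration that was saved. */
    static double percentSaved(double baselineMs, double fasterMs) {
        return (1.0 - fasterMs / baselineMs) * 100.0;
    }

    public static void main(String[] args) {
        double notesJarMs = 856735;                   // Notes.jar against the server
        int[] maxEntries = {1, 10, 100, 1000};        // NIFReadEntries parameter values
        double[] cApiMs = {433337, 44801, 5960, 2552}; // C API durations from the table

        for (int i = 0; i < maxEntries.length; i++) {
            System.out.printf(Locale.ROOT,
                    "%4d entries per call: %.1f times faster, %.1f%% less time%n",
                    maxEntries[i],
                    factor(notesJarMs, cApiMs[i]),
                    percentSaved(notesJarMs, cApiMs[i]));
        }
    }
}
```

For 1000 entries per call, this works out to a factor of about 335.7 and a saving of about 99.7%.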
Performance gain for local environment
Reading more than one line at a time also speeds up the already fast local access times:
Maximum entries to read in NIFReadEntries | Duration |
1 | 968 ms |
10 | 606 ms |
100 | 587 ms |
1000 | 573 ms |
Reducing CPU load
We ran some tests to search for memory leaks in our code, in which we read the view about 1000 times. We found out that the CPU load can be reduced from 50% to 5-8% just by calling
Thread.sleep(1)
every 200 rows in the Java code. This increased the execution time from 0.5 s to 1 s.
Conclusion
We could not believe our eyes when we saw those results. How on earth can reading a view be 99.7% (335 times) faster using C instead of Notes.jar? 14 minutes used for ECL checks and synchronized blocks? That's impossible.