StokeBloke.com

Archive for the ‘Programming’ Category

GNU java mail

Saturday, April 5th, 2008

Here's how I started using GNU JavaMail to read mbox files.

Download archives

See http://www.gnu.org/software/classpathx/javamail/javamail.html#download

Extract the archives.

Build activation

cd activation-1.1.1
./configure && make && make javadoc

Build inetlib

cd inetlib-1.1.1
./configure && make && make javadoc

Build mail

First I copied the two dependent jars into the mail directory. This makes it easy to know which versions were used for the build.

cp activation-1.1.1/activation.jar mail-1.1.2/
cp inetlib-1.1.1/inetlib.jar mail-1.1.2/
cd mail-1.1.2/
./configure --with-activation-jar=./ --with-inetlib-jar=./ && make && make javadoc

Using it

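import java.util.Properties;

import javax.mail.Folder;
import javax.mail.Provider;
import javax.mail.Session;
import javax.mail.Store;

import gnu.mail.providers.mbox.MboxStore;

// Directory that contains the mbox files; no separate inbox is configured.
final Properties properties = new Properties();
properties.put("mail.mbox.mailhome", "/home/nwightma/java/mboxparser/");
properties.put("mail.mbox.inbox", "");

final Session session = Session.getInstance(properties);
// Register the mbox provider by hand, equivalent to this javamail.providers entry:
// protocol=mbox;
// type=store;
// class=gnu.mail.providers.mbox.MboxStore;
// vendor=dog@gnu.org;
session.addProvider(new Provider(Provider.Type.STORE, "mbox",
    "gnu.mail.providers.mbox.MboxStore", "dog@gnu.org", "1"));

final Store store = session.getStore("mbox");
if (store instanceof MboxStore) {
  store.connect();

  final Folder folder = store.getFolder("test.mbox");
  folder.open(Folder.READ_ONLY);
  System.out.println(folder.getFullName());
  // JavaMail message numbers start at 1, not 0.
  //final int[] msgs = { 1 };
  //System.out.println(folder.getMessages(msgs));
  folder.close(false); // false = do not expunge deleted messages
  store.close();
}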

Opinion

Well, firstly, the GNU mail mbox provider does work. Just don’t attempt to use it on any large mbox files.

In my case, when I tried to open an mbox with 3597 emails, it took over 6 minutes just to return the message count. It appears that the code loads all the messages into memory, so even if you only want to process a couple of them you have to pay the cost of loading them all.

Finding the number of messages with grep takes…

time grep -c "^From " test.mbox
3597

real	0m0.542s
user	0m0.403s
sys	0m0.138s
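The same count can be done from Java without the mbox provider. This is only a minimal sketch: MboxCounter and countMessages are hypothetical names of my own, not part of GNU JavaMail, and it assumes a well-formed mbox in which body lines starting with "From " have been escaped (for example as ">From "). It needs Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of GNU JavaMail.
final class MboxCounter {

    // Counts the lines that start with "From ", like grep -c "^From ".
    static int countMessages(Path mbox) throws IOException {
        int count = 0;
        // ISO-8859-1 maps every byte to a char, so unusual encodings in
        // message bodies cannot make the decoder fail.
        try (BufferedReader reader =
                 Files.newBufferedReader(mbox, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("From ")) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Calling MboxCounter.countMessages(Paths.get("test.mbox")) reads the file one line at a time, so memory use stays flat however large the mbox is.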

It appears that folder.open(Folder.READ_ONLY) is the part which takes forever. To be exact, the open call takes 359.952 seconds, roughly 6 minutes. After that, though, the calls to get messages are very fast. I believe it's loading the complete mbox into memory when you open the folder.

I believe that the mbox provider from GNU mail is fine if you want to process every part of the mbox, but if you only want some of the messages, or some parts of them, the overhead is simply not worth it.
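If only one message is needed, one way to avoid that overhead is to stream the file and stop as soon as the wanted message has been read. The following is a rough sketch under the same assumptions as plain grep (a well-formed mbox with escaped "From " lines in bodies); MboxPeek and readMessage are hypothetical names, not GNU JavaMail API, and the result is the raw message text, not a parsed javax.mail.Message.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of GNU JavaMail.
final class MboxPeek {

    // Returns the raw headers and body of message number `wanted`
    // (1-based, like JavaMail message numbers), or null if the file
    // contains fewer messages than that.
    static String readMessage(Path mbox, int wanted) throws IOException {
        StringBuilder message = null;
        int current = 0;
        try (BufferedReader reader =
                 Files.newBufferedReader(mbox, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("From ")) {
                    if (current == wanted) {
                        break; // the next message starts here, so stop reading
                    }
                    current++;
                    if (current == wanted) {
                        message = new StringBuilder();
                    }
                    continue; // the "From " separator line is not kept
                }
                if (message != null) {
                    message.append(line).append('\n');
                }
            }
        }
        return message == null ? null : message.toString();
    }
}
```

The file is still scanned from the start up to the wanted message, but only that one message is ever held in memory.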

emacs css-mode indent-buffer fix

Friday, March 21st, 2008

I've been using emacs for some years now, but I always noticed that css-mode seemed to indent things a little strangely.

I found this article, which fixes all the issues in css-mode.

(setq cssm-indent-level 4)
(setq cssm-newline-before-closing-bracket t)
(setq cssm-indent-function #'cssm-c-style-indenter)
(setq cssm-mirror-mode nil)

Now when I auto indent the buffer it looks correct. I have no idea why this is not the default for css-mode.

FYI my auto indent key is F2.

(defun indent-buffer ()
    (interactive)
    (save-excursion (indent-region (point-min) (point-max) nil))
)
(global-set-key [f2] 'indent-buffer)

Migrated

Tuesday, March 18th, 2008

I finally got around to moving away from simplephpblog. It was so hacked and customised I could no longer upgrade.

I have moved to WordPress. It feels more professional, but it's going to take some time to get used to it.

I have moved most of the posts from 2008, but all the old ones have been deleted.

Java File deleteOnExit memory leak

Monday, December 31st, 2007

Well, I would never have expected to find such a strange memory leak in my program.

The program ran for a long time and called the following code

final File tempfile = File.createTempFile("tmp", ".dat");
tempfile.deleteOnExit();

In the normal case I then deleted the file using

tempfile.delete();

You would never think this would leak memory, but it does. I simply called deleteOnExit() to ensure anything I missed was cleaned up. The program didn't rely on it.

The reason is that information is kept in memory every time deleteOnExit() is called, so that the files can be deleted when the JVM exits. This bookkeeping used to be in the native code, but it's now in the Java code, where the absolute path is stored in a HashMap. When a file is deleted with delete(), the entry still remains in this HashMap, so the map grows with every temp file the program creates.

Simply put, don't use deleteOnExit() if your program is meant to run for a long time.

See http://java.sun.com/j2se/1.4.2/docs/api/java/io/File.html
See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4513817

It appears other people are seeing this issue too: http://www.jroller.com/javabean/entry/solving_an_outofmemoryerror_java_6
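One possible alternative for a long-running program is to keep your own list of live temp files and clean up whatever is left from a single shutdown hook. This is only a sketch of that idea, not what the JDK does: TempFiles, create, delete and tracked are hypothetical names of my own, and it assumes Java 8 or later. The difference from deleteOnExit() is that the entry is dropped as soon as the file is deleted, so the set only ever holds files that still exist.

```java
import java.io.File;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of the JDK.
final class TempFiles {

    // Only files that have not been deleted yet are kept here.
    private static final Set<File> LIVE = ConcurrentHashMap.newKeySet();

    static {
        // One hook for all files: best-effort removal of anything
        // still tracked when the JVM shuts down normally.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            for (File file : LIVE) {
                file.delete();
            }
        }));
    }

    // Creates a temp file and starts tracking it.
    static File create(String prefix, String suffix) throws IOException {
        File file = File.createTempFile(prefix, suffix);
        LIVE.add(file);
        return file;
    }

    // Stops tracking the file and deletes it; returns the result of File.delete().
    static boolean delete(File file) {
        LIVE.remove(file);
        return file.delete();
    }

    // Number of files currently tracked, mainly useful for checking for leaks.
    static int tracked() {
        return LIVE.size();
    }
}
```

With this, the code above would call TempFiles.create("tmp", ".dat") in place of createTempFile plus deleteOnExit(), and TempFiles.delete(tempfile) in the normal case.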